Inference

Model inference is the process of using a trained machine learning model to make predictions or decisions on new, unseen data.

Inference is also commonly called prediction. During inference, the model applies the parameters it learned during training to input data it has not seen before. The weights stay fixed: the model only computes outputs and does not learn from these inputs.

Here's an overview of the model inference process:

Input Data: New, unseen data is provided to the trained model for prediction. This data can take various forms depending on the nature of the machine learning task, such as images, text, numerical features, or time-series data.
Prediction: The model processes the input data through its learned parameters and architecture to generate predictions or classifications. For example, in a classification task, the model assigns a class label or probability distribution to each input example based on its features.
Output: The model produces output predictions, which can take different forms depending on the specific task:
- Classification: For classification tasks, the model outputs class labels or probability distributions over multiple classes, indicating the likelihood of each class.
- Regression: For regression tasks, the model outputs continuous numerical values, representing predictions of a target variable.
Decision Making: In some cases, the model's predictions may be used directly to make decisions or take actions. For example, in a medical diagnosis system, the model's predictions may guide treatment decisions based on the likelihood of different diseases.
Feedback Loop: In real-world applications, predictions and their eventual outcomes are often recorded. Feedback from users or downstream systems can then be used to retrain or fine-tune the model and improve its performance over time.
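
The prediction, output, and decision-making steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the model used later on this page: the raw scores, the class names, and the `threshold` value are all made up, and `decide` is a hypothetical helper that declines to answer when the top probability is too low.

```python
import math

def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decide(scores, labels, threshold=0.6):
    """Return (label, probability), or (None, probability) if the model is unsure."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
    if probs[best] < threshold:
        return None, probs[best]  # not confident enough to act on
    return labels[best], probs[best]

labels = ["sports", "business", "health"]  # hypothetical classes
label, prob = decide([2.0, 0.5, 0.1], labels)                 # one clear winner
unsure_label, unsure_prob = decide([0.3, 0.2, 0.1], labels)   # scores too close
```

In the first call one score clearly dominates, so a label is returned. In the second call the scores are close together, the top probability falls below the threshold, and the sketch returns `None` so that a downstream system can defer the decision.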

Example from our saved model

import tensorflow as tf
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences


import re
import joblib
import json

# load encoder
encoder = joblib.load('encoder/encoder.joblib')

# load tokenizer
with open('tokenizer/tf_tokenizer.json', 'r') as f:
    tokenizer_json = json.load(f)

tokenizer = tokenizer_from_json(tokenizer_json)


# load model
model = tf.keras.models.load_model('models/starter_swahili_news_classification_model.h5')

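# preprocessing settings; these must match the values used during training
max_words = 1000  # vocabulary size, fetched from tokenizer (not used below)
max_len = 200     # length every input sequence is padded or truncated to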

def normalize_text(text):
    # Remove punctuation, numbers, and special characters
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    # Convert to lowercase
    text = text.lower()
    return text

def pre_process(tokenizer, max_len, input_text):
    input_text = normalize_text(input_text)
    input_sequence = tokenizer.texts_to_sequences([input_text])
    input_data = pad_sequences(input_sequence, maxlen=max_len)
    return input_data


def classify_news(input_text):
    input_data = pre_process(tokenizer, max_len, input_text)
    # for each input sample, the model returns a vector of probabilities;
    # we pass a single sample, so take the first (and only) row
    probs = model.predict(input_data)[0]
    categories = encoder.categories_[0]

    # return all classes with their corresponding probabilities
    result_dict = {}
    for i, category in enumerate(categories):
        result_dict[category] = str(round(float(probs[i]) * 100, 2)) + '%'

    # pick the top class from the numeric probabilities; comparing the
    # formatted strings would sort them as text ('9.5%' > '85.2%')
    highest_prob = categories[int(probs.argmax())]

    return (result_dict, highest_prob)

This Python code snippet sets up a function named classify_news for performing inference using a pre-trained text classification model. Let's break down the key components and their functionalities:

Importing Libraries:
- The code imports TensorFlow/Keras for loading the model, restoring the tokenizer, and padding sequences; re for text cleaning; and joblib and json for loading the saved preprocessing artifacts.
Loading Preprocessing Artifacts:
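- The encoder object is loaded using joblib.load() from the file path 'encoder/encoder.joblib'. It was used to encode the categorical labels during training; at inference time its categories_ attribute maps each position in the model's output back to a class name.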
- The tokenizer object is loaded from a JSON file using tokenizer_from_json(). The JSON file contains tokenizer configuration saved during training.
Loading the Trained Model:
- The pre-trained model is loaded using tf.keras.models.load_model() from the file path 'models/starter_swahili_news_classification_model.h5'. This model has been previously trained for text classification.
Text Normalization:
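- The normalize_text() function keeps only ASCII letters and whitespace, which removes punctuation, numbers, and special characters, and then converts the text to lowercase. Applying the same normalization at inference as during training keeps the inputs consistent with what the model learned from.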
Preprocessing Input Text:
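- The pre_process() function prepares the input text before it is fed to the model. It normalizes the text using normalize_text(), then uses the loaded tokenizer's texts_to_sequences() to split the text into words and map them to a sequence of integers. Finally, pad_sequences() brings the sequence to a fixed length of max_len; with its default settings, shorter sequences are padded with zeros at the front and longer ones are truncated from the front.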
Model Inference:
- The classify_news() function takes an input text and preprocesses it using pre_process(). It then uses the pre-trained model to predict the probabilities of each class label for the input text.
- The probabilities are converted into a dictionary (result_dict) where each key is a class label and the corresponding value is the predicted probability, formatted as a percentage string such as '85.2%'.
- The class label with the highest probability (highest_prob) is then selected.
Output:
- The function returns a tuple containing result_dict (probabilities for each class) and highest_prob (the predicted class label with the highest probability).
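
The preprocessing and output steps described above can be tried without TensorFlow using small stand-ins. This is a sketch under stated assumptions: `texts_to_ids` and `pad_pre` are hypothetical helpers that imitate the tokenizer (assuming no out-of-vocabulary token is configured) and the default behavior of pad_sequences, and the vocabulary, class names, and probabilities are made up.

```python
import re

def normalize_text(text):
    # same rule as in the walkthrough: keep ASCII letters and whitespace, then lowercase
    return re.sub(r'[^a-zA-Z\s]', '', text).lower()

def texts_to_ids(text, word_index):
    # stand-in for tokenizer.texts_to_sequences: words not in the vocabulary are dropped
    return [word_index[w] for w in text.split() if w in word_index]

def pad_pre(ids, max_len):
    # stand-in for pad_sequences defaults: keep the last max_len ids, left-pad with zeros
    ids = ids[-max_len:]
    return [0] * (max_len - len(ids)) + ids

word_index = {"timu": 1, "yashinda": 2, "mechi": 3}  # hypothetical toy vocabulary
clean = normalize_text("Timu yashinda mechi 3-0!")
ids = texts_to_ids(clean, word_index)
padded = pad_pre(ids, max_len=6)

# made-up class probabilities standing in for one row of model.predict output
probs = {"michezo": 0.852, "uchumi": 0.095, "afya": 0.053}
top = max(probs, key=probs.get)  # compare the numbers themselves
report = {name: f"{p * 100:.2f}%" for name, p in probs.items()}
```

Note that the top class is chosen from the numeric probabilities. Choosing it from the formatted strings in `report` would compare them as text, so '9.50%' would rank above '85.20%'.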

Live App at: https://swahili-headline.streamlit.app/
