Text Representation
Representing Classes / Labels in Numerical Format
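import os

import joblib
from sklearn.preprocessing import OneHotEncoder

# pull the raw texts and their category labels out of the DataFrame
texts = df['Text'].values
labels = df['category'].values

# one hot encode the labels
# (scikit-learn 1.2+ names this argument sparse_output; older releases use sparse=False)
encoder = OneHotEncoder(sparse_output=False)
labels = encoder.fit_transform(labels.reshape(-1, 1))

# save the encoder, creating the target directory first if it is missing
os.makedirs('encoder', exist_ok=True)
joblib.dump(encoder, 'encoder/encoder.joblib')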
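The encoder is saved so that one-hot rows can be mapped back to category names later, for example when decoding model predictions. The following is a minimal sketch of that round trip, not the author's own code. The label values, the temporary directory, and the names `sample_labels`, `restored`, and `decoded` are assumptions made for illustration. It uses the encoder's default sparse output plus `.toarray()` so that it does not depend on the `sparse` / `sparse_output` argument name.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical labels standing in for df['category'].values
sample_labels = np.array(["sport", "tech", "sport", "politics"])

# The default encoder returns a sparse matrix; .toarray() makes it dense
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(sample_labels.reshape(-1, 1)).toarray()

# Save the fitted encoder to a temporary location and load it back
save_dir = tempfile.mkdtemp()
encoder_path = os.path.join(save_dir, "encoder.joblib")
joblib.dump(encoder, encoder_path)
restored = joblib.load(encoder_path)

# Column order follows the sorted categories learned during fit
class_names = restored.categories_[0]

# Map one-hot rows back to the original label strings
decoded = restored.inverse_transform(one_hot).ravel()

print(class_names)
print(decoded)
```

Each row of `one_hot` contains a single 1, and its column position corresponds to an entry in `class_names`. Reloading the same fitted encoder at inference time keeps that column order consistent with training.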