First, install the requirements in the environment, then navigate to the directory containing src/data/make_dataset.py and run:
python3 make_dataset.py
The interim dataset is stored in the data/interim directory in the root folder and is used for training the model. The test set is split from this same dataset with a seed of 42.
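As an illustration of why the fixed seed matters, here is a minimal sketch of a reproducible split. It assumes the dataset is a plain list of rows; the helper name split_dataset and the 10% test fraction are hypothetical, and the repository's own scripts may use a different library or ratio.

```python
import random

def split_dataset(rows, test_fraction=0.1, seed=42):
    """Shuffle a copy of rows with a fixed seed and split off a test set.

    test_fraction is an assumed value for illustration only.
    """
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy data standing in for the interim dataset.
rows = [f"sentence {i}" for i in range(100)]
train_a, test_a = split_dataset(rows)
train_b, test_b = split_dataset(rows)
```

Because the seed is fixed, both calls produce the same train and test sets, so results stay comparable between runs.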
First, install the requirements in the environment, then navigate to the directory containing src/models/train_model.py and run:
python3 train_model.py
The trained checkpoints are stored in models/bloom-detoxification/checkpoint-800/ under the root folder. The repository already includes trained checkpoints; if the model is trained again, the checkpoints will be updated.
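Since retraining updates the existing checkpoints, it can help to check whether they are already present first. The following is a small sketch, assuming it is given the repository root; the helper name find_checkpoint is hypothetical and not part of the repository.

```python
from pathlib import Path

# Checkpoint location as described above, relative to the repository root.
CHECKPOINT_SUBDIR = Path("models") / "bloom-detoxification" / "checkpoint-800"

def find_checkpoint(repo_root):
    """Return the checkpoint directory if it exists and is non-empty, else None."""
    checkpoint_dir = Path(repo_root) / CHECKPOINT_SUBDIR
    if checkpoint_dir.is_dir() and any(checkpoint_dir.iterdir()):
        return checkpoint_dir
    return None
```

For example, find_checkpoint(".") run from the root folder returns the checkpoint path when trained weights are available, in which case training again is optional.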
Navigate to the directory containing src/models/predict_model.py and run:
python3 predict_model.py
You'll be prompted to enter the toxic text, and the output will show both the toxic text and the corresponding result.
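The three steps above can also be sketched as one helper that runs each script from its own directory. This is an illustrative sketch only: build_steps and run_steps are hypothetical names, and it assumes the current Python interpreter stands in for python3 and that the requirements are already installed.

```python
import subprocess
import sys
from pathlib import Path

# Script locations taken from the steps above, relative to the repository root.
SCRIPTS = [
    Path("src/data/make_dataset.py"),
    Path("src/models/train_model.py"),
    Path("src/models/predict_model.py"),
]

def build_steps(repo_root):
    """Return (working_directory, command) pairs, one per script."""
    root = Path(repo_root)
    return [
        (root / script.parent, [sys.executable, script.name])
        for script in SCRIPTS
    ]

def run_steps(repo_root):
    """Run each script from its own directory, stopping at the first failure."""
    for cwd, command in build_steps(repo_root):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling run_steps(".") from the root folder runs dataset creation, training, and prediction in order. The last step still waits for the toxic text on standard input.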