SAMurai is a machine learning model that classifies government contracts from SAM.gov by how suitable they are for a specific company.
Users can list suitable government contracts in a text file, from which a dataset for the model is created automatically.
The model is a Naive Bayes classifier that combines the dataset's categorical features with its multinomial features. See docs/model.md for implementation details.
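To make the idea of combining categorical and multinomial features concrete, here is a minimal, standard-library-only sketch. It is not SAMurai's actual implementation (see docs/model.md for that). The class name `MixedNaiveBayes`, the feature choices (a NAICS code and a set-aside type as categorical fields, description words as multinomial counts), and the sample rows are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


class MixedNaiveBayes:
    """Toy Naive Bayes over categorical fields plus token counts (hypothetical)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, cat_rows, token_rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n = len(labels)
        self.log_prior = {c: math.log(self.class_counts[c] / n) for c in self.classes}

        # Categorical part: count each value per (class, feature index).
        self.cat_counts = defaultdict(Counter)
        self.cat_values = defaultdict(set)
        # Multinomial part: count tokens per class.
        self.tok_counts = defaultdict(Counter)
        self.vocab = set()

        for cats, tokens, y in zip(cat_rows, token_rows, labels):
            for i, value in enumerate(cats):
                self.cat_counts[(y, i)][value] += 1
                self.cat_values[i].add(value)
            self.tok_counts[y].update(tokens)
            self.vocab.update(tokens)
        return self

    def _log_joint(self, cats, tokens, c):
        # Start from the class prior, then add one log-likelihood term per feature.
        score = self.log_prior[c]
        for i, value in enumerate(cats):
            num = self.cat_counts[(c, i)][value] + self.alpha
            den = self.class_counts[c] + self.alpha * len(self.cat_values[i])
            score += math.log(num / den)
        total = sum(self.tok_counts[c].values())
        for tok in tokens:
            if tok not in self.vocab:
                continue  # ignore tokens never seen in training
            num = self.tok_counts[c][tok] + self.alpha
            den = total + self.alpha * len(self.vocab)
            score += math.log(num / den)
        return score

    def predict(self, cats, tokens):
        return max(self.classes, key=lambda c: self._log_joint(cats, tokens, c))


# Illustrative rows only: (NAICS code, set-aside type) plus description words.
cat_rows = [
    ("541511", "small_business"),
    ("541511", "none"),
    ("236220", "none"),
    ("236220", "small_business"),
]
token_rows = [
    ["software", "cloud", "support"],
    ["software", "development"],
    ["construction", "roofing"],
    ["paving", "construction"],
]
labels = ["suitable", "suitable", "unsuitable", "unsuitable"]

model = MixedNaiveBayes().fit(cat_rows, token_rows, labels)
print(model.predict(("541511", "none"), ["cloud", "software"]))
```

Because Naive Bayes treats features as conditionally independent given the class, the two feature types can share a single score: each contributes its own smoothed log-likelihood term on top of one class prior.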
SAMurai uses Poetry to manage dependencies. Install Poetry using the package manager of your choice, then run the following command inside the project repo:
poetry install
Follow the instructions in docs/dataset.md to create a dataset that can be used to train the model.
After creating the dataset, run the following command to train and test the model:
python3 sam_classifier/pipeline/main.py
Instructions for automating the pipeline that sends SAM.gov API results through the model can be found in docs/inference.md.
- Build a neural network for classification
- Build a CLI that automates dataset generation
