Portfolio note: this repository is a portfolio recreation of an NLP pipeline built at Omfys Technologies.
A BERT model fine-tuned with PyTorch on 100K+ customer reviews reaches a 93% F1-score. The pipeline adds ONNX optimization and a Streamlit dashboard, with a reported 87% satisfaction improvement and 22% reduction in negative reviews.
- Dataset: 100K+ customer reviews
- F1-Score: 93%
- Classes: 5 (Very Negative to Very Positive)
- Inference: <100ms latency
- Business Impact: 87% satisfaction improvement
- DL Framework: PyTorch, Transformers (Hugging Face)
- Model: BERT-base-uncased
- Optimization: ONNX Runtime (3x speedup)
- Dashboard: Streamlit
- API: FastAPI
- Topic Modeling: Gensim (LDA)
- Cloud: OCI Data Science (A10 GPUs)
- Transfer learning from BERT-base-uncased
- Mixed-precision training (FP16) on A10 GPUs
- Gradient accumulation
- Weighted loss for class imbalance
- Training time reduced by 50%
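
The weighted loss listed above needs one weight per sentiment class. The following is a minimal sketch of one common scheme, inverse-frequency weighting. The label counts and the function name `inverse_frequency_weights` are illustrative assumptions, not figures or code from this project. In PyTorch, weights like these would typically be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`.

```python
from collections import Counter

def inverse_frequency_weights(labels, num_classes):
    """Return one weight per class: total / (num_classes * class_count).

    Rare classes get weights above 1, common classes below 1.
    """
    counts = Counter(labels)
    total = len(labels)
    weights = []
    for cls in range(num_classes):
        count = counts.get(cls, 0)
        if count == 0:
            raise ValueError(f"class {cls} has no examples")
        weights.append(total / (num_classes * count))
    return weights

# Hypothetical label distribution over the 5 sentiment classes (0..4).
example_labels = [0] * 5 + [1] * 10 + [2] * 50 + [3] * 25 + [4] * 10
class_weights = inverse_frequency_weights(example_labels, num_classes=5)
```

With this scheme a perfectly balanced dataset yields a weight of 1.0 for every class, so the loss reduces to the unweighted case.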
- Back-translation
- Synonym replacement
- Random insertion/deletion
- 40% increase in training data
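
The token-level augmentations listed above (synonym replacement, random insertion, random deletion) can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: the `SYNONYMS` table, the function names, and the sample review are all hypothetical. Back-translation is left out because it needs a translation model.

```python
import random

# Hypothetical synonym table; a real pipeline might draw on a thesaurus instead.
SYNONYMS = {
    "good": ["great", "fine"],
    "bad": ["poor", "awful"],
    "fast": ["quick", "speedy"],
}

def synonym_replacement(tokens, rng, n=1):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if tok in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def random_deletion(tokens, rng, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [tok for tok in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def random_insertion(tokens, rng, n=1):
    """Insert n synonyms of randomly chosen known tokens at random positions."""
    out = list(tokens)
    known = [tok for tok in tokens if tok in SYNONYMS]
    for _ in range(n if known else 0):
        new_word = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out

rng = random.Random(0)  # fixed seed so runs are reproducible
review = "the delivery was fast but the packaging was bad".split()
augmented = [
    synonym_replacement(review, rng, n=2),
    random_deletion(review, rng, p=0.2),
    random_insertion(review, rng, n=1),
]
```

Each augmented copy keeps the original label, which is why aggressive deletion rates are risky for sentiment data: dropping a word such as "not" can flip the meaning.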
- Model quantization
- 3x inference speedup
- Reduced model size by 4x
- Deployment-ready format
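
A 4x size reduction is what one would expect from storing 32-bit float weights as 8-bit integers. Assuming that is the kind of quantization meant above, the arithmetic can be sketched in plain Python. This is not ONNX Runtime's API and not the project's code; the weight values and the helper names `quantize_int8` and `dequantize` are hypothetical.

```python
from array import array

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 with one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = (max(-127, min(127, round(w / scale))) for w in weights)
    return array("b", codes), scale

def dequantize(quantized, scale):
    """Recover approximate float values from int8 codes."""
    return [q * scale for q in quantized]

# Hypothetical weight values for illustration.
fp32_weights = array("f", [0.02, -0.51, 0.33, 1.27, -0.89, 0.0, 0.75, -1.27])
int8_weights, scale = quantize_int8(fp32_weights)
restored = dequantize(int8_weights, scale)

fp32_bytes = fp32_weights.itemsize * len(fp32_weights)  # 4 bytes per weight
int8_bytes = int8_weights.itemsize * len(int8_weights)  # 1 byte per weight
size_ratio = fp32_bytes / int8_bytes
max_error = max(abs(a - b) for a, b in zip(fp32_weights, restored))
```

The rounding error per weight is bounded by half the scale, which is the accuracy cost traded for the smaller model.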
- Real-time prediction (<100ms)
- Confidence scores
- Attention visualization with BertViz
- Batch processing (10K reviews/hour)
- LDA topic modeling
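
The confidence scores mentioned above are commonly obtained by applying a softmax to the classifier's five output logits and reporting the top probability. The following is a minimal sketch of that step, assuming the model emits one logit per class; the logit values and the function name `predict_with_confidence` are hypothetical.

```python
import math

LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(logits):
    """Return (label, confidence) for one review's 5 class logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits, as a 5-class classification head might emit.
label, confidence = predict_with_confidence([-1.2, -0.3, 0.4, 2.9, 1.1])
```

A dashboard could use the confidence value to flag low-certainty predictions for manual review.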
sentiment-analysis-bert/
├── src/
│ ├── training/ # Fine-tuning scripts
│ ├── inference/ # ONNX inference
│ ├── dashboard/ # Streamlit app
│ └── preprocessing/ # Text processing
├── notebooks/ # Jupyter notebooks
├── Dockerfile
└── README.md
git clone https://github.com/Amanroy666/sentiment-analysis-bert.git
cd sentiment-analysis-bert
pip install -r requirements.txt
# Run training
python src/training/train_bert.py
# Start dashboard
streamlit run src/dashboard/streamlit_app.py

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Very Negative | 91% | 89% | 90% |
| Negative | 92% | 91% | 91.5% |
| Neutral | 94% | 95% | 94.5% |
| Positive | 93% | 94% | 93.5% |
| Very Positive | 95% | 96% | 95.5% |
| Weighted Avg | 93% | 93% | 93% |
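
The per-class F1 values in the table above follow from the precision and recall columns, since F1 is their harmonic mean. A short check, with the numbers copied from the table (the weighted average is not recomputed because per-class support counts are not given):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Per-class (precision, recall) in percent, copied from the table above.
table = {
    "Very Negative": (91, 89),
    "Negative": (92, 91),
    "Neutral": (94, 95),
    "Positive": (93, 94),
    "Very Positive": (95, 96),
}
f1_by_class = {name: round(f1_score(p, r), 1) for name, (p, r) in table.items()}
```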
Aman Roy - Data Engineer at Omfys Technologies
📧 contactaman000@gmail.com | 💼 LinkedIn