Welcome to my GitHub portfolio! I am an experienced Data Scientist with a proven track record in designing, developing, and deploying AI-powered systems that create tangible business value. My core expertise spans Machine Learning, Deep Learning, Generative AI, and MLOps, with hands-on experience in building scalable systems using modern AI stacks and cloud services.
- Experienced in delivering end-to-end AI solutions, from data exploration to deployment.
- Specialized in LLM-based applications, RAG pipelines, and NLP-driven contract analysis.
- Adept at building systems that serve 100+ concurrent users, using microservice architectures and GPU-accelerated LLM deployments.
- Proven ability to collaborate with global teams and contribute to enterprise-grade AI projects.
- Retrieval-Augmented Generation (RAG), LangChain, Prompt Engineering
- Vector Databases, Transformers, LLMs (OpenAI, Llama 3, Gemma, Whisper)
- Scikit-learn, Pandas, NumPy, Regression, Classification, Ensemble Methods
- Boosting & Bagging (XGBoost, LightGBM, CatBoost), Unsupervised Learning
- PyTorch, LSTM, CNN, RNN, Seq2Seq, BERT-based models, Token Classification, NER
- ML Pipelines, DVC, MLflow, CI/CD, Docker, Model Deployment on GPU Servers
- Azure, AWS, Azure AI Services (OpenAI, AI Search, Form Recognizer)
- FastAPI, Flask, REST APIs, Microservice Architecture, GPU Containers
- Built an Azure-based RAG system integrating:
  - Azure Form Recognizer for document parsing
  - Azure OpenAI for embedding and summarization
  - Azure AI Search with hybrid semantic retrieval using custom vector fields
- Deployed as a highly concurrent production system (100+ users) via robust API endpoints.
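
To illustrate the hybrid retrieval idea above, here is a minimal sketch of how a keyword ranking and a vector ranking can be merged into one result list. It assumes reciprocal rank fusion as the merging rule; the function name, the constant `k=60`, and the document IDs are hypothetical placeholders for illustration, not the production code or the Azure SDK.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword query and a vector query.
keyword_hits = ["clause_7", "clause_2", "clause_9"]
vector_hits = ["clause_2", "clause_4", "clause_7"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists (`clause_2`, `clause_7`) outrank those found by only one retriever, which is the behavior hybrid search is meant to provide.
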
- Developed language detection and transcription using OpenAI Whisper.
- Migrated legacy systems from monolithic to microservices.
- Deployed GPU-enabled containers with offline LLM inference on client servers.
- Fine-tuned BERT for multilabel clause classification in legal contracts.
- Created a legal document summarizer using LLMs with performance evaluation.
- Trained custom NER models to extract legal entities like parties, addresses, and dates.
- Deployed models on GPU servers for high-throughput inference.
- Performed extensive data cleaning, feature engineering, and dimensionality reduction.
- Built machine learning pipelines to score loan propensity for customers.
- Created a REST API for serving ML predictions using scikit-learn and FastAPI.
- Post Graduate Program in Data Science & Analytics, Advanced ML Track
  Imarticus Learning Pvt. Ltd. (2022-2023) | Grade: A
- B.Tech in Engineering
  MIT Academy of Engineering, Pune (2016-2020) | CGPA: 7.41
- Certified Data Scientist & Analyst, NSDC & Skill India
- Received multiple performance awards at Shyena Tech Yarns and VCreatek Consultancy
- Regular contributor to innovative internal AI initiatives and PoC development
- LinkedIn
- GitHub
- avinashmyerolkar@gmail.com
Feel free to explore the repositories, clone projects, or connect with me for collaboration or feedback. Thanks for visiting!