MS in Data Science @ UW–Madison
Experience: AI/ML Engineer (Research) | Data Analyst (Deloitte)
Based in Mountain View, CA
I’m passionate about designing scalable AI and data systems that turn research into real-world impact.
My work spans machine learning algorithms, deep neural networks, and statistical analysis, with a strong focus on end-to-end data pipelines and applied AI systems.
I specialize in efficient training of ML models using Active Learning, smart data selection, and robust evaluation to reduce compute and labeling costs while preserving accuracy.
I have hands-on experience with LLMs, RAG pipelines, vector databases, and big-data engineering, and I deploy solutions spanning ETL pipelines, analytics dashboards, and cloud integrations.
Alongside research, I actively contribute to open-source projects and collaborate on advancing efficient ML methods.
Languages: Python, SQL, R, C++, MATLAB, Java, JavaScript
Frameworks & Libraries: PyTorch, TensorFlow, Scikit-Learn, Pandas, NumPy, Matplotlib, NLTK
Big Data & Cloud: Spark, Kafka, Hive, BigQuery, GCP, Oracle ERP, Docker, CI/CD
Tools & Platforms: Tableau, Power BI, Git, Weights & Biases, LangChain, FastAPI, Flask
Specialties:
- Machine Learning Algorithms & Deep Neural Networks
- Active Learning & Efficient Fine-Tuning
- Statistical Modeling, Hypothesis Testing & A/B Testing
- LLMs, RAG Systems & Vector Databases
- Data Engineering, ETL Pipelines & Cloud Integration
- Big Data Systems, Streaming Pipelines & Real-Time Analytics
Projects:
- Agentic RAG for Radiology → Built with LangGraph + LLaMA, deployed into clinical workflows
- Efficient LLM Fine-Tuning → Novel data selection strategy; reduced training data by 67% while maintaining accuracy
- SQL Injection Detection → ML-based real-time query classification (published at IEEE CONECCT)
- Big Data Systems → Spark + Kafka + Hive pipeline for large-scale loan prediction and real-time analytics
- Hope Speech Detection → NLP pipeline for multilingual social media moderation (published as a book chapter)
- Image Classification → U-Net-based model for multicultural wedding classification (published in Expert Systems with Applications)
📄 Publications: EMNLP (under review), Expert Systems with Applications, IEEE CONECCT, CCSM Book Chapter
Current focus:
- Exploring hybrid RAG systems combining vector databases with knowledge graphs
- Advancing research in Active Learning, Deep Neural Networks, and Statistical Modeling
- Contributing to open-source AI/ML and data engineering frameworks
- Investigating deployment strategies for LLMs in real-world applications
- Building scalable pipelines that merge big data systems with applied machine learning
⭐️ Always open to collaboration on AI and ML projects!