Chicago, IL • 4+ Years Experience • End-to-End AI/ML Lifecycle Ownership
High-agency Machine Learning Engineer specializing in LLM Fine-Tuning, Distributed Training (FSDP, DeepSpeed), Inference Optimization, and building production-grade AI systems on AWS/Kubernetes.
Since much of my work is proprietary, here is an overview of the production systems I have architected and deployed in Healthcare and Enterprise domains:
- Fair GPU Compute Scheduler: Engineered workload orchestration on Amazon EKS with Apache YuniKorn, implementing gang scheduling and backfill algorithms to bin-pack heterogeneous multi-GPU jobs while enforcing starvation-proof priority queues.
- Hybrid Capacity Strategy: Designed a cost-optimized GPU strategy maximizing Reserved Instance utilization for steady workloads while using Karpenter to autoscale Spot capacity for bursts.
- Developer Tooling: Built an internal Python CLI and GitOps abstraction layer that replaced complex Kubernetes YAML for 10 research teams, standardizing distributed training and checkpointing workflows.
- 70B+ Model Fine-Tuning: Engineered a distributed training pipeline using PyTorch FSDP to fine-tune 70B+ parameter models (Llama-3, BioMistral, Med-42, Gemma-2) on multi-node GPU clusters, leveraging LoRA/QLoRA and Quantization-Aware Training (QAT).
- Custom FlashAttention-2 Adapters: Developed custom adapter classes to enable efficient fine-tuning of proprietary model architectures that lacked native support, increasing training throughput by 3x.
- SOAP Notes Generation: Benchmarked proprietary APIs (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro) against open-source LLMs for summarizing therapist-patient conversations into structured clinical documentation.
- Production ASR Pipeline: Fine-tuned multiple Whisper model variants on custom datasets, optimized the word error rate (WER) vs. latency tradeoff, and deployed them on Amazon SageMaker with auto-scaling and CloudWatch monitoring.
- FleetMind: Architected a production-ready agentic workflow using LangGraph with a Planner-Specialist-Critic pattern, enabling natural language queries over complex vehicle telemetry data.
- Knowledge Graph Integration: Engineered a Neo4j knowledge graph to enable multi-hop relational queries that flat data structures could not support.
- Self-Correcting Systems: Implemented custom CritiqueAgent loops that review and revise LLM-generated answers to improve their factual accuracy.
- Revenue Impact ML: Trained and deployed ML models using Apache Spark on Databricks, processing terabytes of clickstream data to identify highly engaged visitors, contributing to a 20% revenue increase within 12 months.
- MLOps Lifecycle: Implemented end-to-end MLOps using MLflow for experiment tracking and model registry, reducing deployment time from weeks to days while ensuring reproducibility.
- A/B Testing & Optimization: Designed and executed A/B tests with Adobe Target, boosting click-through rate by 10% and conversion rate by 5%.
- Master of Science in Computer Science | University of Illinois Chicago, IL
- AWS Databricks Platform Architect | View Credential
- AWS Cloud Practitioner | View Credential