I’m a Software Engineer at PayPal, crafting intelligent, scalable, and privacy-first AI systems — from idea to production.
I engineer GenAI systems that understand your data, automate decision-making, and stay compliant — across development, deployment, and optimization.
- 🧠 GenAI & LLM Systems: Fine-tune and deploy custom pipelines with LLaMA, GPT-4o, and Mistral, using feedback loops, LoRA, quantization, and prompt engineering.
- ⚙️ AI Automation & Bots: Automate real-time decision systems such as database Q&A bots, regulatory insight engines, and context-aware assistants.
- 🔎 RAG (Retrieval-Augmented Generation): Vector-database retrieval with FAISS/Pinecone, combined with prompt decomposition, caching, and ranking strategies.
- 🛠️ Full-Stack AI Engineering: Build and serve ML APIs, package them in Docker containers, deploy to the cloud, and own MLOps pipelines end to end.
Languages:
Python | TypeScript | SQL | Bash
Backend & APIs:
FastAPI | Flask | gRPC | Gunicorn | Nginx
LLMs & GenAI:
LLaMA | Mistral | GPT-4o | LoRA | Transformers | Prompt Engineering | LangChain | Custom Fine-tuning
Model Optimization:
Quantization | Memory Offloading | LoRA | TorchServe | ONNX
Retrieval & DBs:
FAISS | Pinecone | PostgreSQL | Dynamic SQL | Chroma | VectorDBs
DevOps / MLOps:
Docker | Kubernetes | AWS | Lambda | S3 | EC2 | SageMaker | CloudWatch
MLflow | Weights & Biases | Airflow | GitHub Actions | Shell Scripting
Infra Tools:
Redis | Celery | Kafka | Elasticsearch
- 🧠 Hallucination detection + explainability in RAG/LLMs
- 🧪 Real-time data agents + multi-hop question answering
- ⚡ Scaling inference on single-GPU + multi-tenant workloads
- 🔍 Evaluation frameworks like RAGAS, TruLens, Promptfoo
"I don’t just deploy models — I deploy intelligence."
— vhx


