AI Engineer | Finance & Investment Enthusiast | NLP, Machine Learning, Generative AI
I'm an AI Engineer specializing in Machine Learning, NLP, Computer Vision, and Generative AI. I currently work as an AI Intern at Tata Elxsi, where I focus on AI-driven video and audio analysis, automated expressive dubbing, and AI research.
- Programming Languages: Python, MATLAB
- Databases: MySQL
- Data Analysis & Visualization: Pandas, Excel, Tableau
- Machine Learning & AI: Scikit-learn, TensorFlow, PyTorch
- Generative AI: LangChain, Prompt Engineering, Model Fine-Tuning, Hugging Face Transformers, RAG, LLM API Integration
- Computer Vision & Speech Processing: YOLO, Vision Transformers, Librosa, Advanced Lip-Sync Algorithms
- Big Data & Finance AI: Apache Spark, Market Analysis, Algorithmic Trading
- Agentic AI & Model Deployment: Continuous experimentation with fine-tuning, RAG, autonomous AI systems, and multi-modal AI applications.
- AI for Video & Audio Processing: Working on automated expressive dubbing, integrating deep learning-based lip-syncing and voice cloning to enhance dubbing realism. Focused on multi-lingual expressive dubbing using generative models, reducing manual effort while improving accuracy.
- Healthcare AI: Developed deep learning models for medical imaging & disease detection. Created dental disease detection systems using YOLOv11 & Faster R-CNN, achieving high accuracy in identifying oral health issues. Worked on gingivitis detection & image captioning, combining Vision Transformer (ViT) and GPT models for automated medical reports.
- Finance & Investment AI: Regularly analyzing financial markets and leveraging AI for stock trend analysis, FMCG market research, and algorithmic trading. Worked on Nifty FMCG index analysis, optimizing data pipelines for real-time insights. Experimenting with LLMs for financial data interpretation and predictive analytics.
- Computer Vision & Multi-Modal AI: Implementing vision-based models for real-world applications. Worked on Nepali Sign Language recognition using YOLOv8, achieving state-of-the-art accuracy for accessibility solutions. Exploring multi-modal AI that integrates text, image, and video understanding.
- Generative AI & Autonomous Systems: Actively testing & deploying fine-tuned LLMs, RAG-based retrieval, and AI agents for various applications. Working on Agentic AI systems to automate workflow decision-making using multi-modal inputs and adaptive LLM reasoning.
- Offensive Text Detection: Exploring Traditional, Ensemble Models, and KAN in Tamil-English Text - Published in Proceedings of the 9th International Conference on Data Management, Analytics & Innovation (ICDMAI 2025).
- Deep Learning Approach for Nepal Sign Detection - Published in Proceedings of the 13th International Conference on Computational Intelligence and Communication Systems (ICCIS 2024).
- Published AI research in computer vision, NLP, and multi-modal AI.
- Contributed to open-source AI projects and explored advancements in agentic AI & fine-tuning methodologies.
- Email: jaidev4103@gmail.com
- LinkedIn: linkedin.com/in/jaidevkumar
- GitHub: github.com/jaidevk04
I'm open to collaborations, research opportunities, and open-source contributions. Let's build something amazing!