Ahmed Shmels ahmecse

👋 Hi, I'm Ahmed

🎓 M.Tech DS @ IIT Madras | B.Tech CS @ VIT | 🔬 AI Researcher

Building the Future with AI

About Me

I’m an AI researcher and data scientist with an M.Tech in Data Science from IIT Madras and a B.Tech in Computer Science from VIT. My work focuses on the intersection of Reinforcement Learning (RL), Large Language Models (LLMs), and Deep Learning.

Research Interests:

Reinforcement Learning for LLMs (RL4LLMs)
Multi-Agent Reinforcement Learning (MARL)
Multi-Agent Reasoning (MAR)
Communication in Multi-Agent Systems (MAS-Comm)
Causality in Reinforcement Learning (Causality & RL)
Representation Learning for Reinforcement Learning (RepL4RL)

I’m seeking a PhD or research position where I can drive innovative AI projects alongside a collaborative team.

🛠 Tech Stack

👨‍💻 Programming Languages

🧠 DL, ML & NLP

🌐 Web Dev

📊 Data Science & Visualization

🗃️ Databases & Big Data

🧪 Experiment Tracking & ML Tooling

☁️ Cloud, DevOps & Version Control

Featured Projects

Thesis

M.Tech Thesis: DNF-Net: A DL Approach for Advancing Breast Cancer Detection in Histopathology Images. (Poster / PPT)
- Built a magnification-invariant hybrid model that combines fuzzy logic, which explicitly handles diagnostic uncertainty, with deep-learning backbones (Xception, InceptionV3, DenseNet-169) for hierarchical feature extraction. The model achieved a 5% accuracy gain over SOTA on the BreakHis and BACH histopathology datasets, validated at 40×, 100×, 200×, and 400× magnifications and across 2-, 4-, and 8-class tasks.
- Keywords: deep-learning, fuzzy-logic, magnification-invariance, medical-image-analysis, histopathology, image-classification
B.Tech Thesis: CXRcovNet: COVID‑19 detection from CXR images using transfer learning approaches. (Repo / PPT)
- Applied transfer learning with pre-trained CNN models to detect COVID-19 in chest X-ray (CXR) images.
- Keywords: computer-vision, deep-learning, transfer-learning, covid-19, cxr, image-classification

RL (Reinforcement Learning)

Reinforcement Fine-Tuning LLMs with GRPO (Repo)
- Investigated Group Relative Policy Optimization (GRPO) for reinforcement fine-tuning (RFT) of LLMs, adapting models for complex reasoning and strategic tasks (demonstrated via a Wordle-style game with Qwen 2.5 7B).
- Tech Stack: Python, PyTorch, RL, LLMs, GRPO
- Keywords: rlft, grpo, llms, reinforcement-learning, fine-tuning, reward-functions, reward-hacking, grpo-loss
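For readers unfamiliar with GRPO, the core of its loss is a group-relative advantage: several completions are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. The sketch below illustrates only that step. The function name, the use of the population standard deviation, and the example rewards are assumptions for illustration, not code from the repo.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled completions of one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print(advantages)
```

Completions that score above the group mean receive positive advantages and are reinforced, so no separate value network is required.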
Hierarchical Reinforcement Learning (IITM CS6700 PA3) (Repo)
- Implemented and evaluated Hierarchical RL techniques (SMDP Q-Learning, Intra-Option Q-Learning) in the Taxi-v3 environment, analyzing the impact of option design on learning efficiency and policy structure.
- Tech Stack: Python, RL (Hierarchical RL, Q-Learning), OpenAI Gym
- Keywords: hierarchical-rl, smdp, intra-option-q-learning, reinforcement-learning, taxi-v3
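As a rough illustration of the SMDP Q-Learning update, the sketch below applies one update after an option has run for several primitive steps: the rewards collected during the option are discounted step by step, and the bootstrap term is discounted by the option's duration. The option names, state indices, and hyperparameters are hypothetical and not taken from the repo.

```python
from collections import defaultdict


def smdp_q_update(Q, state, option, rewards, next_state, options,
                  alpha=0.1, gamma=0.9):
    """Update Q[(state, option)] after the option ran for len(rewards) steps."""
    k = len(rewards)
    # Discounted return accumulated while the option was executing.
    discounted_return = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Bootstrap from the best option available in the state where it ended.
    best_next = max(Q[(next_state, o)] for o in options)
    target = discounted_return + (gamma ** k) * best_next
    Q[(state, option)] += alpha * (target - Q[(state, option)])
    return Q[(state, option)]


Q = defaultdict(float)
options = ["go_to_passenger", "go_to_destination"]  # hypothetical option set
value = smdp_q_update(Q, state=0, option="go_to_passenger",
                      rewards=[-1, -1, -1], next_state=5, options=options)
print(value)
```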
Dueling-DQN & Monte Carlo REINFORCE (IITM CS6700 PA2) (Repo)
- Implemented and compared Dueling-DQN (Type-1 vs Type-2) and Monte Carlo REINFORCE (with/without baseline) algorithms on Acrobot-v1 and CartPole-v1 environments.
- Tech Stack: Python, PyTorch, RL (DQN, Policy Gradient), OpenAI Gym
- Keywords: dueling-dqn, reinforce, baseline, deep-reinforcement-learning, acrobot-v1, cartpole-v1
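The two Dueling-DQN variants differ only in how the state value and the advantages are combined into Q-values: one subtracts the mean advantage, the other subtracts the maximum advantage. A minimal sketch of both aggregations follows. Which form the assignment labels Type-1 and which Type-2 is an assumption here (mean and max, respectively), and the function name is illustrative.

```python
def dueling_q_values(state_value, advantages, mode="mean"):
    """Combine V(s) and A(s, a) into Q(s, a) using mean or max aggregation."""
    if mode == "mean":
        reference = sum(advantages) / len(advantages)
    elif mode == "max":
        reference = max(advantages)
    else:
        raise ValueError("mode must be 'mean' or 'max'")
    return [state_value + a - reference for a in advantages]


q_mean = dueling_q_values(2.0, [1.0, 3.0], mode="mean")  # assumed Type-1
q_max = dueling_q_values(2.0, [1.0, 3.0], mode="max")    # assumed Type-2
print(q_mean, q_max)
```

With max aggregation the greedy action's Q-value equals V(s), while mean aggregation shifts all Q-values by a constant without changing their ranking.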
Temporal Difference Learning (SARSA & Q-Learning) (IITM CS6700 PA1) (Repo)
- Implemented and compared TD algorithms (SARSA and Q-Learning) in a custom 10x10 Grid World with stochastic transitions and wind effects, building a strong base in core RL concepts.
- Tech Stack: Python, RL (TD Learning, Q-Learning, SARSA), NumPy, Matplotlib
- Keywords: Temporal Difference, SARSA, Q-Learning, Gridworld, Reinforcement Learning, Stochastic Environments
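The difference between the two TD algorithms comes down to the bootstrap target: SARSA uses the value of the action actually taken next (on-policy), while Q-Learning uses the greedy maximum (off-policy). The sketch below shows just that contrast. The function names and numbers are illustrative, not taken from the assignment code.

```python
def sarsa_target(reward, next_q_values, next_action, gamma=0.9):
    """On-policy target: bootstrap from the action the agent will take next."""
    return reward + gamma * next_q_values[next_action]


def q_learning_target(reward, next_q_values, gamma=0.9):
    """Off-policy target: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(next_q_values)


def td_update(q_sa, target, alpha=0.1):
    """Move the current estimate a fraction alpha toward the target."""
    return q_sa + alpha * (target - q_sa)


next_q = [0.0, 1.0, 2.0]  # hypothetical Q-values in the next state
sarsa = sarsa_target(-1.0, next_q, next_action=1)
greedy = q_learning_target(-1.0, next_q)
print(sarsa, greedy, td_update(0.0, greedy))
```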

DL (Deep Learning)

Feedforward Neural Networks (FNN) from Scratch (IITM CS6910 PA1) (Repo / W&B Report)
- Built an end-to-end NumPy-only FNN for Fashion-MNIST classification, integrating six optimizers (SGD, Momentum, NAG, RMSProp, Adam, Nadam), four activations (sigmoid, tanh, ReLU, softmax), two losses (MSE, Cross-Entropy), weight initialization (Xavier, random), regularization (L1, L2), early stopping, and W&B-driven hyperparameter sweeps.
- Tech Stack: Python, NumPy, Matplotlib, Seaborn, Scikit-learn, Weights & Biases
- Keywords: feedforward-NN, backpropagation, optimizers, activation-functions, initialization, regularization, hyperparameter-tuning
Convolutional Neural Networks (CNN) (IITM CS6910 PA2) (Repo / W&B Report)
- Two-part project: (i) trained a CNN from scratch in PyTorch with Bayesian hyperparameter optimization via W&B sweeps (tuning filters, kernel sizes, batch norm, dropout, and augmentation), including filter visualization and guided backpropagation for interpretability; (ii) fine-tuned a pre-trained CNN model for performance benchmarking and comparison.
- Tech Stack: Python, PyTorch, OpenCV, Weights & Biases
- Keywords: CNN, Hyperparameter Optimization, Bayesian Optimization, Data Augmentation, Filter Visualization, Guided Backpropagation, Interpretability, W&B
Sequence-to-Sequence Learning (RNN) (IITM CS6910 PA3) (Repo / W&B Report)
- Developed and evaluated sequence-to-sequence models (vanilla RNN, LSTM, GRU) with and without attention mechanisms for English-to-Malayalam transliteration (Aksharantar Dataset), analyzing the impact of architectural choices and attention on transliteration quality.
- Tech Stack: Python, PyTorch, Weights & Biases
- Keywords: Seq2Seq, Attention Mechanisms, RNN, LSTM, GRU, Transliteration, Encoder-Decoder, Attention Heatmaps, NLP

NLP (Natural Language Processing)

Advanced Information Retrieval System (IITM CS6370) (Repo / Report)
- Built a hybrid search engine combining TF–IDF VSM, LSA, and a BERT-based reranker for top-k retrieval, with end-to-end evaluation (Precision@k, MAP, nDCG) on the Cranfield and Brown corpora.
- Tech Stack: Python, Scikit-learn, Gensim, PyTorch, Transformers
- Keywords: Information Retrieval, TF–IDF, LSA, ESA, Word2Vec, BERT Reranking, Evaluation Metrics, NLP, Semantic Search
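The evaluation metrics named above can be sketched in a few lines. This is a minimal, dependency-free illustration assuming a ranked list of document IDs per query; the document IDs and relevance grades are made up, and the project's actual evaluation code may differ. MAP is the mean of `average_precision` over all queries.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, gains, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    dcg = sum(gains.get(doc, 0) / math.log2(i + 1)
              for i, doc in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["d3", "d1", "d7", "d2"]  # hypothetical system output for one query
gains = {"d1": 2, "d2": 1}         # hypothetical graded relevance judgments
relevant = set(gains)
print(precision_at_k(ranked, relevant, 2),
      average_precision(ranked, relevant),
      ndcg_at_k(ranked, gains, 4))
```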

ML (Machine Learning Fundamentals/Theory)

Mathematical Essays on Core ML Algorithms
- Authored a series of mathematical essays (IEEE style, typeset in LaTeX) covering the theory, derivations, and applications of fundamental ML algorithms: linear regression, logistic regression, decision trees, random forests, Naive Bayes, and SVMs.
- Tech Stack: LaTeX, Python (for supporting visualizations/analysis)
- Keywords: ML Theory, Math Foundations, Linear Regression, Logistic Regression, Decision Trees, Random Forest, Naive Bayes, SVM, LaTeX

Publications

Beyond the Horizon: Exploring the Impact of AI on Early Cancer Detection & Diagnosis — A Comprehensive Review
- Journal: Computers in Biology and Medicine (Impact Factor: 7.7)
- Submission Date: January 2025
- Manuscript ID: CIBM-D-25-00543
- Status: Under Review

Certificates & Continuous Learning

Certificate/Specialization	Provider	Date Completed	Link ID
Advanced Large Language Model Agents	UC Berkeley	May 2025	Pending (expected May 31, 2025)
Linguistic Linked Data – Advanced Topics	German UDS Academy	May 2025	View Certificate
Linguistic Linked Data – Essentials	German UDS Academy	Apr 2025	View Certificate
Natural Language Processing	Udemy, Inc.	Aug 2023	View Certificate
The Complete Python Bootcamp	Udemy, Inc.	Aug 2023	View Certificate
Mathematics for ML & DS Specialization	DeepLearning.AI	Jun 2023	View Certificate
Machine Learning Specialization	DeepLearning.AI	Jan 2023	View Certificate
Google Data Analytics Specialization	Google	Apr 2022	View Certificate
