Rapid Evaluation Framework for CMIP simulations
Open source code for AIOpsServing
Machine Learning Model using Decision Trees on US Voting Dataset
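A rough, hypothetical sketch of a decision-tree setup like this one; the file name, feature columns, and target below are placeholders, not the repository's actual data:

```python
# Illustrative sketch only: a scikit-learn decision tree on a tabular voting
# dataset. The CSV path and column names are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("us_voting.csv")            # hypothetical data file
X = df.drop(columns=["voted"])               # hypothetical feature columns
y = df["voted"]                              # hypothetical binary target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = DecisionTreeClassifier(max_depth=5, random_state=42)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```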
Predict stock prices using Linear Regression and LSTM models. Includes data preprocessing, visualization, and benchmarking tools for analyzing historical stock data.
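A minimal sketch of the linear-regression half of such a pipeline, assuming a CSV of historical closing prices; the file name and lag window are placeholders rather than the repository's actual code:

```python
# Illustrative sketch: predict the next closing price from lagged prices with
# scikit-learn LinearRegression. File name and lag count are placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

prices = pd.read_csv("stock_history.csv")["Close"]    # hypothetical data file
lags = 5
X = pd.DataFrame({f"lag_{i}": prices.shift(i) for i in range(1, lags + 1)}).dropna()
y = prices.loc[X.index]

split = int(len(X) * 0.8)                              # simple time-based split
model = LinearRegression().fit(X.iloc[:split], y.iloc[:split])
preds = model.predict(X.iloc[split:])
print("MAE:", mean_absolute_error(y.iloc[split:], preds))
```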
A modular deep learning evaluation framework for benchmarking multiple CNN architectures across varied optimization strategies and training configurations. Built for scalable experimentation and transferability to real-world image classification tasks.
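One way such a benchmark loop might look in PyTorch, sweeping architectures against optimizers; the specific models, dataset, and hyperparameters here are illustrative assumptions, not the framework's actual configuration:

```python
# Illustrative sketch: benchmark several CNN backbones under different
# optimizers on one dataset. Model and optimizer choices are examples only.
import torch
import torchvision
from torch import nn, optim
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
tfm = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])
train = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
loader = torch.utils.data.DataLoader(train, batch_size=64, shuffle=True)

architectures = {"resnet18": torchvision.models.resnet18,
                 "mobilenet_v3_small": torchvision.models.mobilenet_v3_small}
optimizers = {"adam": lambda p: optim.Adam(p, lr=1e-3),
              "sgd": lambda p: optim.SGD(p, lr=1e-2, momentum=0.9)}

for arch_name, ctor in architectures.items():
    for opt_name, make_opt in optimizers.items():
        model = ctor(num_classes=10).to(device)
        opt = make_opt(model.parameters())
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for step, (x, y) in enumerate(loader):
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            if step == 50:              # keep the sketch short; train longer in practice
                break
        print(arch_name, opt_name, "last batch loss:", loss.item())
```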
This repo contains a study of LLM performance on STS (Semantic Textual Similarity) data.
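A typical STS evaluation correlates model similarity scores with gold ratings. The sketch below uses sentence embeddings and Spearman correlation as a stand-in; the model name and the tiny in-line pairs are assumptions, not the study's data:

```python
# Illustrative sketch: score sentence pairs with an embedding model and compare
# against gold STS ratings via Spearman correlation. Model name and data are
# placeholders for the study's actual setup.
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

pairs = [("A man is playing guitar.", "A person plays a guitar."),
         ("Children are playing outside.", "Kids play in the yard."),
         ("A dog runs in the park.", "The stock market fell today.")]
gold = [4.8, 4.2, 0.2]                              # hypothetical 0-5 ratings

model = SentenceTransformer("all-MiniLM-L6-v2")
scores = [util.cos_sim(model.encode(a), model.encode(b)).item() for a, b in pairs]
print("Spearman:", spearmanr(scores, gold).correlation)
```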
Interactive Python toolkit for exploring, testing, and benchmarking LLM tokenization, prompt behaviors, and sequence efficiency in a safe, modular sandbox environment.
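A small sketch of the kind of tokenizer comparison such a toolkit enables, using Hugging Face tokenizers as an assumed backend; the model names and prompt are examples only:

```python
# Illustrative sketch: compare token counts for the same prompt across
# tokenizers using Hugging Face transformers. Model names are examples only.
from transformers import AutoTokenizer

prompt = "Benchmarking how different tokenizers split the same prompt."
for name in ["gpt2", "bert-base-uncased"]:
    tok = AutoTokenizer.from_pretrained(name)
    ids = tok.encode(prompt)
    print(f"{name}: {len(ids)} tokens -> {tok.convert_ids_to_tokens(ids)[:8]}...")
```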
An in-depth comparison and build walkthrough of two sentiment classification models, classical ML vs. deep learning (a Logistic Regression and an LSTM), both trained on the Sentiment140 dataset. The Jupyter notebook covers every step: data preprocessing, model construction, training, evaluation, and a comparison of the strengths and shortcomings of each approach.
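For orientation, a condensed sketch of the classical-ML side of that comparison, with a tiny in-line corpus standing in for Sentiment140 (the notebook's actual preprocessing and features will differ):

```python
# Illustrative sketch: TF-IDF features feeding a logistic regression sentiment
# classifier. The in-line corpus is a placeholder for Sentiment140.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["I love this phone", "worst purchase ever",
         "pretty decent overall", "absolutely terrible"]
labels = [1, 0, 1, 0]                               # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["I really enjoyed it", "this is awful"]))
```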
Deep learning computer vision workflow covering EDA, model benchmarking, fine-tuning, and MLOps integration with Docker, MLflow, and CI/CD.
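As a hedged example of the MLflow piece of such a workflow, logging one benchmarking run's parameters and metrics to a local tracking store; the run name and values are placeholders:

```python
# Illustrative sketch: record a benchmarking run's configuration and results
# with MLflow. Names and numbers are placeholders, not real results.
import mlflow

with mlflow.start_run(run_name="resnet18_adam"):     # hypothetical run name
    mlflow.log_param("architecture", "resnet18")
    mlflow.log_param("optimizer", "adam")
    mlflow.log_metric("val_accuracy", 0.91)
    mlflow.log_metric("val_loss", 0.27)
```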
A Streamlit web app that uses a Groq-powered LLM (Llama 3) to act as an impartial judge for evaluating and comparing two model outputs. Supports custom criteria, presets like creativity and brand tone, and returns structured scores, explanations, and a winner. Built end-to-end with Python, Groq API, and Streamlit.
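A minimal sketch of an LLM-as-judge call through the Groq Python SDK, assuming GROQ_API_KEY is set in the environment; the model identifier and prompt wording are examples, not the app's exact implementation:

```python
# Illustrative sketch of an LLM-as-judge request via the Groq SDK. The model
# id and judging prompt are examples; requires GROQ_API_KEY to be set.
from groq import Groq

client = Groq()                                       # reads GROQ_API_KEY
judge_prompt = (
    "You are an impartial judge. Compare the two answers below on clarity "
    "and accuracy, then name the winner.\n\nAnswer A: ...\n\nAnswer B: ..."
)
resp = client.chat.completions.create(
    model="llama3-70b-8192",                          # example model id
    messages=[{"role": "user", "content": judge_prompt}],
    temperature=0.0,
)
print(resp.choices[0].message.content)
```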