🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
All code, created and optimized for best results, from the SuperDataScience Course
This repository implements machine learning methods ranging from simple to complex, written in a reusable template style.
Bandit algorithms
Offline evaluation of multi-armed bandit algorithms
Bayesian Optimization for Categorical and Continuous Inputs
Implementations of basic concepts under the Reinforcement Learning umbrella. This project is a collection of assignments from CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay
Author's implementation of the paper Correlated Age-of-Information Bandits.
Thompson Sampling for Bandits using UCB policy
Repository comparing two methods for dealing with the exploration-exploitation dilemma in multi-armed bandits
Different implementations of Bayesian neural networks for uncertainty estimation. The uncertainty estimation is utilized for efficient exploration in reinforcement learning.
Code and templates for ML algorithms created, modified, and optimized in Python and R.
An improved version of the TuRBO algorithm for the black-box optimization competition organized at NeurIPS 2020
A multi-armed bandit (MAB) simulation library in Python
The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025
TSRoots: A Python package for efficient Gaussian process Thompson sampling in Bayesian optimization via rootfinding.
Foundations Of Intelligent Learning Agents (FILA) Assignments
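Many of the repositories above implement Thompson sampling for multi-armed bandits. As a point of reference, here is a minimal sketch of the Bernoulli-bandit variant, assuming Beta(1, 1) priors and purely illustrative arm probabilities (the function name and parameters are our own, not taken from any repository listed):

```python
import random

def thompson_sampling(true_probs, n_rounds=10000, seed=0):
    """Bernoulli Thompson sampling with independent Beta(1, 1) priors per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 for each arm
    failures = [0] * n_arms   # observed rewards of 0 for each arm
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior and play the argmax.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return total_reward, successes

total, successes = thompson_sampling([0.2, 0.5, 0.8])
```

Posterior sampling handles the exploration-exploitation dilemma automatically: arms with uncertain posteriors occasionally produce large samples and get explored, while the empirically best arm is exploited increasingly often as its posterior concentrates.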