🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
All code, created and optimized for best results, from the SuperDataScience Course
This repository implements machine learning methods ranging from simple to complex, written in a reusable template style.
Bandit algorithms
Offline evaluation of multi-armed bandit algorithms
Bayesian Optimization for Categorical and Continuous Inputs
Implementations of basic concepts under the Reinforcement Learning umbrella. This project is a collection of assignments from CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay
Author's implementation of the paper Correlated Age-of-Information Bandits.
Thompson Sampling for Bandits using UCB policy
Repository comparing two methods for dealing with the exploration-exploitation dilemma in multi-armed bandits
Different implementations of Bayesian neural networks for uncertainty estimation. The uncertainty estimation is utilized for efficient exploration in reinforcement learning.
Code and templates for ML algorithms created, modified, and optimized in Python and R.
An improved version of the TuRBO algorithm for the black-box optimization competition organized at NeurIPS 2020
A multi-armed bandit (MAB) simulation library in Python
The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025
TSRoots: A Python package for efficient Gaussian process Thompson sampling in Bayesian optimization via rootfinding.
Foundations Of Intelligent Learning Agents (FILA) Assignments
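Many of the repositories above implement Thompson sampling for multi-armed bandits. As a point of reference, here is a minimal sketch of the Bernoulli-bandit variant, assuming Beta(1, 1) priors and purely illustrative arm probabilities (the function name and parameters are our own, not taken from any repository listed):

```python
import random

def thompson_sampling(true_probs, n_rounds=10000, seed=0):
    """Bernoulli Thompson sampling with independent Beta(1, 1) priors per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 for each arm
    failures = [0] * n_arms   # observed rewards of 0 for each arm
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior and play the argmax.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return total_reward, successes

total, successes = thompson_sampling([0.2, 0.5, 0.8])
```

Posterior sampling handles the exploration-exploitation dilemma automatically: arms with uncertain posteriors occasionally produce large samples and get explored, while the empirically best arm is exploited increasingly often as its posterior concentrates.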