pr-Mais

Organizations

@firebase @googlemaps @invertase @fluttercommunity @FlutterVikings

Starred repositories

Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in huggingface's transformer (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

Python 547 59 Updated May 9, 2024

Mastering Diverse Domains through World Models

Python 1,431 235 Updated Dec 7, 2024

Train transformer language models with reinforcement learning.

Python 10,428 1,345 Updated Dec 23, 2024

Schedule-Free Optimization in PyTorch

Python 2,022 69 Updated Dec 2, 2024

Fine-tune LLM agents with online reinforcement learning

Python 1,025 46 Updated Mar 19, 2024

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Python 769 47 Updated Dec 26, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,282 189 Updated Aug 11, 2024

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Python 78,942 6,761 Updated Dec 25, 2024

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python 9,397 1,734 Updated Dec 21, 2024

Powerful menu bar manager for macOS

Swift 15,265 282 Updated Oct 29, 2024
Python 1 Updated Nov 25, 2024

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,734 671 Updated Dec 24, 2024

Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" presented by Zhiheng Xi et al.

Python 80 5 Updated Feb 9, 2024

Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c…

Python 126 11 Updated Mar 18, 2024

Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.

237 10 Updated Sep 12, 2024
Python 139 15 Updated May 2, 2024

Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.

2,022 151 Updated Oct 28, 2024

Data for paper "Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness"

Python 33 6 Updated May 3, 2023
Python 45 12 Updated Oct 25, 2024

✨ A static blog template built with Astro.

Astro 1,609 362 Updated Dec 8, 2024

Sample code illustrating the VS Code extension API.

TypeScript 8,918 3,460 Updated Nov 27, 2024
Python 140 92 Updated Dec 26, 2024

🦌 Soothing pastel theme for VSCode & Azure Data Studio

TypeScript 1,487 51 Updated Dec 23, 2024

An innovative superfamily of fonts for code

TypeScript 14,601 250 Updated Dec 20, 2024

LLM101n: Let's build a Storyteller

30,656 1,673 Updated Aug 1, 2024

A Redis Plugin for GenKit that adds Redis for efficient state storage, trace storage, caching, and rate limiting.

TypeScript 6 Updated Jun 11, 2024
TypeScript 6 Updated Oct 2, 2024

Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.

Go 104,375 8,341 Updated Dec 25, 2024
TypeScript 9 Updated Jun 12, 2024

A framework for few-shot evaluation of language models.

Python 7,303 1,971 Updated Dec 25, 2024