jon-tow

Organizations

@EleutherAI

Supporting PyTorch FSDP for optimizers

Python 84 4 Updated Dec 8, 2024
Python 519 45 Updated Sep 15, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 160 14 Updated Jun 27, 2025

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,808 199 Updated Jan 16, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,710 271 Updated Jul 18, 2025

[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale

Python 263 19 Updated Jul 8, 2025

A PyTorch native platform for training generative AI models

Python 4,389 511 Updated Sep 15, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,906 788 Updated Sep 15, 2025

A collection of tricks and tools to speed up transformer models

TeX 178 10 Updated Sep 8, 2025

Official implementation of ACL 2025 Findings paper "Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts" (As Huggingface Daily Papers: https://huggingface.co/pape…

Python 85 5 Updated Sep 6, 2025

Minimalistic large language model 3D-parallelism training

Python 2,200 239 Updated Sep 3, 2025
Python 4 Updated Apr 4, 2024

All-in-one text de-duplication

Python 714 73 Updated Aug 31, 2025

NCCL Tests

Cuda 1,260 316 Updated Sep 5, 2025

Microsoft Automatic Mixed Precision Library

Python 619 48 Updated Sep 29, 2024
Go 5 Updated Oct 1, 2023

data cleaning and curation for unstructured text

Python 328 17 Updated Aug 6, 2024

Checkpointable dataset utilities for foundation model training

Python 32 5 Updated Jan 29, 2024

Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 17,850 2,608 Updated Sep 10, 2025

Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates

Jupyter Notebook 463 40 Updated Apr 21, 2024

YaRN: Efficient Context Window Extension of Large Language Models

Python 1,606 128 Updated Apr 17, 2024

Python 3.8+ toolbox for submitting jobs to Slurm

Python 1,503 140 Updated May 21, 2025
Rust 509 37 Updated Apr 11, 2025
Python 1,065 151 Updated Sep 2, 2025

Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript

Go 600 20 Updated Jul 2, 2024

Erasing concepts from neural representations with provable guarantees

Python 233 15 Updated Jan 27, 2025

Landmark Attention: Random-Access Infinite Context Length for Transformers

Python 425 35 Updated Dec 20, 2023