Stars
Supporting PyTorch FSDP for optimizers
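A minimal sketch of what sharded optimizer state under PyTorch FSDP looks like (the toy wrapper, learning rate, and function name are placeholders, and a process group is assumed to be initialized already):

    # Sketch: sharding optimizer state with PyTorch FSDP. Assumes
    # torch.distributed is already initialized; names/lr are placeholders.
    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_sharded_optimizer(model: nn.Module):
        model = FSDP(model)  # parameters are sharded across ranks
        # Building the optimizer *after* wrapping means its state (e.g. Adam
        # moments) is allocated per shard rather than per full parameter.
        optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
        return model, optimizer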
EleutherAI / nanoGPT-mup
Forked from karpathy/nanoGPT. The simplest, fastest repository for training/finetuning medium-sized GPTs.
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
A PyTorch native platform for training generative AI models
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
A collection of tricks and tools to speed up transformer models
Official implementation of the ACL 2025 Findings paper "Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts" (featured in Hugging Face Daily Papers: https://huggingface.co/pape…
Minimalistic large language model 3D-parallelism training
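3D parallelism factors the world size into data-, tensor-, and pipeline-parallel axes; a sketch of that factorization (the function and axis ordering are illustrative, not this repo's API):

    # Illustrative only: how a 3D-parallel layout maps a flat rank to
    # data / tensor / pipeline coordinates. Not the repo's API.
    def rank_to_coords(rank: int, dp: int, tp: int, pp: int):
        assert 0 <= rank < dp * tp * pp
        tp_rank = rank % tp                  # fastest-varying axis
        pp_rank = (rank // tp) % pp
        dp_rank = rank // (tp * pp)          # slowest-varying axis
        return dp_rank, tp_rank, pp_rank

    # e.g. 8 GPUs split as dp=2, tp=2, pp=2: rank 5 -> (dp=1, tp=1, pp=0)
    print(rank_to_coords(5, dp=2, tp=2, pp=2))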
Data cleaning and curation for unstructured text
Checkpointable dataset utilities for foundation model training
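A hedged sketch of what "checkpointable" means here: an iterable dataset that can save and restore its read position so training resumes mid-epoch (the class and method names are illustrative, not this repo's):

    # Illustrative sketch of a checkpointable dataset; names are placeholders.
    from torch.utils.data import IterableDataset

    class ResumableDataset(IterableDataset):
        def __init__(self, samples):
            self.samples = samples
            self.position = 0  # how many samples have been consumed

        def __iter__(self):
            while self.position < len(self.samples):
                yield self.samples[self.position]
                self.position += 1

        def state_dict(self):
            return {"position": self.position}

        def load_state_dict(self, state):
            self.position = state["position"]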
Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
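Per the paper's title, the core idea is that repeatedly training, merging, and resetting a low-rank adapter accumulates a high-rank total update; a simplified sketch of that loop (not the official implementation):

    # Simplified sketch of the ReLoRA idea: a low-rank delta (B @ A) is
    # trained, periodically merged into the frozen base weight, then
    # re-initialized, so the accumulated update can exceed the rank of any
    # single adapter. Not the repo's code.
    import torch
    import torch.nn as nn

    class ReLoRALinear(nn.Module):
        def __init__(self, d_in, d_out, rank=8):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.02,
                                       requires_grad=False)  # frozen base
            self.A = nn.Parameter(torch.randn(rank, d_in) * 0.02)
            self.B = nn.Parameter(torch.zeros(d_out, rank))

        def forward(self, x):
            return x @ (self.weight + self.B @ self.A).t()

        @torch.no_grad()
        def merge_and_reset(self):
            self.weight += self.B @ self.A     # fold the low-rank update in
            nn.init.normal_(self.A, std=0.02)  # restart the adapter
            nn.init.zeros_(self.B)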
YaRN: Efficient Context Window Extension of Large Language Models
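Methods in this family extend the context window by rescaling rotary (RoPE) position frequencies; a simplified sketch of plain linear position interpolation, which illustrates the mechanism but not YaRN's more selective per-band scheme:

    # Simplified illustration of RoPE-based context extension: stretch
    # positions by `scale` before computing rotary angles. This is plain
    # position interpolation, not YaRN's exact formula.
    import torch

    def rope_angles(positions: torch.Tensor, dim: int, scale: float = 1.0,
                    base: float = 10000.0) -> torch.Tensor:
        # inv_freq[i] = base^(-2i/dim), the standard RoPE frequencies
        inv_freq = base ** (-torch.arange(0, dim, 2).float() / dim)
        # dividing positions by `scale` squeezes a longer context into the
        # position range the model was pretrained on
        return torch.outer(positions.float() / scale, inv_freq)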
Python 3.8+ toolbox for submitting jobs to Slurm
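A minimal usage sketch of this toolbox (submitit); the log folder, Slurm parameters, and toy function are placeholders:

    # Minimal submitit usage; folder and Slurm parameters are placeholders.
    import submitit

    def train(lr: float) -> float:
        return lr * 2  # stand-in for real work

    executor = submitit.AutoExecutor(folder="submitit_logs")
    executor.update_parameters(timeout_min=60, slurm_partition="dev")
    job = executor.submit(train, 3e-4)
    print(job.result())  # blocks until the Slurm job finishes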
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & JavaScript
Erasing concepts from neural representations with provable guarantees
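The simplest member of this family is linear erasure: projecting activations onto the orthogonal complement of a concept direction. A toy sketch of that projection, not the repo's provably optimal method:

    # Toy linear concept erasure: remove the component of each activation
    # along a single concept direction. The repo's method carries formal
    # guarantees; this shows only the basic projection idea.
    import torch

    def erase_direction(acts: torch.Tensor, direction: torch.Tensor):
        u = direction / direction.norm()        # unit concept direction, shape (d,)
        return acts - torch.outer(acts @ u, u)  # project onto u's orthogonal complement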
Landmark Attention: Random-Access Infinite Context Length for Transformers