Starred repositories
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
A quick guide to trending instruction fine-tuning datasets
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Code for the paper "Language Models are Unsupervised Multitask Learners"
MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40 TB of data used to train ChatGPT. MNBVC covers not only mainstream culture but also niche subcultures and even "Martian script" internet slang. It includes plain-text Chinese data of every kind: news, essays, novels, books, magazines, papers, scripts and subtitles, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
Chinese dictionaries (simplified / traditional). Chinese / Chinese-English dictionaries.
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?
A curation of awesome tools, documents and projects about LLM Security.
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Official Repository for "The Curious Case of Neural Text Degeneration"
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
12 Weeks, 24 Lessons, AI for All!
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge managemen…
Secrets of RLHF in Large Language Models Part I: PPO
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
jkiss / sensitive-words
Forked from fwwdn/sensitive-stop-words. A lexicon of sensitive words commonly used on the Chinese internet.
A high-throughput and memory-efficient inference and serving engine for LLMs
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Shadowrocket (小火箭) configuration files, modules, and scripts (module / sgmodule), with illustrated tutorials, rules, traffic-splitting configs, cracking, and unlocking.
Train transformer language models with reinforcement learning.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Examples and guides for using the OpenAI API
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, covering base models, vertical-domain fine-tunes and applications, datasets, and tutorials.
Forum for discussing Internet censorship circumvention
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…
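The tiktoken entry above describes a fast BPE (byte-pair encoding) tokenizer. As a minimal illustration of the core BPE idea that such tokenizers rely on — repeatedly merging a chosen adjacent symbol pair into a single token — here is a stdlib-only toy sketch; the merge pair and the resulting token are made-up examples, not tiktoken's actual vocabulary or API:

```python
def bpe_merge(tokens, pair, new_token):
    """Replace every non-overlapping adjacent occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)  # the pair is fused into one symbol
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# Toy example: start from individual characters and apply one merge rule.
tokens = list("hello")                      # ['h', 'e', 'l', 'l', 'o']
merged = bpe_merge(tokens, ("l", "l"), "ll")
print(merged)                               # ['h', 'e', 'll', 'o']
```

A real BPE tokenizer learns an ordered list of such merge rules from corpus statistics and applies them greedily; tiktoken additionally operates on bytes rather than characters, so it can encode any input string.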