My LLM

All about large language models

My Practice

my-alpaca: reproduce Alpaca
multi-turn-alpaca: train Alpaca on multi-turn dialogue datasets
alpaca-rlhf: train multi-turn Alpaca with RLHF (Reinforcement Learning from Human Feedback), based on DeepSpeed Chat
my-autocrit: experiments using autocrit
try-large-models: try out large models
my-rl: learn reinforcement learning using Tianshou

My Articles

ChatGPT-Techniques-Introduction-for-Everyone

Pre-train

Models

T5
- Paper
- Architecture
  - Encoder-Decoder
GPT
- Paper
  - GPT
  - GPT-2
  - GPT-3
GPT-Neo
GPT-J-6B
Megatron-11B
PanGu-α-13B
FairSeq
GLaM
- Paper
LaMDA
- Paper
Jurassic-1
- Paper
MT-NLG
- Paper
ERNIE
- Paper
Gopher
- Paper
- Conclusion
  - Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, but logical and mathematical reasoning see less benefit.
Chinchilla
- Paper
- Conclusion
  - We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling language models whilst keeping the amount of training data constant.
  - We find that for compute-optimal training, the model size and the number of training tokens should be scaled equally: for every doubling of model size the number of training tokens should also be doubled.
PaLM
- Paper
- Architecture
  - Decoder
PaLM 2
- Blog
- PaLM 2 Technical Report
OPT
- Paper
- Architecture
  - Decoder
GPT-NeoX
- Paper
- GitHub
- Architecture
  - Decoder
BLOOM
- Paper
- Architecture
  - Decoder
LLaMA
- Paper
- Model
- Architecture
  - Decoder
GLM
- Paper
  - 2022-ACL-GLM- General Language Model Pretraining with Autoregressive Blank Infilling [paper]
    - GitHub
  - 2023-ICLR-GLM-130B- An Open Bilingual Pre-trained Model [paper]
    - GitHub
    - Architecture
      - Autoregressive Blank Infilling
BloombergGPT
- Paper
MOSS
- GitHub
OpenLLaMA: An Open Reproduction of LLaMA
- GitHub
dolly
- GitHub
panda
- GitHub
- Paper
WeLM
- Paper
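The Chinchilla conclusion listed above (scale model size and training tokens equally) can be illustrated with a small calculation. This is a rough sketch, not the paper's fitting procedure. It assumes the common approximation that training compute is about 6 × parameters × tokens, and it assumes a fixed tokens-per-parameter ratio of 20, a rule of thumb consistent with Chinchilla's 70B parameters and 1.4T tokens. The function name `compute_optimal_split` and both default constants are illustrative assumptions, not something this list defines.

```python
import math


def compute_optimal_split(compute_flops, tokens_per_param=20.0, flops_per_param_token=6.0):
    """Split a training FLOP budget into (parameters, tokens).

    Assumptions (not from this README):
      * compute C is approximated as flops_per_param_token * N * D
      * the token count D is held at tokens_per_param * N
    Solving C = k * N * (r * N) for N gives N = sqrt(C / (k * r)).
    """
    n_params = math.sqrt(compute_flops / (flops_per_param_token * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    base_budget = 1e21  # hypothetical FLOP budget, chosen only for illustration
    for factor in (1, 4, 16):
        n, d = compute_optimal_split(base_budget * factor)
        print(f"{factor:>2}x compute -> {n:.3e} params, {d:.3e} tokens")
```

Under these assumptions, both quantities grow with the square root of compute, so a 4x larger budget doubles the parameter count and the token count together.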

Survey

2023-A Survey of Large Language Models [paper]

Methods

Max Sequence Length

Blog
- The Road to Upgrading the Transformer, Part 7: Length Extrapolation and Local Attention (Transformer升级之路：7、长度外推性与局部注意力)
Paper
- 2023-Scaling Transformer to 1M tokens and beyond with RMT [paper]
  - 2022-NIPS-Recurrent Memory Transformer [paper]
- 2022-Parallel Context Windows Improve In-Context Learning of Large Language Models [paper]

Position

Normalization

RMSNorm
Layer Normalization
- Pre-LN
- Post-LN
- Sandwich-LN
- DeepNorm

Activation Function

SwiGLU
GELUs
Swish

Tokenizer

BPE [paper]

Interpretability

Transformer Circuits Thread

LR Scheduler

2020-Scaling Laws for Neural Language Models [paper]

Fine-tune

Models

General

T0
- Paper
FLAN
- Paper
- GitHub
Flan-LM
- Paper
BLOOMZ & mT0
- Paper
ChatGPT
- Blog
Alpaca: A Strong, Replicable Instruction-Following Model
- Site
- GitHub
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
- GitHub
- Site
- Online Demo
Koala: A Dialogue Model for Academic Research
- Blog
- GitHub
  - Koala_data_pipeline
  - Koala Evaluation Set
alpaca-lora
- GitHub
ChatGLM-6B
- GitHub
- Blog
Firefly
- GitHub
thai-buffala-lora-7b-v0-1
- Model
multi-turn-alpaca
- GitHub
Open-Assistant
- Site
- GitHub
- Paper

Chinese

Chinese-ChatLLaMA
- GitHub
- Blog
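  - Training a Chinese LLaMA large language model (训练中文LLaMA大规模语言模型)
  - ChatLLaMA: training a Chinese dialogue model with instruction fine-tuning (ChatLLaMA：用指令微调训练中文对话大模型)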
BELLE
- GitHub
Chinese-LLaMA-Alpaca
- GitHub
Luotuo-Chinese-LLM
- GitHub
Chinese-Vicuna
- GitHub
Chinese-alpaca-lora
- GitHub

Japanese

Japanese-Alpaca-LoRA
- GitHub

Medical

2023-ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
- Paper
HuaTuo (华驼): a LLaMA model fine-tuned on Chinese medical knowledge
- GitHub

Law

LawGPT_zh: Chinese legal large language model (Xiezhi, 獬豸)
- GitHub

Recommendation

2023-RecAlpaca: Low-Rank LLaMA Instruct-Tuning for Recommendation

Other

2023-A Survey of Domain Specialization for Large Language Models [paper]

Methods

RL

2017-Proximal Policy Optimization Algorithms [paper]
- Why is the log probability replaced with the importance sampling ratio in the loss function?
2016-Asynchronous methods for deep reinforcement learning [paper]
2015-High-dimensional continuous control using generalized advantage estimation [paper]
2015-mlr-Trust Region Policy Optimization [paper]
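On the question raised under the PPO entry above: one way to see the role of importance sampling is that the batch is collected with the old policy but reused for several updates of the new one, so each sample is reweighted by the probability ratio between the two. When both policies are equal, the ratio is 1 and its gradient matches the usual log-probability policy gradient. The following is a minimal pure-Python sketch of the clipped surrogate objective. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative assumptions rather than any library's API.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: per-sample log-probabilities of the taken actions
    under the current policy and the data-collecting (old) policy.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Importance sampling ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Identical policies: every ratio is 1, so the objective is the mean advantage.
    print(ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [0.5, 1.5]))
```

The clipping removes the incentive to push the ratio far from 1 when that would increase the objective, which is what lets the same batch be reused for multiple gradient steps.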

Reward Modeling

2023-REWARD DESIGN WITH LANGUAGE MODELS [paper]
2022-Scaling Laws for Reward Model Overoptimization [paper]
autocrit
- GitHub
- reward-modeling GitHub
2023-On The Fragility of Learned Reward Functions [paper]

PEFT

2021-LoRA- Low-Rank Adaptation of Large Language Models [paper]

align

2023-Preference Ranking Optimization for Human Alignment [paper]
2023-Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization [paper]
2023-Fine-Grained Human Feedback Gives Better Rewards for Language Model Training [paper]
2023-Chain of Hindsight Aligns Language Models with Feedback [paper]
2023-Training Socially Aligned Language Models in Simulated Human Society [paper]
2023-Let’s Verify Step by Step [paper]
2023-The False Promise of Imitating Proprietary LLMs [paper]
2023-AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback [paper]
2023-LIMA- Less Is More for Alignment [paper]
2023-RRHF: Rank Responses to Align Language Models with Human Feedback without tears [paper] [code]
2022-Solving math word problems with process- and outcome-based feedback [paper]
2022-Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [paper]
2022-Training language models to follow instructions with human feedback [paper]
- GitHub
2022-Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned [paper]
2022-LaMDA- Language Models for Dialog Applications [paper]
2022-Constitutional AI- Harmlessness from AI Feedback [paper]
2021-A general language assistant as a laboratory for alignment [paper]
2021-Ethical and social risks of harm from language models [paper]
2020-nips-Learning to summarize from human feedback [paper]
2019-Fine-Tuning Language Models from Human Preferences [paper]
2018-Scalable agent alignment via reward modeling: a research direction [paper]
Reinforcement Learning for Language Models Blog
2017-nips-Deep reinforcement learning from human preferences [paper]
2016-Concrete Problems in AI Safety [paper]

Other

2022-naacl-MetaICL- Learning to Learn In Context [paper]
2022-iclr-Multitask Prompted Training Enables Zero-Shot Task Generalization [paper]

Prompt Learning

2023-Tree of Thoughts: Deliberate Problem Solving with Large Language Models [paper]
2023-Guiding Large Language Models via Directional Stimulus Prompting [paper]
2023-ICLR-Self-Consistency Improves Chain of Thought Reasoning in Language Models [paper]
2023-Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning [paper]

Survey

2021-Pre-train, Prompt, and Predict- A Systematic Survey of Prompting Methods in Natural Language Processing [paper]

Prompt Tuning

2023-Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition [paper]
2022-ACL-PPT- Pre-trained Prompt Tuning for Few-shot Learning [paper]
2022-ACL-P-Tuning- Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks [paper]
2021-EMNLP-The Power of Scale for Parameter-Efficient Prompt Tuning [paper]
2021-acl-Prefix-Tuning- Optimizing Continuous Prompts for Generation [paper]
2021-GPT Understands, Too [paper]

Integrating External Data

Tool Learning

ToolLearningPapers

Methods

2023-OpenAGI: When LLM Meets Domain Experts [paper]
2023-WebCPM: Interactive Web Search for Chinese Long-form Question Answering [paper]
2023-Evaluating Verifiability in Generative Search Engines [paper]
2023-Enabling Large Language Models to Generate Text with Citations [paper]
langchain
- GitHub
  - langchain
  - Chinese-LangChain
2023-Check Your Facts and Try Again- Improving Large Language Models with External Knowledge and Automated Feedback [paper]
2022-Teaching language models to support answers with verified quotes
2021-WebGPT: Browser-assisted question-answering with human feedback [paper]
2021-Improving language models by retrieving from trillions of tokens
2020-REALM: retrieval-augmented language model pre-training
2020-Retrieval-augmented generation for knowledge-intensive NLP tasks

Other

How can extra knowledge be added to GPT/LLM models? (如何为GPT/LLM模型添加额外知识？)

Dataset

For Pre-training

RedPajama-Data
C4
Pile
ROOTS
Wudao Corpora
Large Scale Chinese Corpus for NLP
- GitHub
CSL: A Large-scale Chinese Scientific Literature Dataset
- GitHub
Chinese book corpus collection
- GitHub
Chinese Open Instruction Generalist (COIG)
- Paper
Medical datasets
- GitHub1
Financial data
- FinNLP-GitHub
- SmoothNLP Public Financial Datasets for NLP Researches

For SFT

ChatAlpaca
- GitHub
InstructionZoo
- GitHub
FlagInstruct
fnlp/moss-002-sft-data
- Hugging Face Datasets

For Reward Model

For Evaluation

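SuperCLUE: a comprehensive evaluation benchmark for Chinese general-purpose large models
Open LLMs benchmark: a standards initiative for evaluating large-model capabilities
PromptCBLUE: an evaluation benchmark for Chinese medical large models
GLUE, SuperGLUE, SQuAD, CoQA, WMT, LAMBADA, ROUGE, CUGE (Zhiyuan Index), MMLU, HellaSwag, OpenBookQA, ARC, TriviaQA, TruthfulQA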

Methods

2023-A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity [paper]
2023-DoReMi: Optimizing data mixtures speeds up language model pretraining
2023-Data selection for language models via importance resampling
2022-SELF-INSTRUCT- Aligning Language Model with Self Generated Instructions [paper]
2022-acl-Deduplicating training data makes language models better [paper]

Evaluation

2023-Harnessing the Power of LLMs in Practice- A Survey on ChatGPT and Beyond [paper]
2023-INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models [paper]
LLMZoo: a project that provides data, models, and evaluation benchmark for large language models.
- GitHub
2023-Evaluating ChatGPT's Information Extraction Capabilities- An Assessment of Performance, Explainability, Calibration, and Faithfulness [paper]
2023-Towards Better Instruction Following Language Models for Chinese- Investigating the Impact of Training Data and Evaluation [paper]
PandaLM
lm-evaluation-harness
BIG-bench
2023-HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models [paper]
2023-C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models [paper]
2023-Safety Assessment of Chinese Large Language Models [paper]
2022-Holistic Evaluation of Language Models [paper]

Aspects

helpfulness
honesty
harmlessness
truthfulness
robustness
Bias, Toxicity and Misinformation

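Evaluation Challenges

Existing evaluations typically cover only common, established NLP tasks; the vast number of other tasks, such as writing emails, go unevaluated.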

Inference

Analysis

Pythia: Interpreting Autoregressive Transformers Across Time and Scale
- GitHub

Products

ChatGPT
ERNIE Bot (文心一言)
Tongyi Qianwen (通义千问)
AgentGPT
- GitHub
HuggingGPT
- GitHub
- Paper
AutoGPT
- GitHub
MiniGPT-4
- GitHub
- Paper
ShareGPT
- GitHub
Character.AI
- Site
LLaVA
- Paper
- Site
Video-LLaMA
- Paper
ChatPaper
- GitHub

Tools

Traditional NLP Tasks

2023-AnnoLLM- Making Large Language Models to Be Better Crowdsourced Annotators [paper]
2022-Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks [paper]

Sentiment Analysis

2023-Sentiment Analysis in the Era of Large Language Models- A Reality Check [Paper] [GitHub]
2023-Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
2023-Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
2023-LLMs to the Moon? Reddit Market Sentiment Analysis with Large Language Models

Other

Related Project

open-llms: a list of open LLMs available for commercial use.
safe-rlhf
Awesome-Multimodal-Large-Language-Models
Awesome-Chinese-LLM


lpy1/my-llm
