Commit d920b00

add table of content
1 parent fe35f3f commit d920b00

1 file changed

README.md

Lines changed: 48 additions & 20 deletions
@@ -9,6 +9,32 @@ A curated (still actively updated) list of practical guide resources of LLMs. It
 
 These sources aim to help practitioners navigate the vast landscape of large language models (LLMs) and their applications in natural language processing (NLP). If you find any resources in our repository helpful, please feel free to use them (and don't forget to cite our paper!)
 
+* [The Practical Guides for Large Language Models](#the-practical-guides-for-large-language-models-)
+* [Latest News💥](#latest-news)
+* [Other Practical Guides for LLMs](#other-practical-guides-for-llms)
+* [Practical Guide for Models](#practical-guide-for-models)
+* [BERT-style Language Models: Encoder-Decoder or Encoder-only](#bert-style-language-models-encoder-decoder-or-encoder-only)
+* [GPT-style Language Models: Decoder-only](#gpt-style-language-models-decoder-only)
+* [Practical Guide for Data](#practical-guide-for-data)
+* [Pretraining data](#pretraining-data)
+* [Finetuning data](#finetuning-data)
+* [Test data/user data](#test-datauser-data)
+* [Practical Guide for NLP Tasks](#practical-guide-for-nlp-tasks)
+* [Traditional NLU tasks](#traditional-nlu-tasks)
+* [Generation tasks](#generation-tasks)
+* [Knowledge-intensive tasks](#knowledge-intensive-tasks)
+* [Abilities with Scaling](#abilities-with-scaling)
+* [Specific tasks](#specific-tasks)
+* [Real-World ''Tasks''](#real-world-tasks)
+* [Efficiency](#efficiency)
+* [Trustworthiness](#trustworthiness)
+* [Benchmark Instruction Tuning](#benchmark-instruction-tuning)
+* [Alignment](#alignment)
+* [Safety Alignment (Harmless)](#safety-alignment-harmless)
+* [Truthfulness Alignment (Honest)](#truthfulness-alignment-honest)
+* [Practical Guides for Prompting (Helpful)](#practical-guides-for-prompting-helpful)
+* [Alignment Efforts of Open-source Community](#alignment-efforts-of-open-source-community)
+
 ## Latest News💥
 - We used PowerPoint to plot the figure and released the source file [pptx](./source/figure_gif.pptx) for our GIF figure. [4/27/2023]
 - We released the source file for the still version [pptx](./source/figure_still.pptx), and replaced the figure in this repo with the still version. [4/29/2023]
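The table of contents added in this hunk links to GitHub's auto-generated heading anchors (note how "Test data/user data" becomes `#test-datauser-data`, with the slash dropped). Below is a minimal sketch of that slugification, assuming a simplified rule set; the helper name `github_anchor` and the regex are our illustration, not GitHub's actual slugger, which also handles edge cases such as duplicate headings.

```python
import re

def github_anchor(heading: str) -> str:
    """Approximate the anchor GitHub generates for a Markdown heading.

    Hypothetical helper: lowercase the text, drop every character that is
    not a word character, space, or hyphen, then turn spaces into hyphens.
    GitHub's real slugger also de-duplicates repeated headings, which this
    sketch ignores.
    """
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # strips punctuation, slashes, emoji
    return slug.replace(" ", "-")

# Reproduces the anchors used in the new table of contents, e.g.:
#   github_anchor("Test data/user data")         -> "test-datauser-data"
#   github_anchor("Safety Alignment (Harmless)") -> "safety-alignment-harmless"
#   github_anchor("Latest News💥")               -> "latest-news"
```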
@@ -26,6 +52,11 @@ We welcome pull requests to refine this figure, and if you find the source helpf
   primaryClass={cs.CL}
 }
 ```
+## Other Practical Guides for LLMs
+
+- **Why did all of the public reproduction of GPT-3 fail? In which tasks should we use GPT-3.5/ChatGPT?** 2023, [Blog](https://jingfengyang.github.io/gpt)
+- **Building LLM applications for production**, 2023, [Blog](https://huyenchip.com/2023/04/11/llm-engineering.html)
+- **Data-centric Artificial Intelligence**, 2023, [Repo](https://github.com/daochenzha/data-centric-AI)/[Blog](https://towardsdatascience.com/what-are-the-data-centric-ai-concepts-behind-gpt-models-a590071bb727)/[Paper](https://arxiv.org/abs/2303.10158)
 
 ## Practical Guide for Models
 
@@ -35,11 +66,6 @@ We build an evolutionary tree of modern Large Language Models (LLMs) to trace th
 <img width="600" src="./imgs/models-colorgrey.jpg"/>
 </p>
 
-### Other Practical Guides for LLMs
-- Why did all of the public reproduction of GPT-3 fail? In which tasks should we use GPT-3.5/ChatGPT? 2023, [Blog](https://jingfengyang.github.io/gpt)
-- Building LLM applications for production, 2023, [Blog](https://huyenchip.com/2023/04/11/llm-engineering.html)
-- Data-centric Artificial Intelligence, 2023, [Repo](https://github.com/daochenzha/data-centric-AI)/[Blog](https://towardsdatascience.com/what-are-the-data-centric-ai-concepts-behind-gpt-models-a590071bb727)/[Paper](https://arxiv.org/abs/2303.10158)
-
 ### BERT-style Language Models: Encoder-Decoder or Encoder-only
 
 - BERT **BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding**, 2018, [Paper](https://aclanthology.org/N19-1423.pdf)
@@ -48,9 +74,9 @@ We build an evolutionary tree of modern Large Language Models (LLMs) to trace th
 - ALBERT **ALBERT: A Lite BERT for Self-supervised Learning of Language Representations**, 2019, [Paper](https://arxiv.org/abs/1909.11942)
 - UniLM **Unified Language Model Pre-training for Natural Language Understanding and Generation**, 2019, [Paper](https://arxiv.org/abs/1905.03197)
 - ELECTRA **ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators**, 2020, [Paper](https://openreview.net/pdf?id=r1xMH1BtvB)
-- T5 **"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"**. *Colin Raffel et al.* JMLR 2019. [Paper](https://arxiv.org/abs/1910.10683)]
-- GLM **"GLM-130B: An Open Bilingual Pre-trained Model"**. 2022. [Paper](https://arxiv.org/abs/2210.02414)]
-- AlexaTM **"AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model"**. *Saleh Soltan et al.* arXiv 2022. [Paper](https://arxiv.org/abs/2208.01448)]
+- T5 **"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"**. *Colin Raffel et al.* JMLR 2019. [Paper](https://arxiv.org/abs/1910.10683)
+- GLM **"GLM-130B: An Open Bilingual Pre-trained Model"**. 2022. [Paper](https://arxiv.org/abs/2210.02414)
+- AlexaTM **"AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model"**. *Saleh Soltan et al.* arXiv 2022. [Paper](https://arxiv.org/abs/2208.01448)
 - ST-MoE **ST-MoE: Designing Stable and Transferable Sparse Expert Models**, 2022, [Paper](https://arxiv.org/abs/2202.08906)
 
 
@@ -61,7 +87,7 @@ We build an evolutionary tree of modern Large Language Models (LLMs) to trace th
 - GPT-3 **"Language Models are Few-Shot Learners"**. NeurIPS 2020. [Paper](https://arxiv.org/abs/2005.14165)
 - OPT **"OPT: Open Pre-trained Transformer Language Models"**. 2022. [Paper](https://arxiv.org/abs/2205.01068)
 - PaLM **"PaLM: Scaling Language Modeling with Pathways"**. *Aakanksha Chowdhery et al.* arXiv 2022. [Paper](https://arxiv.org/abs/2204.02311)
-- BLOOM **"BLOOM: A 176B-Parameter Open-Access Multilingual Language Model"**. 2022. [Paper](https://arxiv.org/abs/2211.05100)]
+- BLOOM **"BLOOM: A 176B-Parameter Open-Access Multilingual Language Model"**. 2022. [Paper](https://arxiv.org/abs/2211.05100)
 - MT-NLG **"Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model"**. 2021. [Paper](https://arxiv.org/abs/2201.11990)
 - GLaM **"GLaM: Efficient Scaling of Language Models with Mixture-of-Experts"**. ICML 2022. [Paper](https://arxiv.org/abs/2112.06905)
 - Gopher **"Scaling Language Models: Methods, Analysis & Insights from Training Gopher"**. 2021. [Paper](http://arxiv.org/abs/2112.11446v2)
@@ -79,6 +105,7 @@ We build an evolutionary tree of modern Large Language Models (LLMs) to trace th
 
 
 ### Pretraining data
+- **RedPajama**, 2023. [Repo](https://github.com/togethercomputer/RedPajama-Data)
 - **The Pile: An 800GB Dataset of Diverse Text for Language Modeling**, arXiv 2020. [Paper](https://arxiv.org/abs/2101.00027)
 - **How does the pre-training objective affect what large language models learn about linguistic properties?**, ACL 2022. [Paper](https://aclanthology.org/2022.acl-short.16/)
 - **Scaling laws for neural language models**, 2020. [Paper](https://arxiv.org/abs/2001.08361)
@@ -228,20 +255,21 @@ We build a decision flow for choosing LLMs or fine-tuned models~\protect\footnot
 
 #### Practical Guides for Prompting (Helpful)
 
-- OpenAI Cookbook. [Blog](https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md)
-- Prompt Engineering. [Blog](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
-- ChatGPT Prompt Engineering for Developers! [Course](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/)
+- **OpenAI Cookbook**. [Blog](https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md)
+- **Prompt Engineering**. [Blog](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
+- **ChatGPT Prompt Engineering for Developers!** [Course](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/)
 
 #### Alignment Efforts of Open-source Community
 
 - **Self-Instruct: Aligning Language Model with Self Generated Instructions**, arXiv 2022, [Paper](https://arxiv.org/abs/2212.10560)
-- Alpaca. [Repo](https://github.com/tatsu-lab/stanford_alpaca)
-- Vicuna. [Repo](https://github.com/lm-sys/FastChat)
-- Dolly. [Blog](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm)
-- DeepSpeed-Chat. [Blog](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)
-- GPT4All. [Repo](https://github.com/nomic-ai/gpt4all)
-- OpenAssistant. [Repo](https://github.com/LAION-AI/Open-Assistant)
-- ChatGLM. [Repo](https://github.com/THUDM/ChatGLM-6B)
-- MOSS. [Repo](https://github.com/OpenLMLab/MOSS)
+- **Alpaca**. [Repo](https://github.com/tatsu-lab/stanford_alpaca)
+- **Vicuna**. [Repo](https://github.com/lm-sys/FastChat)
+- **Dolly**. [Blog](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm)
+- **DeepSpeed-Chat**. [Blog](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)
+- **GPT4All**. [Repo](https://github.com/nomic-ai/gpt4all)
+- **OpenAssistant**. [Repo](https://github.com/LAION-AI/Open-Assistant)
+- **ChatGLM**. [Repo](https://github.com/THUDM/ChatGLM-6B)
+- **MOSS**. [Repo](https://github.com/OpenLMLab/MOSS)
+- **Lamini**. [Repo](https://github.com/lamini-ai/lamini/)/[Blog](https://lamini.ai/blog/introducing-lamini)
 
 