Commit

Update bonus section formatting (rasbt#400)
rasbt authored Oct 12, 2024
1 parent 233a3b0 commit b6c4b2f
Showing 7 changed files with 23 additions and 5 deletions.
9 changes: 8 additions & 1 deletion ch01/README.md
@@ -1,8 +1,15 @@
# Chapter 1: Understanding Large Language Models


 
## Main Chapter Code

There is no code in this chapter.

<br>

&nbsp;
## Bonus Materials

As optional bonus material, below is a video tutorial where I explain the LLM development lifecycle covered in this book:

<br>
3 changes: 2 additions & 1 deletion ch02/README.md
@@ -1,10 +1,11 @@
# Chapter 2: Working with Text Data


&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions

&nbsp;
## Bonus Materials

- [02_bonus_bytepair-encoder](02_bonus_bytepair-encoder) contains optional code to benchmark different byte pair encoder implementations
2 changes: 2 additions & 0 deletions ch03/README.md
@@ -1,9 +1,11 @@
# Chapter 3: Coding Attention Mechanisms

&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.

&nbsp;
## Bonus Materials

- [02_bonus_efficient-multihead-attention](02_bonus_efficient-multihead-attention) implements and compares different implementation variants of multihead-attention
7 changes: 5 additions & 2 deletions ch04/README.md
@@ -1,10 +1,13 @@
# Chapter 4: Implementing a GPT Model from Scratch to Generate Text

&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.

-## Optional Code
+&nbsp;
+## Bonus Materials

-- [02_performance-analysis](02_performance-analysis) contains optional code analyzing the performance of the GPT model(s) implemented in the main chapter.
+- [02_performance-analysis](02_performance-analysis) contains optional code analyzing the performance of the GPT model(s) implemented in the main chapter
- [ch05/07_gpt_to_llama](../ch05/07_gpt_to_llama) contains a step-by-step guide for converting a GPT architecture implementation to Llama 3.2 and loads pretrained weights from Meta AI (it might be interesting to look at alternative architectures after completing chapter 4, but you can also save that for after reading chapter 5)

2 changes: 2 additions & 0 deletions ch05/README.md
@@ -1,9 +1,11 @@
# Chapter 5: Pretraining on Unlabeled Data

&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

&nbsp;
## Bonus Materials

- [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative places in case the model weights become unavailable from OpenAI
3 changes: 2 additions & 1 deletion ch06/README.md
@@ -1,10 +1,11 @@
# Chapter 6: Finetuning for Classification


&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

&nbsp;
## Bonus Materials

- [02_bonus_additional-experiments](02_bonus_additional-experiments) includes additional experiments (e.g., training the last vs first token, extending the input length, etc.)
2 changes: 2 additions & 0 deletions ch07/README.md
@@ -1,9 +1,11 @@
# Chapter 7: Finetuning to Follow Instructions

&nbsp;
## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions

&nbsp;
## Bonus Materials

- [02_dataset-utilities](02_dataset-utilities) contains utility code that can be used for preparing an instruction dataset
