Run on RTX 4090 #38

Closed
wants to merge 413 commits into from
3fc7613
.
KellerJordan Oct 21, 2024
179023e
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Oct 21, 2024
df1bc7a
Update README.md
KellerJordan Oct 21, 2024
5280949
.
KellerJordan Oct 21, 2024
48fdba5
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Oct 21, 2024
d9249c6
Update README.md
KellerJordan Oct 21, 2024
35ddcd8
Update README.md
KellerJordan Oct 21, 2024
8c8e223
Update README.md
KellerJordan Oct 21, 2024
6143123
Update README.md
KellerJordan Oct 21, 2024
948c146
Update README.md
KellerJordan Oct 21, 2024
1bb361b
Update README.md
KellerJordan Oct 21, 2024
f1311a6
.
KellerJordan Oct 21, 2024
e5d0f4c
.
KellerJordan Oct 21, 2024
5984843
.
KellerJordan Oct 21, 2024
fc41694
.
KellerJordan Oct 21, 2024
a7a064a
.
KellerJordan Oct 21, 2024
f0bc125
.
KellerJordan Oct 21, 2024
3b74220
.
KellerJordan Oct 21, 2024
ffa1868
.
KellerJordan Oct 21, 2024
3ba77ea
Update README.md
KellerJordan Oct 21, 2024
0db7cb4
Update README.md
KellerJordan Oct 21, 2024
4e83467
Update README.md
KellerJordan Oct 21, 2024
e319979
.
KellerJordan Oct 21, 2024
c24926d
.
KellerJordan Oct 21, 2024
32a8bbf
.
KellerJordan Oct 21, 2024
7843fc6
.
KellerJordan Oct 21, 2024
ceff3ea
.
KellerJordan Oct 21, 2024
6c7773a
.
KellerJordan Oct 21, 2024
48d9509
Update README.md
KellerJordan Oct 21, 2024
90964a9
Update README.md
KellerJordan Oct 21, 2024
433c404
Update README.md
KellerJordan Oct 21, 2024
9586bd4
Update README.md
KellerJordan Oct 21, 2024
96e1b63
Update README.md
KellerJordan Oct 21, 2024
73b7b6c
release
KellerJordan Oct 21, 2024
58deae4
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Oct 21, 2024
41a652d
Update README.md
KellerJordan Oct 21, 2024
16e4c2d
Update README.md
KellerJordan Oct 21, 2024
94ebcb3
Update README.md
KellerJordan Oct 21, 2024
148961b
.
KellerJordan Oct 21, 2024
0c2d819
.
KellerJordan Oct 21, 2024
8ef2d1c
Revert "release"
KellerJordan Oct 21, 2024
03a011c
.
KellerJordan Oct 21, 2024
3ec19b3
.
KellerJordan Oct 21, 2024
172643e
Update README.md
KellerJordan Oct 21, 2024
e48f4da
Update README.md
KellerJordan Oct 21, 2024
bcfd051
Update cached_fineweb10B.py
KellerJordan Oct 25, 2024
3323e87
.
KellerJordan Oct 25, 2024
a320f61
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Oct 25, 2024
98de489
.
KellerJordan Oct 25, 2024
11d0733
.
KellerJordan Oct 29, 2024
ed1a431
.
KellerJordan Oct 29, 2024
9971d81
.
KellerJordan Oct 29, 2024
26dd7e0
.
KellerJordan Oct 29, 2024
9813521
.
KellerJordan Oct 29, 2024
4aa7d82
.
KellerJordan Oct 29, 2024
46b665d
.
KellerJordan Oct 29, 2024
48a8961
.
KellerJordan Oct 29, 2024
7f32cbc
.
KellerJordan Oct 29, 2024
45363bf
.
KellerJordan Oct 29, 2024
7218c45
.
KellerJordan Oct 30, 2024
742e3b6
.
KellerJordan Oct 30, 2024
4260cfd
.
KellerJordan Oct 30, 2024
208f42f
.
KellerJordan Oct 30, 2024
8273670
.
KellerJordan Oct 30, 2024
be013f5
.
KellerJordan Oct 30, 2024
4721e05
.
KellerJordan Oct 30, 2024
e5a486d
.
KellerJordan Oct 30, 2024
391948b
.
KellerJordan Oct 30, 2024
00878ba
.
KellerJordan Oct 30, 2024
40e7d17
.
KellerJordan Oct 30, 2024
76159e1
.
KellerJordan Oct 30, 2024
0e7c2e3
.
KellerJordan Oct 30, 2024
982ba39
.
KellerJordan Oct 30, 2024
b17c986
.
KellerJordan Oct 30, 2024
e0df6c0
.
KellerJordan Oct 30, 2024
a65bc53
.
KellerJordan Oct 30, 2024
6762fc7
.
KellerJordan Oct 30, 2024
cda3cdc
.
KellerJordan Oct 30, 2024
ad4fdc3
.
KellerJordan Nov 3, 2024
f505639
.
KellerJordan Nov 3, 2024
5da232a
.
KellerJordan Nov 3, 2024
0de0eef
.
KellerJordan Nov 3, 2024
4eaa595
.
KellerJordan Nov 3, 2024
0ee2f9a
.
KellerJordan Nov 3, 2024
b723811
.
KellerJordan Nov 3, 2024
ec9587c
.
KellerJordan Nov 3, 2024
a8787ad
.
KellerJordan Nov 3, 2024
0f464f3
.
KellerJordan Nov 4, 2024
58279bf
.
KellerJordan Nov 4, 2024
bba4bb9
.
KellerJordan Nov 4, 2024
ac2c3af
.
KellerJordan Nov 4, 2024
3ca6967
.
KellerJordan Nov 4, 2024
386553b
add MIT license
KellerJordan Nov 5, 2024
0d661bd
.
KellerJordan Nov 5, 2024
8fbfa55
.
KellerJordan Nov 5, 2024
81ded9f
.
KellerJordan Nov 5, 2024
e67296d
.
KellerJordan Nov 6, 2024
a6a86ab
.
KellerJordan Nov 6, 2024
1dcad7c
.
KellerJordan Nov 6, 2024
8bf3e08
.
KellerJordan Nov 6, 2024
cb851a1
.
KellerJordan Nov 6, 2024
8ff35a0
.
KellerJordan Nov 6, 2024
e46062d
.
KellerJordan Nov 6, 2024
912481e
.
KellerJordan Nov 6, 2024
71dafe7
.
KellerJordan Nov 6, 2024
0bee414
.
KellerJordan Nov 6, 2024
e3d0a8d
.
KellerJordan Nov 7, 2024
e01b457
.
KellerJordan Nov 7, 2024
319a23e
Update README.md
KellerJordan Nov 7, 2024
9946a2e
Update README.md
KellerJordan Nov 7, 2024
bc75dfa
Update README.md
KellerJordan Nov 8, 2024
453323e
Update README.md
KellerJordan Nov 8, 2024
a5622e5
Update README.md
KellerJordan Nov 8, 2024
42e26ee
Update README.md
KellerJordan Nov 8, 2024
d42297d
Update README.md
KellerJordan Nov 8, 2024
832b03f
Update README.md
KellerJordan Nov 8, 2024
7397995
Update README.md
KellerJordan Nov 8, 2024
802d76b
Update README.md
KellerJordan Nov 8, 2024
b18a911
Update README.md
KellerJordan Nov 8, 2024
b3aafcc
Update README.md
KellerJordan Nov 8, 2024
37b4787
Update README.md
KellerJordan Nov 8, 2024
0b80ac3
Update README.md
KellerJordan Nov 8, 2024
d2e3950
Update README.md
KellerJordan Nov 8, 2024
596917d
Update README.md
KellerJordan Nov 8, 2024
303e096
Update README.md
KellerJordan Nov 8, 2024
b0ad2c1
Update README.md
KellerJordan Nov 8, 2024
caf5c94
Update README.md
KellerJordan Nov 8, 2024
e21905f
Update README.md
KellerJordan Nov 8, 2024
0157a47
Update README.md
KellerJordan Nov 8, 2024
d22c770
Update README.md
KellerJordan Nov 8, 2024
0eb6d4b
Update README.md
KellerJordan Nov 8, 2024
e134368
Update README.md
KellerJordan Nov 8, 2024
8c2252f
.
KellerJordan Nov 9, 2024
a0dcbfd
.
KellerJordan Nov 9, 2024
c7bc6dc
.
KellerJordan Nov 9, 2024
8317279
Update README.md
KellerJordan Nov 9, 2024
1ea9c05
.
KellerJordan Nov 9, 2024
a8d7654
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 9, 2024
cd6b75e
.
KellerJordan Nov 9, 2024
088db86
Update README.md
KellerJordan Nov 9, 2024
7b2ca87
Update README.md
KellerJordan Nov 9, 2024
096e59f
Update README.md
KellerJordan Nov 9, 2024
a598325
Update README.md
KellerJordan Nov 9, 2024
61955b1
.
KellerJordan Nov 9, 2024
a86a2b5
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 9, 2024
fe8c862
.
KellerJordan Nov 9, 2024
4d772e7
.
KellerJordan Nov 9, 2024
ab0eb61
.
KellerJordan Nov 10, 2024
aa97945
.
KellerJordan Nov 10, 2024
c6ea6f3
Update README.md
KellerJordan Nov 10, 2024
ce12a1f
.
KellerJordan Nov 10, 2024
e2d099c
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 10, 2024
d7b1cd9
Update README.md
KellerJordan Nov 10, 2024
d52aafe
Update README.md
KellerJordan Nov 10, 2024
5364fa9
.
KellerJordan Nov 11, 2024
bcc607a
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 11, 2024
49e5bff
.
KellerJordan Nov 11, 2024
b473be6
.
KellerJordan Nov 11, 2024
6d050c5
.
KellerJordan Nov 11, 2024
d74dc46
.
KellerJordan Nov 11, 2024
a4b40a5
.
KellerJordan Nov 11, 2024
7c11b5e
.
KellerJordan Nov 11, 2024
b29a05a
Update README.md
KellerJordan Nov 11, 2024
78b1eee
.
KellerJordan Nov 11, 2024
2d58c1a
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 11, 2024
5cab023
.
KellerJordan Nov 11, 2024
3b15911
.
KellerJordan Nov 11, 2024
1f690d7
.
KellerJordan Nov 11, 2024
458a02d
.
KellerJordan Nov 11, 2024
599e345
.
KellerJordan Nov 11, 2024
917332c
Dockerfile and update train_gpt2.py to most recent record
bluecoconut Nov 12, 2024
0cca2ac
.
bluecoconut Nov 12, 2024
b3c41c7
actually, only do 1 change at once
bluecoconut Nov 12, 2024
ecbc001
replace iframe tag with image link to `Initial D - Deja vu`
dantetemplar Nov 13, 2024
2ba0e19
Merge pull request #25 from bluecoconut/master
KellerJordan Nov 13, 2024
4aedac9
Merge pull request #27 from dantetemplar/master
KellerJordan Nov 13, 2024
5c6f1ba
Update README.md
KellerJordan Nov 13, 2024
6972285
Update README.md
KellerJordan Nov 13, 2024
459bd85
Update README.md
KellerJordan Nov 13, 2024
8179d34
Update README.md
KellerJordan Nov 13, 2024
f01b7ec
Update README.md
KellerJordan Nov 13, 2024
f68ec76
Update README.md
KellerJordan Nov 13, 2024
e2f4af5
.
KellerJordan Nov 14, 2024
8523b88
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 14, 2024
1f94c1c
.
KellerJordan Nov 14, 2024
13e43e3
Update README.md
KellerJordan Nov 14, 2024
47da85a
Update README.md
KellerJordan Nov 19, 2024
3562d09
update with 11/10/24 record
KellerJordan Nov 20, 2024
494f816
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 20, 2024
5a5bd12
new 5-minute FlexAttention record by @KoszarskyB
KellerJordan Nov 20, 2024
93ba9fc
.
KellerJordan Nov 20, 2024
a86539c
.
KellerJordan Nov 20, 2024
59c8183
.
KellerJordan Nov 20, 2024
22862e7
.
KellerJordan Nov 20, 2024
abcf16b
.
KellerJordan Nov 20, 2024
2de23fb
.
KellerJordan Nov 20, 2024
0dce01c
.
KellerJordan Nov 20, 2024
0b9d3ca
.
KellerJordan Nov 20, 2024
585a62a
.
KellerJordan Nov 20, 2024
afa58b8
.
KellerJordan Nov 20, 2024
8e642c8
.
KellerJordan Nov 20, 2024
b8c0e58
.
KellerJordan Nov 20, 2024
a3ff237
.
KellerJordan Nov 20, 2024
4344f95
.
KellerJordan Nov 20, 2024
4f71b37
Update README.md
KellerJordan Nov 20, 2024
a82d12c
Update README.md
KellerJordan Nov 20, 2024
1a3594c
Update README.md
KellerJordan Nov 21, 2024
e1a1f87
Update README.md
KellerJordan Nov 21, 2024
17dafe7
Update README.md
KellerJordan Nov 21, 2024
cbc099d
Update README.md
KellerJordan Nov 21, 2024
f92118b
Update train_gpt2.py
KellerJordan Nov 21, 2024
e926fb9
Update README.md
KellerJordan Nov 21, 2024
bb720f2
Create README.md
KellerJordan Nov 21, 2024
7d8ae57
Update README.md
KellerJordan Nov 21, 2024
2e97eab
Update README.md
KellerJordan Nov 22, 2024
ff4f7d6
Update train_gpt2.py
KellerJordan Nov 22, 2024
184edb2
Update README.md
KellerJordan Nov 22, 2024
494cf75
Update README.md
KellerJordan Nov 22, 2024
42aab06
11/24/24 record
KellerJordan Nov 25, 2024
4e853f7
.
KellerJordan Nov 25, 2024
eb52e76
more runs
KellerJordan Nov 25, 2024
3e63ff7
.
KellerJordan Nov 25, 2024
779d11c
Update README.md
KellerJordan Nov 25, 2024
a61f737
Update README.md
KellerJordan Nov 25, 2024
9788090
Update README.md
KellerJordan Nov 25, 2024
3111ab0
Update README.md
KellerJordan Nov 25, 2024
87b34c8
Update README.md
KellerJordan Nov 25, 2024
ee7c9c4
Update README.md
KellerJordan Nov 25, 2024
92a5449
Update README.md
KellerJordan Nov 25, 2024
7a0a3ed
Update README.md
KellerJordan Nov 25, 2024
c0f4f26
Update train_gpt2.py
timlautk Nov 25, 2024
5dcb352
Update train_gpt2.py
timlautk Nov 25, 2024
badc5dc
Merge pull request #31 from timlautk/master
KellerJordan Nov 25, 2024
e6505cd
.
KellerJordan Nov 25, 2024
52acaaa
.
KellerJordan Nov 25, 2024
2685e27
.
KellerJordan Nov 25, 2024
469dc2f
.
KellerJordan Nov 25, 2024
5384d30
.
KellerJordan Nov 25, 2024
beb8368
.
KellerJordan Nov 25, 2024
2b24502
.
KellerJordan Nov 25, 2024
b51050a
Update README.md
KellerJordan Nov 25, 2024
013a194
Merge branch 'master' of https://github.com/KellerJordan/modded-nanogpt
KellerJordan Nov 25, 2024
341c860
.
KellerJordan Nov 26, 2024
c857b1a
.
KellerJordan Nov 26, 2024
b9e4d52
.
KellerJordan Nov 27, 2024
9e35b93
Update train_gpt2.py
KellerJordan Nov 28, 2024
ab5d1a2
run on RTX 4090
lapp0 Nov 29, 2024
a77f7eb
expose training and model parameters as command line arguments
lapp0 Dec 4, 2024
2a0d1ba
update incorrect batch size for 4090 in docs
lapp0 Dec 4, 2024
0b0ca81
print configs once
lapp0 Dec 4, 2024
.
KellerJordan committed Nov 20, 2024
commit afa58b8868bb0edfd3de30391cf58a33eb5fc758
2 changes: 1 addition & 1 deletion README.md
@@ -101,7 +101,7 @@ yeah, those guys doing free labor who everyone constantly musters all of their i
### Speedrun rules

1. Must not modify the train or validation data pipelines (except to change batch size & seqlen; i.e., just the order of the tokens can't be changed).
-2. Must use ≤ 124M active parameters per token.
+2. Must use ≤ 124M active parameters per token (MoE is OK; & untied embedding matrix only contributes hidden_dim active params).
3. Must attain ≤ 3.28 val loss. A tasteful number would be 3.278 so that [this doesn't happen](./records/110924_Replicateleloykun/1621af10-aa0c-42af-bf54-8a773c63a2af.txt#L3780).

Other than that, go crazy! Anything is fair game
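As a rough illustration of how the revised rule 2 might be applied, the sketch below counts active parameters per token for a plain GPT-style stack. Everything in it is an assumption made for illustration: the function name `active_params_per_token`, the accounting (weight matrices only, no biases, norms, or position embeddings), and the GPT-2-small-like shapes are hypothetical and are not taken from the repository's training script.

```python
def active_params_per_token(
    n_layer: int,
    d_model: int,
    vocab_size: int,
    n_experts: int = 1,
    experts_per_token: int = 1,
    tied_embeddings: bool = True,
    mlp_ratio: int = 4,
) -> int:
    """Hypothetical count of weights touched when processing one token."""
    # Attention: Q, K, V and output projections, each d_model x d_model.
    attn = 4 * d_model * d_model
    # One MLP expert: up-projection and down-projection.
    mlp_one_expert = 2 * mlp_ratio * d_model * d_model
    # A router only exists when there is more than one expert (MoE case).
    router = d_model * n_experts if n_experts > 1 else 0
    # Only the experts a token is routed to count as active.
    per_layer = attn + experts_per_token * mlp_one_expert + router
    # The output head multiplies against the full vocab_size x d_model matrix.
    lm_head = vocab_size * d_model
    # Per the rule: an untied input embedding is a lookup of a single row,
    # so it adds only d_model active parameters. A tied embedding shares
    # the lm_head matrix and adds nothing extra.
    embed_lookup = 0 if tied_embeddings else d_model
    return n_layer * per_layer + lm_head + embed_lookup


# Illustrative GPT-2-small-like shapes (assumed, not read from the repo).
dense_tied = active_params_per_token(12, 768, 50257)
dense_untied = active_params_per_token(12, 768, 50257, tied_embeddings=False)
moe_top1 = active_params_per_token(12, 768, 50257, n_experts=8, experts_per_token=1)

print(f"dense, tied:   {dense_tied:,}")
print(f"dense, untied: {dense_untied:,}")
print(f"8-expert top-1 MoE: {moe_top1:,}")
```

Under this accounting, untying the embedding raises the active count by only `d_model`, and a top-1 MoE adds only its router weights on top of the dense count, even though its total parameter count is much larger.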