Commit

Update README.md
BlinkDL authored Dec 11, 2024
1 parent 20a6105 commit 46b8a22
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -14,9 +14,11 @@ RWKV-6 3B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1

RWKV-6 7B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2

**RWKV-6 GPT-mode demo code (with comments and explanations)**: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v5/rwkv_v6_demo.py
Chat demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py

RWKV-6 RNN-mode demo: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v6_demo.py
And: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v5/rwkv_v6_demo.py

And: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v6_demo.py

![MQAR](Research/RWKV-6-MQAR.png)

@@ -65,8 +67,6 @@ advanced: repeat your SFT data 3 or 4 times in your jsonl (note make_data.py wil

lm_eval: https://github.com/BlinkDL/ChatRWKV/blob/main/run_lm_eval.py

chat demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py

**Tips for small model / small data**: When I train RWKV music models, I use deep & narrow (such as L29-D512) dimensions, and apply wd and dropout (such as wd=2 dropout=0.02). Note RWKV-LM dropout is very effective - use 1/4 of your usual value.

### HOW TO FINETUNE RWKV-5 MODELS ###
