From 46b8a22365f09db9ae0a3a1a1374487a4e2a1365 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Wed, 11 Dec 2024 13:32:41 +0800
Subject: [PATCH] Update README.md

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 01f23f70..4ef83978 100644
--- a/README.md
+++ b/README.md
@@ -14,9 +14,11 @@ RWKV-6 3B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1
 
 RWKV-6 7B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2
 
-**RWKV-6 GPT-mode demo code (with comments and explanations)**: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v5/rwkv_v6_demo.py
+Chat demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py
 
-RWKV-6 RNN-mode demo: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v6_demo.py
+And: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v5/rwkv_v6_demo.py
+
+And: https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v6_demo.py
 
 ![MQAR](Research/RWKV-6-MQAR.png)
 
@@ -65,8 +67,6 @@ advanced: repeat your SFT data 3 or 4 times in your jsonl (note make_data.py wil
 
 lm_eval: https://github.com/BlinkDL/ChatRWKV/blob/main/run_lm_eval.py
 
-chat demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py
-
 **Tips for small model / small data**: When I train RWKV music models, I use deep & narrow (such as L29-D512) dimensions, and apply wd and dropout (such as wd=2 dropout=0.02). Note RWKV-LM dropout is very effective - use 1/4 of your usual value.
 
 ### HOW TO FINETUNE RWKV-5 MODELS ###