Commit e50545c

Add note for vocab size larger than 65536

1 parent 6aed6ad

File tree: 1 file changed, +3 −0 lines changed

README.md

Lines changed: 3 additions & 0 deletions
@@ -113,6 +113,9 @@ Note that the RAYON_NUM_THREADS environment variable controls the maximum number
 ## Other Models and Datastore
 In the examples above, we default to using Vicuna and CodeLlama, but you can use any LLaMA-based model you like by simply changing the "--model-path" argument. You can also build the datastore from any data you like. If you want to use an architecture other than LLaMA, you can modify the file model/modeling_llama_kv.py to match the corresponding model.

+Note: For models with a vocab size larger than 65535 (the range of u16), change [this line in the Writer](https://github.com/FasterDecoding/REST/blob/main/DraftRetriever/src/lib.rs#L117) from `self.index_file.write_u16::<LittleEndian>(item as u16)?;` to `self.index_file.write_u32::<LittleEndian>(item as u32)?;`
+Likewise, change [this line in the Reader](https://github.com/FasterDecoding/REST/blob/main/DraftRetriever/src/lib.rs#L192) from `let int = LittleEndian::read_u16(&data_u8[i..i+2]) as i32;` to `let int = LittleEndian::read_u32(&data_u8[i..i+4]) as i32;`
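
The Writer and Reader changes must stay consistent: once the Writer emits 4-byte little-endian values, the Reader has to step through the buffer in 4-byte strides. A minimal sketch of that round trip, using the standard library's `to_le_bytes`/`from_le_bytes` as stand-ins for the `byteorder` calls in lib.rs (the token id here is an illustrative value, not from the repo):

```rust
fn main() {
    // A token id above u16::MAX (65535), e.g. from a 128k-entry vocab.
    let token_id: u32 = 70_000;

    // Writer side: write_u32::<LittleEndian> emits 4 bytes instead of 2.
    let data_u8: Vec<u8> = token_id.to_le_bytes().to_vec();
    assert_eq!(data_u8.len(), 4);

    // Reader side: decode 4-byte slices (i..i+4) instead of 2-byte ones.
    let i = 0;
    let int = u32::from_le_bytes(data_u8[i..i + 4].try_into().unwrap()) as i32;
    assert_eq!(int, 70_000);

    // For contrast: truncating to u16 silently corrupts ids above 65535.
    assert_ne!((token_id as u16) as u32, token_id);

    println!("round-tripped token id: {}", int);
}
```

Changing only one side would desynchronize the on-disk stride (2 vs. 4 bytes), so any existing datastore must also be rebuilt after the switch.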
+
 ## Citation
 ```
 @misc{he2023rest,
