
Commit

Merge remote-tracking branch 'origin/main' into gqa
AmineDiro committed Aug 17, 2023
2 parents d8e83e3 + 129b84a commit cc5a98a
Showing 4 changed files with 25 additions and 27 deletions.
6 changes: 3 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default.

40 changes: 19 additions & 21 deletions README.md
@@ -113,40 +113,38 @@ Bindings for this library are available in the following languages:
 
 The easiest way to get started with `llm-cli` is to download a pre-built
 executable from a [released](https://github.com/rustformers/llm/releases)
-version of `llm`, although this may not have all the features present on the
-`main` branch. The following methods involve building `llm`, which requires Rust
-v1.65.0 or above and a modern C toolchain.
+version of `llm`, but the releases are currently out of date and we recommend
+you [install from source](#installing-from-source) instead.
 
-### Installing with `cargo`
+### Installing from Source
 
-To install the most recently released version of `llm` to your Cargo `bin`
+To install the `main` branch of `llm` with the most recent features to your Cargo `bin`
 directory, which `rustup` is likely to have added to your `PATH`, run:
 
 ```shell
-cargo install llm-cli
+cargo install --git https://github.com/rustformers/llm llm-cli
 ```
 
-The CLI application can then be run through `llm`.
+The CLI application can then be run through `llm`. See also [features](#features) and
+[acceleration support](doc/acceleration-support.md) to turn features on as required.
+Note that GPU support (CUDA, OpenCL, Metal) will not work unless you build with the relevant feature.
 
-### Building from Source
-
-To make use of the features on the `main` branch, clone the repository and then
-build it with
-
-```shell
-git clone --recurse-submodules https://github.com/rustformers/llm
-cd llm
-cargo build --release
-```
+### Installing with `cargo`
 
-The resulting binary will be at `target/release/llm[.exe]`.
+Note that the currently published version is out of date and does not include
+support for the most recent models. We currently recommend that you
+[install from source](#installing-from-source).
 
-It can also be run directly through Cargo, with
+To install the most recently released version of `llm` to your Cargo `bin`
+directory, which `rustup` is likely to have added to your `PATH`, run:
 
 ```shell
-cargo run --release -- $ARGS
+cargo install llm-cli
 ```
 
+The CLI application can then be run through `llm`. See also [features](#features)
+to turn features on as required.
+
 ### Features
 
 By default, `llm` builds with support for remotely fetching the tokenizer from Hugging Face's model hub.
@@ -158,7 +156,7 @@ To disable this, disable the default features for the build:
 
 ```shell
 cargo build --release --no-default-features
 ```
 
-To enable hardware acceleration, see [Acceleration Support for Building section](doc/CONTRIBUTING.md#acceleration-support-for-building), which is also applicable to the CLI.
+To enable hardware acceleration, see [Acceleration Support for Building section](doc/acceleration-support.md), which is also applicable to the CLI.

## Getting Models

2 changes: 1 addition & 1 deletion crates/llm-base/Cargo.toml
@@ -22,7 +22,7 @@ partial_sort = "0.2.0"
 serde_bytes = "0.11"
 memmap2 = { workspace = true }
 half = "2"
-tokenizers = {version="0.13.3", default-features=false, features=["onig"]}
+tokenizers = {version="0.13.4", default-features=false, features=["onig"]}
 regex = "1.8"
 tracing = { workspace = true }

4 changes: 2 additions & 2 deletions crates/llm-base/src/tokenizer/huggingface.rs
@@ -22,7 +22,7 @@ impl HuggingFaceTokenizer {
     /// Converts a token index to the token it represents in this tokenizer.
     pub(crate) fn token(&self, idx: usize) -> Vec<u8> {
         self.tokenizer
-            .decode(vec![idx as u32], true)
+            .decode(&[idx as u32], true)
             .expect("Cannot decode token from tokenizer tokenizer.")
             .as_bytes()
             .to_vec()
@@ -67,7 +67,7 @@ impl HuggingFaceTokenizer {
     /// Decode a list `tokens` with this tokenizer.
     pub(crate) fn decode(&self, tokens: Vec<TokenId>, skip_special_tokens: bool) -> Vec<u8> {
         self.tokenizer
-            .decode(tokens, skip_special_tokens)
+            .decode(&tokens, skip_special_tokens)
             .expect("Cannot decode token from tokenizer.")
             .as_bytes()
             .to_vec()
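The two Rust hunks track the `tokenizers` 0.13.3 → 0.13.4 bump in `Cargo.toml`: `decode` now borrows its token ids as a `&[u32]` slice instead of consuming a `Vec<u32>`. A minimal sketch of why slice-borrowing call sites are friendlier to callers — using a toy `decode` that maps ids to letters, not the real tokenizers API:

```rust
// Hypothetical stand-in for a post-0.13.4-style `decode`: it borrows a
// slice, so the caller keeps ownership of its token buffer.
fn decode(ids: &[u32], skip_special_tokens: bool) -> Result<String, String> {
    ids.iter()
        // Toy convention: treat id 0 as a special token to skip.
        .filter(|&&id| !(skip_special_tokens && id == 0))
        // Toy "vocabulary": id 1 -> 'a', id 2 -> 'b', and so on.
        .map(|&id| char::from_u32('a' as u32 + id - 1).ok_or_else(|| format!("unmapped id {id}")))
        .collect()
}

fn main() {
    let tokens: Vec<u32> = vec![0, 1, 2, 3];
    // `&tokens` coerces to `&[u32]`, so the vector is only borrowed
    // and remains usable after the call.
    let text = decode(&tokens, true).unwrap();
    println!("{text}"); // prints "abc"
    assert_eq!(tokens.len(), 4); // still owned by the caller
}
```

The same coercion is what lets the single-token path above replace the heap allocation `vec![idx as u32]` with a borrowed stack array, `&[idx as u32]`.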

0 comments on commit cc5a98a
