Port of self extension to server #5104

Merged
merged 18 commits into from
Jan 27, 2024
Added description to server readme.
Maximilian-Winter committed Jan 26, 2024
commit 4df0e88aedb010c04aed8e06de220463c1d7b044
4 changes: 3 additions & 1 deletion examples/server/README.md
@@ -30,7 +30,9 @@ Command line options:
- `-cb`, `--cont-batching`: enable continuous batching (a.k.a dynamic batching) (default: disabled)
- `-spf FNAME`, `--system-prompt-file FNAME`: Set a file to load a system prompt (initial prompt of all slots). This is useful for chat applications. [See more](#change-system-prompt-on-runtime)
- `--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.

- `--grp-attn-n`: Extend the context size through self-extend. The context size is extended N times (default: 1). Used together with `--grp-attn-w`.
- `--grp-attn-w`: Width of the self-extend context size extension (default: 512). Should not be greater than the original context size.
-
## Build

server is built alongside everything else from the root of the project
2 changes: 2 additions & 0 deletions examples/server/server.cpp
@@ -1810,6 +1810,8 @@ static void server_print_usage(const char *argv0, const gpt_params &params,
printf(" --override-kv KEY=TYPE:VALUE\n");
printf(" advanced option to override model metadata by key. may be specified multiple times.\n");
printf(" types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false\n");
printf(" --grp-attn-n N Extend context size through self extend. Extend context size n-times (default: 1), used together with `--grp-attn-w`\n");
printf(" --grp-attn-w N Width of the self extend context size extension. (default: 512) shouldn't be greater than original context size\n");
printf("\n");
}
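
The two options documented in this commit can be illustrated with a small standalone sketch. It is not the parsing code in `server.cpp`: the struct `self_extend_opts`, the helpers `parse_self_extend` and `self_extend_opts_valid`, and the parameter `n_ctx_orig` are hypothetical names introduced here. Only the flag names, the defaults (1 and 512), and the rule that the width should not exceed the original context size come from the help text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for the two options; defaults mirror the help text.
struct self_extend_opts {
    int grp_attn_n = 1;   // context extension factor (default: 1)
    int grp_attn_w = 512; // extension width (default: 512)
};

// Parse `--grp-attn-n N` and `--grp-attn-w N` from an argument list.
// Returns false if a flag is missing its value; other arguments are ignored.
static bool parse_self_extend(const std::vector<std::string> & args, self_extend_opts & out) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const bool is_n = args[i] == "--grp-attn-n";
        const bool is_w = args[i] == "--grp-attn-w";
        if (!is_n && !is_w) {
            continue;
        }
        if (i + 1 >= args.size()) {
            return false; // flag given without a value
        }
        const int value = std::atoi(args[++i].c_str());
        if (is_n) {
            out.grp_attn_n = value;
        } else {
            out.grp_attn_w = value;
        }
    }
    return true;
}

// Check the constraints stated in the option descriptions: a factor of at
// least 1 and a positive width no greater than the original context size.
// `n_ctx_orig` is an assumed input, e.g. the model's training context length.
static bool self_extend_opts_valid(const self_extend_opts & o, int n_ctx_orig) {
    return o.grp_attn_n >= 1 && o.grp_attn_w >= 1 && o.grp_attn_w <= n_ctx_orig;
}
```

With `--grp-attn-n 4 --grp-attn-w 2048`, this sketch accepts the options for an original context size of 4096 and rejects them for 1024, since the width would then exceed the original context.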
