Commit

Merge branch 'master' into compilade/convert-hf-refactor
compilade committed May 3, 2024
2 parents 13f4cf7 + 60325fa commit 6a54973
Showing 11 changed files with 495 additions and 152 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/close-issue.yml
@@ -12,7 +12,7 @@ jobs:
steps:
- uses: actions/stale@v5
with:
- exempt-issue-labels: "refactor,help wanted,good first issue,research"
+ exempt-issue-labels: "refactor,help wanted,good first issue,research,bug"
days-before-issue-stale: 30
days-before-issue-close: 14
stale-issue-label: "stale"
2 changes: 1 addition & 1 deletion common/common.h
@@ -135,7 +135,7 @@ struct gpt_params {
bool multiple_choice = false; // compute TruthfulQA score over random tasks from datafile supplied in prompt
size_t multiple_choice_tasks = 0; // number of tasks to use when computing the TruthfulQA score. If 0, all tasks will be computed

- bool kl_divergence = false; // compute KL-divergence
+ bool kl_divergence = false; // compute KL divergence

bool random_prompt = false; // do not randomize prompt if none provided
bool use_color = false; // use color to distinguish generations and inputs
4 changes: 2 additions & 2 deletions common/log.h
@@ -234,7 +234,7 @@ inline std::string log_filename_generator_impl(LogTriState multilog, const std::
// INTERNAL, DO NOT USE
// USE LOG() INSTEAD
//
- #if !defined(_MSC_VER) || defined(__INTEL_LLVM_COMPILER)
+ #if !defined(_MSC_VER) || defined(__INTEL_LLVM_COMPILER) || defined(__clang__)
#define LOG_IMPL(str, ...) \
do { \
if (LOG_TARGET != nullptr) \
@@ -257,7 +257,7 @@ inline std::string log_filename_generator_impl(LogTriState multilog, const std::
// INTERNAL, DO NOT USE
// USE LOG_TEE() INSTEAD
//
- #if !defined(_MSC_VER) || defined(__INTEL_LLVM_COMPILER)
+ #if !defined(_MSC_VER) || defined(__INTEL_LLVM_COMPILER) || defined(__clang__)
#define LOG_TEE_IMPL(str, ...) \
do { \
if (LOG_TARGET != nullptr) \
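
Note on the two guards in common/log.h: with the added `defined(__clang__)` term, a compiler that defines both `_MSC_VER` and `__clang__` (clang-cl does) takes the first branch instead of the second, while plain MSVC is unaffected. A minimal sketch of that predicate follows, modelling each macro as a boolean. The function names are hypothetical and do not appear in log.h.

```cpp
// Hypothetical model of the #if guard in common/log.h. Each flag stands for
// "the corresponding macro is defined by the compiler".
constexpr bool first_branch_before(bool msc_ver, bool intel_llvm) {
    return !msc_ver || intel_llvm;
}

constexpr bool first_branch_after(bool msc_ver, bool intel_llvm, bool clang) {
    return !msc_ver || intel_llvm || clang;
}

// clang-cl: _MSC_VER and __clang__ are both defined.
static_assert(!first_branch_before(true, false), "old guard selects the second branch");
static_assert(first_branch_after(true, false, true), "new guard selects the first branch");

// Plain MSVC: only _MSC_VER is defined, so the selected branch does not change.
static_assert(!first_branch_after(true, false, false), "MSVC still selects the second branch");

// GCC or Clang without _MSC_VER took the first branch before and still does.
static_assert(first_branch_before(false, false), "non-MSVC: first branch (old guard)");
static_assert(first_branch_after(false, false, true), "non-MSVC: first branch (new guard)");
```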
2 changes: 1 addition & 1 deletion examples/main/main.cpp
@@ -544,7 +544,7 @@ int main(int argc, char ** argv) {
// if we run out of context:
// - take the n_keep first tokens from the original prompt (via n_past)
// - take half of the last (n_ctx - n_keep) tokens and recompute the logits in batches
- if (n_past + (int) embd.size() + std::max<int>(0, guidance_offset) > n_ctx) {
+ if (n_past + (int) embd.size() + std::max<int>(0, guidance_offset) >= n_ctx) {
if (params.n_predict == -2) {
LOG_TEE("\n\n%s: context full and n_predict == -%d => stopping\n", __func__, params.n_predict);
break;
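
Note on the comparison change in examples/main/main.cpp: the only inputs for which `>` and `>=` disagree are those where the sum equals `n_ctx` exactly. The sketch below isolates the condition into two hypothetical helper functions (the names and the sample numbers are illustrative, not taken from main.cpp) to show that boundary case.

```cpp
#include <algorithm>

// Hypothetical stand-alone copies of the condition from the hunk above.
// n_embd plays the role of (int) embd.size().
constexpr bool context_full_before(int n_past, int n_embd, int guidance_offset, int n_ctx) {
    return n_past + n_embd + std::max<int>(0, guidance_offset) > n_ctx;
}

constexpr bool context_full_after(int n_past, int n_embd, int guidance_offset, int n_ctx) {
    return n_past + n_embd + std::max<int>(0, guidance_offset) >= n_ctx;
}

// Boundary case: 500 + 12 + 0 == 512. Only the new comparison fires.
static_assert(!context_full_before(500, 12, 0, 512), "old check: not yet full");
static_assert(context_full_after(500, 12, 0, 512), "new check: full");

// One token below the boundary: neither comparison fires.
static_assert(!context_full_after(500, 11, 0, 512), "below the limit");

// A negative guidance offset is clamped to 0 by std::max in both versions.
static_assert(!context_full_after(500, 11, -5, 512), "negative offset is ignored");
```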
118 changes: 115 additions & 3 deletions examples/perplexity/README.md

Large diffs are not rendered by default.

230 changes: 175 additions & 55 deletions examples/perplexity/perplexity.cpp

Large diffs are not rendered by default.

88 changes: 56 additions & 32 deletions examples/server/tests/features/results.feature
@@ -7,44 +7,16 @@ Feature: Results
And a model file tinyllamas/split/stories15M-00001-of-00003.gguf from HF repo ggml-org/models
And a model file test-model-00001-of-00003.gguf
And 128 as batch size
- And 256 KV cache size
+ And 1024 KV cache size
And 128 max tokens to predict
+ And continuous batching

- Scenario Outline: Multi users completion
+ Scenario Outline: consistent results with same seed
Given <n_slots> slots
- And continuous batching
Then the server is starting
Then the server is healthy

- Given 42 as seed
- And a prompt:
- """
- Write a very long story about AI.
- """
-
- Given 42 as seed
- And a prompt:
- """
- Write a very long story about AI.
- """
-
- Given 42 as seed
- And a prompt:
- """
- Write a very long story about AI.
- """
-
- Given 42 as seed
- And a prompt:
- """
- Write a very long story about AI.
- """
-
- Given 42 as seed
- And a prompt:
- """
- Write a very long story about AI.
- """
+ Given 4 prompts "Title: Little Red Riding Hood But In Space\n\nSummary:" with seed 42

Given concurrent completion requests
Then the server is busy
@@ -55,3 +27,55 @@ Feature: Results
| n_slots |
| 1 |
| 2 |

+ Scenario Outline: different results with different seed
+ Given <n_slots> slots
+ Then the server is starting
+ Then the server is healthy
+
+ Given 1 prompts "Title: Little Red Riding Hood But In Space\n\nSummary:" with seed 42
+ Given 1 prompts "Title: Little Red Riding Hood But In Space\n\nSummary:" with seed 43
+ Given 1 prompts "Title: Little Red Riding Hood But In Space\n\nSummary:" with seed 44
+ Given 1 prompts "Title: Little Red Riding Hood But In Space\n\nSummary:" with seed 45
+
+ Given concurrent completion requests
+ Then the server is busy
+ Then the server is idle
+ And all slots are idle
+ Then all predictions are different
+ Examples:
+ | n_slots |
+ | 1 |
+ | 2 |
+
+ Scenario Outline: consistent results with same seed and varying batch size
+ Given 4 slots
+ And <temp> temperature
+ # And 0 as draft
+ Then the server is starting
+ Then the server is healthy
+
+ Given 1 prompts "Write a very long story about AI." with seed 42
+ And concurrent completion requests
+ # Then the server is busy # Not all slots will be utilized.
+ Then the server is idle
+ And all slots are idle
+
+ Given <n_parallel> prompts "Write a very long story about AI." with seed 42
+ And concurrent completion requests
+ # Then the server is busy # Not all slots will be utilized.
+ Then the server is idle
+ And all slots are idle
+
+ Then all predictions are equal
+ Examples:
+ | n_parallel | temp |
+ | 1 | 0.0 |
+ | 2 | 0.0 |
+ | 4 | 0.0 |
+ | 1 | 1.0 |
+ # FIXME: These tests fail on master. The problem seems to be the unified KV cache.
+ # See https://github.com/ggerganov/whisper.cpp/issues/1941#issuecomment-1986923227
+ # and https://github.com/ggerganov/llama.cpp/pull/6122#discussion_r1531405574 .
+ # | 2 | 1.0 |
+ # | 4 | 1.0 |
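
Note on the seed scenarios in results.feature: they assert that requests sharing a seed yield equal predictions and that requests with different seeds yield different ones. The sketch below illustrates that property with a toy linear congruential generator. It is a stand-in chosen for brevity, not the generator or sampler the server uses, and `lcg_next` and `draw` are hypothetical names.

```cpp
#include <cstdint>

// Toy linear congruential generator (assumption: illustration only, unrelated
// to the server's actual sampling code). Unsigned arithmetic wraps modulo 2^32.
constexpr std::uint32_t lcg_next(std::uint32_t state) {
    return state * 1664525u + 1013904223u;
}

// State reached after `steps` draws when starting from `seed`.
constexpr std::uint32_t draw(std::uint32_t seed, int steps) {
    std::uint32_t s = seed;
    for (int i = 0; i < steps; ++i) {
        s = lcg_next(s);
    }
    return s;
}

// Same seed, same number of draws: identical result on every run.
static_assert(draw(42u, 8) == draw(42u, 8), "same seed reproduces the sequence");

// Different seeds diverge. For this generator that is guaranteed, because the
// update is a bijection on 32-bit states (the multiplier is odd).
static_assert(draw(42u, 8) != draw(43u, 8), "different seeds give different results");
```

A real sampler does not carry the second guarantee in general, so a test like "all predictions are different" checks an overwhelmingly likely outcome, not a certainty.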