
@zhyncs zhyncs commented Feb 8, 2025

Motivation

Modifications

Checklist

  • Format your code according to the Code Formatting with Pre-Commit.
  • Add unit tests as outlined in the Running Unit Tests.
  • Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
  • Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
  • For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.


zhyncs commented Feb 8, 2025

All EAGLE 2-related CI checks have passed. The commands below are the additional tests run locally.

# Launch the server without speculative decoding
python3 -m sglang.launch_server --model meta-llama/Llama-2-7b-chat-hf
# Launch the server with EAGLE speculative decoding
python3 -m sglang.launch_server --model meta-llama/Llama-2-7b-chat-hf --speculative-algo EAGLE --speculative-draft lmzheng/sglang-EAGLE-llama2-chat-7B --speculative-num-steps 5 --speculative-eagle-topk 8 --speculative-num-draft-tokens 64 --mem-fraction 0.7
# Send the same chat request twice at temperature 0
for i in {1..2}; do
  curl -s -X POST http://localhost:30000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "test",
      "messages": [
        {"role": "system", "content": "You know mathematics."},
        {"role": "user", "content": "What is 1 + 1? Answer with just the number as the only word. Do not put whitespace in front."}
      ],
      "temperature": 0,
      "max_tokens": 32
  }' | jq '.choices[0].message.content'
done
# Send the same chat request twice at temperature 0.6
for i in {1..2}; do
  curl -s -X POST http://localhost:30000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "test",
      "messages": [
        {"role": "system", "content": "You know mathematics."},
        {"role": "user", "content": "What is 1 + 1? Answer with just the number as the only word. Do not put whitespace in front."}
      ],
      "temperature": 0.6,
      "max_tokens": 32
  }' | jq '.choices[0].message.content'
done
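
The two curl loops above can also be scripted so the replies are compared automatically. The following is a minimal sketch, not part of this PR: the helper names (`build_payload`, `send`, `collect_replies`, `is_consistent`) are hypothetical, the endpoint and request body are copied from the curl commands, and the idea that repeated replies at temperature 0 should match is an assumption about what the manual test is checking.

```python
import json
import urllib.request

# Endpoint taken from the curl commands above.
URL = "http://localhost:30000/v1/chat/completions"


def build_payload(temperature):
    """Build the same chat-completion request body used in the curl loops."""
    return {
        "model": "test",
        "messages": [
            {"role": "system", "content": "You know mathematics."},
            {
                "role": "user",
                "content": "What is 1 + 1? Answer with just the number as the "
                "only word. Do not put whitespace in front.",
            },
        ],
        "temperature": temperature,
        "max_tokens": 32,
    }


def send(payload, url=URL):
    """POST the payload and return choices[0].message.content (the field jq extracts)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


def collect_replies(temperature, repeats=2, sender=send):
    """Send the same request `repeats` times and return the replies in order."""
    payload = build_payload(temperature)
    return [sender(payload) for _ in range(repeats)]


def is_consistent(replies):
    """Return True when every reply is identical (hypothetical check, see lead-in)."""
    return len(set(replies)) <= 1
```

With a server running, `collect_replies(0)` and `collect_replies(0.6)` mirror the two loops, and `is_consistent` flags whether the repeated replies differ. Passing a stub as `sender` lets the helpers be exercised without a live server.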

@zhyncs zhyncs added the bug and high priority labels Feb 8, 2025

zhyncs commented Feb 8, 2025

TODOs
#3395 @zhyncs
#3409 @Ying1123

@Ying1123 Ying1123 force-pushed the zhyncs/fix branch 2 times, most recently from a0067ce to 237f89a Compare February 8, 2025 21:32
@zhyncs zhyncs merged commit fad315c into main Feb 8, 2025
21 checks passed
@zhyncs zhyncs deleted the zhyncs/fix branch February 8, 2025 23:28
@zhyncs zhyncs mentioned this pull request Feb 10, 2025
13 tasks
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
