gc-fu/FastChat-bench

FastChat Benchmark

This repo contains the files used to benchmark FastChat.

Start service

You can start the service with either Docker or Kubernetes. For Kubernetes, see the README.md in the kubernetes folder for detailed instructions.

For Docker, starting the service is straightforward.

Environment variable settings

For CPU performance testing, we recommend using Intel OpenMP and tcmalloc for acceleration.

source bigdl-nano-init -t
export OMP_NUM_THREADS="YOUR_CORE_NUMBERS"
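
If you prefer not to hard-code the core count, the sketch below derives it at run time. It assumes a Linux host where the coreutils `nproc` command is available; note that `nproc` reports logical CPUs, which may exceed the physical core count when hyper-threading is enabled, so treat the result as a starting point rather than a tuned value. The `CORES` variable is a name introduced here for illustration.

```shell
# Assumption: Linux with coreutils' nproc on the PATH.
# nproc counts logical CPUs available to this process, not physical cores.
CORES="$(nproc)"

# Use the detected count in place of the YOUR_CORE_NUMBERS placeholder.
export OMP_NUM_THREADS="$CORES"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Inside a container with a CPU limit, the detected count may not match the CPU quota, so an explicit value can still be the safer choice there.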

Test

We use wrk to measure end-to-end throughput.

Change the test URL to match your deployment.

If you have multiple worker containers/pods, set both -t (threads) and -c (connections) to the number of backend workers to test full throughput.

# For testing completions
wrk -t1 -c1 -d20m -s ./wrk-scripts/compl.lua http://172.168.0.218:8000/v1/completions --timeout 1m

# For testing chat completions
wrk -t1 -c1 -d20m -s ./wrk-scripts/chat.lua http://172.168.0.218:8000/v1/chat/completions --timeout 1m
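
As a rough sketch of scaling the commands above to several backend workers, the snippet below builds the wrk command line from a worker count and prints it for review. The value `WORKERS=4` and the helper `build_wrk_cmd` are hypothetical and exist only for this illustration; the URL, script path, duration, and timeout are copied from the completions example.

```shell
# Hypothetical values: adjust both to your deployment.
WORKERS=4                                        # assumed number of backend workers
URL="http://172.168.0.218:8000/v1/completions"   # test URL from the example above

# Illustrative helper: prints a wrk command that uses the same value
# for threads (-t) and connections (-c).
build_wrk_cmd() {
  echo "wrk -t$1 -c$1 -d20m -s ./wrk-scripts/compl.lua $2 --timeout 1m"
}

CMD="$(build_wrk_cmd "$WORKERS" "$URL")"
echo "$CMD"
# To run the benchmark once the command looks right: eval "$CMD"
```

Printing the command first makes it easy to check the thread and connection counts before starting a 20-minute run.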
