Releases: vtuber-plan/langport
0.3.11
0.3.10
Commits
- 23e5db2: import chatproto (jstzwj)
- 0cb8486: remove torch deps (jstzwj)
- 746b878: remove accelerate deps (jstzwj)
- 197a672: update test cases (jstzwj)
- d81c8ba: update embed api (jstzwj)
- 3dd9b27: stop str bug fix (jstzwj)
- 101efc3: update qwen1.5 (jstzwj)
- 69f6e12: log invalid requests (jstzwj)
- 35c1fe5: update embedding api (jstzwj)
langport 0.3.9
What's Changed
- Llama stop string bug fix
- Compression trust_remote_code bug fix
- System prompt bug fix
- Top-k and randomness bug fix
langport 0.3.8
What's Changed
- Llama2 prompt quick fix
- Add prompt test cases
langport 0.3.7
What's Changed
- Support Mistral model
- Add pydantic version check
langport 0.3.4
What's Changed
- GPTQ support
- Bump the versions of deps
langport 0.3.3
What's Changed
- Support Qwen model
- Inference speedup
- Dynamic Batch Inference
- Gateway logging
langport 0.3.2
What's Changed
- Support 4bit quantization
- Support InternLM and Llama2
- Faster 8bit inference
langport 0.3.1
What's Changed
- Support generation logprobs parameter
- Add chunk size and threads args for ggml worker
- Support NingYu
langport 0.3.0
What's Changed
- Optimum inference support
- ChatGLM inference bug fix