Releases: vtuber-plan/langport

langport 0.3.11

08 Jul 10:03

Commits

  • 2b28af1: support sentence transformer (jstzwj)
  • 0028262: response error logging (jstzwj)
  • 84b3e07: embedding worker batch limit (jstzwj)
  • 56e7eb7: AsyncGenerator typing bug fix (jstzwj)
  • f9b77f9: update requirements (jstzwj)

langport 0.3.10

30 Jun 16:15

Commits

  • 23e5db2: import chatproto (jstzwj)
  • 0cb8486: remove torch deps (jstzwj)
  • 746b878: remove accelerate deps (jstzwj)
  • 197a672: update test cases (jstzwj)
  • d81c8ba: update embed api (jstzwj)
  • 3dd9b27: stop str bug fix (jstzwj)
  • 101efc3: update qwen1.5 (jstzwj)
  • 69f6e12: log invalid requests (jstzwj)
  • 35c1fe5: update embedding api (jstzwj)

langport 0.3.9

13 Jan 07:56
c462146

What's Changed

  • Llama stop string bug fix
  • Compression trust_remote_code bug fix
  • System prompt bug fix
  • Top-k and randomness bug fix

langport 0.3.8

06 Nov 14:46
2c5b9e9

What's Changed

  • Llama2 prompt quick fix
  • Add prompt test cases

langport 0.3.7

23 Oct 15:19
ee32f73

What's Changed

  • Mistral model support
  • Add pydantic version check

langport 0.3.4

15 Sep 06:23
a19cd7f

What's Changed

  • GPTQ support
  • Bump the versions of deps

langport 0.3.3

06 Aug 16:11
63a2a34

What's Changed

  • Support Qwen model
  • Inference speedup
  • Dynamic Batch Inference
  • Gateway logging

langport 0.3.2

19 Jul 09:11
ccc8d67

What's Changed

  • Support 4bit quantization
  • Support InternLM and Llama2
  • Faster 8bit inference

langport 0.3.1

14 Jul 07:43
3f87a74

What's Changed

  • Support generation logprobs parameter
  • Add chunk size and threads args for ggml worker
  • Support NingYu

langport 0.3.0

30 Jun 07:29
1715fa2

What's Changed

  • Optimum inference support
  • ChatGLM inference bug fix