Releases: huggingface/text-generation-inference

v1.3.4

22 Dec 14:46

What's Changed

Full Changelog: v1.3.3...v1.3.4

v1.3.3

15 Dec 00:22

What's Changed

  • fix gptq params loading
  • improve decode latency for long sequences twofold
  • feat: add more latency metrics in forward by @OlivierDehaene in #1346
  • fix: max_past default value must be -1, not 0 by @OlivierDehaene in #1348

Full Changelog: v1.3.2...v1.3.3

v1.3.2

12 Dec 17:14

What's Changed

Full Changelog: v1.3.1...v1.3.2

v1.3.1

11 Dec 15:47

Hotfix Mixtral implementation

Full Changelog: v1.3.0...v1.3.1

v1.3.0

11 Dec 14:11

What's Changed

Full Changelog: v1.2.0...v1.3.0

v1.2.0

30 Nov 14:19

What's Changed

New Contributors

Full Changelog: v1.1.1...v1.2.0

v1.1.1

16 Nov 17:37

What's Changed

New Contributors

Full Changelog: v1.1.0...v1.1.1

v1.1.0

28 Sep 08:34

Notable changes

  • Support for Mistral models (#1071)
  • AWQ quantization (#1019)
  • EETQ quantization (#1068)

What's Changed

New Contributors

Full Changelog: v1.0.3...v1.1.0

v1.0.3

29 Aug 12:29
5485c14

What's Changed

Codellama.

Full Changelog: v1.0.2...v1.0.3

v1.0.2

23 Aug 10:55
c4422e5

What's Changed

New Contributors

Full Changelog: v1.0.1...v1.0.2