Actions: huggingface/text-generation-inference

Server Tests

2,369 workflow runs

Filter by: event, status, branch, actor

feat(server): auto max_batch_total_tokens for flash att models
Server Tests #691: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 16:06 · 12m 59s · feat/automatic_max
feat(server): add support for llamav2
Server Tests #690: Pull request #633 opened by Narsil
July 18, 2023 16:04 · 23m 14s · llamav2_post
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #689: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 15:03 · 9m 39s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #688: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:38 · 10m 45s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #687: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:19 · 11m 31s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #686: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:11 · 7m 56s · feat/automatic_max
feat(server): flash attention v2
Server Tests #685: Pull request #624 synchronize by OlivierDehaene
July 18, 2023 13:29 · 13m 11s · feat/flash_v2
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #684: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 10:46 · 10m 21s · feat/automatic_max
feat(server): flash attention v2
Server Tests #683: Pull request #624 synchronize by OlivierDehaene
July 18, 2023 10:36 · 15m 28s · feat/flash_v2
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #682: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 10:04 · 12m 14s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #681: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 09:43 · 20m 17s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #680: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 09:41 · 2m 21s · feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #679: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 09:39 · 2m 15s · feat/automatic_max
feat(server): Add bitsandbytes 4bit quantization
Server Tests #677: Pull request #626 opened by krzim
July 17, 2023 21:55 · 14m 48s · krzim:bnb-4bit
feat(server): flash attention v2
Server Tests #676: Pull request #624 synchronize by OlivierDehaene
July 17, 2023 16:39 · 24m 37s · feat/flash_v2
feat(server): flash attention v2
Server Tests #675: Pull request #624 opened by OlivierDehaene
July 17, 2023 16:38 · 56s · feat/flash_v2
fea(launcher): debug logs
Server Tests #674: Pull request #623 opened by OlivierDehaene
July 17, 2023 16:38 · 33m 42s · feat/debug_logging
fix(launcher): Rename b-float16 to bfloat16 in the launcher arg
Server Tests #673: Pull request #621 opened by Narsil
July 17, 2023 12:35 · 20m 3s · rename_bf16_arg
fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py
Server Tests #670: Pull request #619 opened by dongs0104
July 16, 2023 12:17 · 14m 18s · dongs0104:patch-2
Directly load GPTBigCode to specified device
Server Tests #669: Pull request #618 opened by Atry
July 15, 2023 07:33 · 14m 14s · Atry:patch-8
v0.9.2
Server Tests #667: Pull request #616 opened by OlivierDehaene
July 14, 2023 13:39 · 23m 49s · v0.9.2
fix(server): blacklist local files
Server Tests #666: Pull request #609 opened by OlivierDehaene
July 13, 2023 16:57 · 28m 33s · fix/blacklist_local_files
feat(router): explicit warning if revision is not set
Server Tests #665: Pull request #608 opened by OlivierDehaene
July 13, 2023 16:49 · 23m 57s · feat/warning_revision
Add exllama GPTQ CUDA kernel support
Server Tests #664: Pull request #553 synchronize by fxmarty
July 13, 2023 15:49 · 7m 6s · fxmarty:gptq-cuda-kernels
ProTip! You can narrow down the results and go further back in time using created:<2023-07-13 or the other filters available.
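
The same filtering can also be done outside the web UI through the GitHub REST API, which accepts the event, actor, branch, and created parameters shown above. The sketch below is a minimal, non-authoritative example: the workflow file name "tests.yaml" is an assumption (substitute whatever file defines the "Server Tests" workflow), and GITHUB_TOKEN is optional, only used for a higher rate limit.

# Minimal sketch: list "Server Tests" runs via the GitHub REST API, mirroring the
# page's filters (event, actor, created). WORKFLOW_FILE is an assumption, not
# something confirmed by this page.
import os
import requests

REPO = "huggingface/text-generation-inference"
WORKFLOW_FILE = "tests.yaml"  # assumption: the file defining the "Server Tests" workflow

url = f"https://api.github.com/repos/{REPO}/actions/workflows/{WORKFLOW_FILE}/runs"
headers = {"Accept": "application/vnd.github+json"}
token = os.environ.get("GITHUB_TOKEN")  # optional, raises the API rate limit
if token:
    headers["Authorization"] = f"Bearer {token}"

params = {
    "event": "pull_request",    # Filter by Event
    "actor": "OlivierDehaene",  # Filter by Actor
    "created": "<2023-07-13",   # same syntax as the ProTip above
    "per_page": 50,
}

resp = requests.get(url, headers=headers, params=params, timeout=30)
resp.raise_for_status()
for run in resp.json()["workflow_runs"]:
    print(run["run_number"], run["display_title"], run["head_branch"], run["created_at"])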