Queue size increases indefinitely #2192

Closed
2 of 4 tasks
QLutz opened this issue Jul 5, 2024 · 3 comments

QLutz commented Jul 5, 2024

System Info

OS version: Linux
Model being used (curl 127.0.0.1:8080/info | jq): TheBloke/Nous-Hermes-2-Mixtral-8x7B-DPO-AWQ
Hardware used (GPUs, how many, on which cloud) (nvidia-smi): 1xL40S
The current version being used: 2.0.4

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Launch TGI using max_total_tokens=max_batch_prefill_tokens=16384; max_input_length=16383; quantize=awq.
After a few hundred requests, the pod starts returning empty packets, and it does so only a few seconds after each request has been made.
Monitoring shows that tgi_queue_size increases steadily and never goes down.
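
One way to confirm the symptom is to sample the gauge over time. The following is a minimal sketch, not part of the original report: it assumes the server exposes Prometheus text metrics at http://127.0.0.1:8080/metrics (same host and port as the /info call above), and the helper names `parse_gauge`, `never_drains`, and `poll` are hypothetical.

```python
import re
import time
import urllib.request

METRIC = "tgi_queue_size"  # gauge named in the report


def parse_gauge(metrics_text, name=METRIC):
    """Return the first sample value for `name` in Prometheus text format, or None."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # Sample lines look like: name{optional="labels"} value [timestamp]
        m = re.match(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)", line)
        if m and m.group(1) == name:
            return float(m.group(3))
    return None


def never_drains(samples):
    """True if the queue grew overall and never shrank between consecutive samples."""
    if len(samples) < 2:
        return False
    non_decreasing = all(b >= a for a, b in zip(samples, samples[1:]))
    return non_decreasing and samples[-1] > samples[0]


def poll(url="http://127.0.0.1:8080/metrics", n=10, interval=5.0):
    """Collect n samples of the gauge; the URL is an assumption, adjust as needed."""
    samples = []
    for _ in range(n):
        with urllib.request.urlopen(url, timeout=5) as resp:
            value = parse_gauge(resp.read().decode("utf-8"))
        if value is not None:
            samples.append(value)
        time.sleep(interval)
    return samples
```

Calling `never_drains(poll())` while the load is running would flag the behaviour described here. A queue that is merely busy should eventually produce a sample lower than the one before it.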

Expected behavior

No stutters.

Member

Hugoch commented Jul 8, 2024

Hey @QLutz, I suspect it may be related to #2099. Can you try running TGI with --cuda-graphs 0 to see if you still get the hang?
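
For reference, a launch with that flag added might look like the sketch below. It is an illustration under assumptions rather than the reporter's actual command: the option spellings are the hyphenated launcher forms of the settings listed in the Reproduction section, and `launcher_command` is a hypothetical helper. Check `text-generation-launcher --help` for your version before relying on it.

```python
import shlex


def launcher_command(model_id, cuda_graphs="0"):
    """Build the launcher argument list with the settings from the report.

    cuda_graphs="0" is the suggested setting that disables CUDA graphs.
    """
    return [
        "text-generation-launcher",
        "--model-id", model_id,
        "--quantize", "awq",
        "--max-input-length", "16383",
        "--max-total-tokens", "16384",
        "--max-batch-prefill-tokens", "16384",
        "--cuda-graphs", cuda_graphs,
    ]


cmd = launcher_command("TheBloke/Nous-Hermes-2-Mixtral-8x7B-DPO-AWQ")
print(shlex.join(cmd))  # shell-safe string, e.g. for a Docker or pod command
```

Keeping the arguments as a list makes it easy to pass them to `subprocess.run(cmd)` or to switch the flag back once the underlying issue is fixed.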


HoKim98 commented Jul 11, 2024

I had the same problem and was able to work around it with --cuda-graphs 0. This obviously caused major performance problems, but it was at least a better option than a broken server.

HoKim98 mentioned this issue Jul 11, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions bot added the Stale label Aug 11, 2024
github-actions bot closed this as not planned Aug 17, 2024