Gradio REST API + bash curl always skips the queue #6350
Comments
Hi @zetyquickly, yes, you are correct. We've mitigated this behavior in Gradio 4.x -- now all requests will go through the queue. You can send predictions to the API endpoint using the Python or JS clients, or, if you prefer, using curl, though formatting the requests can be a bit annoying. Please see here for an example: #4932 (comment). Let us know if you have any further questions!
Hey, the Python client demonstrates the same behaviour: the queue is ignored.
Output log:
The following does the trick of joining the queue:
But it is then processed one by one in the app.
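One-by-one processing is what the 4.x queue does by default: its concurrency limit defaults to 1, so joined jobs wait for each other unless the limit is raised. The effect can be sketched with a plain `asyncio` semaphore (the names below are illustrative, not Gradio internals):

```python
import asyncio

async def worker(limit: asyncio.Semaphore, active: list, peak: list) -> None:
    # Each queued job must acquire the semaphore before running,
    # which is roughly what a concurrency limit of N enforces.
    async with limit:
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.05)  # stand-in for model inference
        active[0] -= 1

async def run(n_jobs: int, concurrency: int) -> int:
    limit = asyncio.Semaphore(concurrency)
    active, peak = [0], [0]
    await asyncio.gather(*(worker(limit, active, peak) for _ in range(n_jobs)))
    return peak[0]  # highest number of jobs observed running at once

if __name__ == "__main__":
    print(asyncio.run(run(8, 1)))  # limit 1: jobs run strictly one by one
    print(asyncio.run(run(8, 4)))  # limit 4: up to four jobs overlap
```

With a limit of 1, the peak concurrency is always 1, which matches the "processed 1 by 1" observation above.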
Just to confirm, you've set
Yes,
Hi, let me follow up with some observations. The docs suggest that the
And the concurrent execution from the comment above produces logs that better resemble concurrent execution of two jobs:
Meanwhile, the example app I use seems to fail to parallelize with this set of options. I see the following log from the app:
I understand that it might be an issue with the app itself, and I'm going to try the same "high load" experiment but with
But it's still unclear why
concurrency_count was deprecated starting in version 4.0!
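For reference, in 4.x the knob moved from `queue(concurrency_count=...)` to a queue-wide `default_concurrency_limit` or a per-event `concurrency_limit`. A minimal sketch, with a placeholder function standing in for the real GPU job:

```python
import gradio as gr

def predict(text: str) -> str:
    return text.upper()  # placeholder for the real GPU workload

with gr.Blocks() as demo:
    inp = gr.Textbox()
    out = gr.Textbox()
    btn = gr.Button("Run")
    # 4.x: the limit can be set per event...
    btn.click(predict, inp, out, concurrency_limit=4)

# ...or as a default for all events (3.x used queue(concurrency_count=4)).
demo.queue(default_concurrency_limit=4)
# demo.launch()
```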
I know. But is this unexpected behavior expected?
Hi @zetyquickly, I'm kind of lost in this thread -- is the main issue that the
I was able to achieve proper queue management on the latest version at that point in time. There was another issue: instance variables of the worker are somehow shared, and OpenCV accesses the same piece of memory from two concurrent coroutines.
Hi @zetyquickly in that case, I'll close this issue.
Would you be able to create a new issue for this, please?
That issue is also minor; I gave up using Gradio anyway. Here is the major showstopper -> #6319 (comment)
Describe the bug
Hi,
First of all, thanks for such an amazing tool. This issue follows up on a thread in the Discord community.
I'm trying to set up an example project utilizing a GPU on-premises, but stumbled upon strange, erroneous behaviour of the server.
Step 1. Run a demo, like:
Step 2. Call the endpoint in a loop 24 times like this:
The requests are somehow skipping the queue: I can see from the log that more than 4 jobs are running simultaneously.
Any suggestions?
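The 24-call loop described above can be reproduced by firing concurrent requests with a thread pool. Since the actual endpoint call is not shown here, the sketch below uses a dummy `call_endpoint` standing in for something like `gradio_client.Client.predict`:

```python
from concurrent.futures import ThreadPoolExecutor

def call_endpoint(i: int) -> str:
    # Dummy stand-in for the real API call (e.g. gradio_client's
    # Client.predict against the running demo); returns a fake result.
    return f"result-{i}"

def fire_requests(n: int = 24) -> list:
    # Fire all n requests at once, the same shape as the reported loop;
    # pool.map preserves submission order in its results.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(call_endpoint, range(n)))

if __name__ == "__main__":
    results = fire_requests()
    print(len(results))  # 24 requests completed
```

If the server's queue is working, the app-side log should never show more concurrent jobs than the configured limit, regardless of how many requests this client fires at once.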
Have you searched existing issues? 🔎
Reproduction
Follow the 2 steps in the description, then:
Screenshot
No response
Logs
Severity
Blocking usage of gradio