
Gradio REST API + bash curl always skips the queue #6350

Closed
zetyquickly opened this issue Nov 9, 2023 · 14 comments
Labels: bug (Something isn't working), pending clarification

Comments

@zetyquickly

Describe the bug

Hi,

First of all, thanks for such an amazing tool. This issue follows up on a thread in the Discord community.

I'm trying to set up an example project utilizing a GPU on premises, but I stumbled upon strange, erroneous behaviour of the server.

Step 1. Running a demo, like:

demo = gr.Interface(...
  concurrency_limit=4
)
demo.queue(max_size=None)
demo.launch(show_api=True, share=True, debug=True)

Step 2. Calling the endpoint in a loop 24 times like this:

...
curl -X POST -H 'Content-type: application/json' --data-binary "@<PAYLOAD_FILE>" "<URL>/run/predict" &>/dev/null &

The requests are somehow skipping the queue; I can see from the log that more than 4 jobs are running simultaneously.

Any suggestions?

Have you searched existing issues? 🔎

  • I have searched and found no existing issues

Reproduction

Follow the 2 steps in the description and run this:

#!/bin/bash

# Check if a URL argument is provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <base_url>"
    exit 1
fi

# Assign the first argument as the base URL
BASE_URL="$1"

# Construct the full URL by appending /run/predict
FULL_URL="${BASE_URL}/run/predict"

# Convert the image to base64 and store it in a variable
# Ensure the base64 command does not wrap lines by using -w 0
encoded_image=$(base64 -w 0 01.png)

# Save the JSON payload with the base64-encoded image to a temporary file
# Ensure that the JSON is valid and the base64 string is quoted properly
echo "{\"data\": [\"$encoded_image\", true, true, true, 2, 0.7]}" > /tmp/temp_payload.json

# Use curl with --data-binary to read the data from the temporary file
curl -X POST -H 'Content-type: application/json' --data-binary "@/tmp/temp_payload.json" "$FULL_URL"

# Optionally, clean up the temporary file
rm /tmp/temp_payload.json

# curl -X POST -H 'Content-type: application/json' --data '{ "data": ["01.png", true, true, true, 2, 0.7] }' 

Screenshot

No response

Logs

Short log:


This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Inp: /tmp/gradio/5a41ccbaf9ac40175393b89cbc2ad5456a8b829c/image.png True True 2.0 0.7
        image size: (838, 726, 3)
        detect 2 faces
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
        image size: (225, 400, 3)
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
        image size: (225, 225, 3)
        image size: (225, 400, 3)
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
        image size: (225, 225, 3)
        image size: (225, 400, 3)
        image size: (225, 225, 3)
        image size: (225, 225, 3)
        image size: (225, 225, 3)
        image size: (225, 400, 3)
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
        image size: (225, 400, 3)
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
        image size: (225, 225, 3)
        image size: (225, 400, 3)
Inp: /tmp/gradio/dadb51e9d77dafcf29cc78e579d77194a0a53a3d/image.png True True 2.0 0.7
        image size: (2048, 2048, 3)
        detect 1 faces
        detect 1 faces
        detect 1 faces
Global exception 
Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
        detect 1 faces
Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'


System Info

gradio==v3.41.2
conda
CPU + Nvidia GPU + torch >= 2.0.0

Severity

Blocking usage of gradio

zetyquickly added the bug (Something isn't working) label on Nov 9, 2023
@abidlabs
Member

abidlabs commented Nov 9, 2023

Hi @zetyquickly, yes, you are correct. Hitting the /run/predict endpoint does skip the queue in Gradio 3.x.

We've mitigated this behavior in Gradio 4.x -- now all requests will go through the queue.

You can send predictions to the API endpoint using the Python or JS clients, or if you want, using curl, though formatting the requests can be a bit annoying. Please see here for an example: #4932 (comment)

Let us know if you have any further questions!
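
For reference, here's a minimal sketch of calling the app through the Python client so that the request goes through the queue. The endpoint name and inputs below are assumed from the payload in this issue; adjust them to whatever `view_api()` reports for your app:

from gradio_client import Client

# Connect to the running app (local URL or share URL).
client = Client("<URL>")

# submit() enqueues the request and returns a Job object immediately,
# instead of POSTing straight to /run/predict.
job = client.submit(
    "01.png",                  # image input (path) -- assumed first input
    True, True, True, 2, 0.7,  # remaining inputs, assumed from the curl payload above
    api_name="/predict",       # assumed endpoint name; check client.view_api()
)

print(job.status())  # queue position / processing status
print(job.result())  # blocks until the prediction finishes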

@zetyquickly
Author

@abidlabs

  1. Thanks for the new versions! About Gradio 4.x: is it in the main or dev branch?
  2. Will try the Python client.

@zetyquickly
Author

Hey,

The Python client demonstrates the same behaviour; the queue is ignored:

import argparse
import base64
import json
import os
import requests
from concurrent.futures import ThreadPoolExecutor

def send_request(image_file):
    # Convert the image to base64
    with open(image_file, "rb") as image:
        encoded_image = base64.b64encode(image.read()).decode('utf-8')

    # Construct the JSON payload
    payload = json.dumps({
        "data": [encoded_image, True, True, True, 2, 0.7]
    })

    # Send the request
    headers = {'Content-type': 'application/json'}
    response = requests.post(full_url, headers=headers, data=payload)
    print(f"Response from server for image {image_file}: {response.status_code}")

if __name__ == "__main__":
    # Set up argument parser
    parser = argparse.ArgumentParser(description="Send multiple base64-encoded images to a Gradio app for prediction.")
    parser.add_argument('--base_url', help="The base URL of the Gradio app")

    # Parse arguments
    args = parser.parse_args()

    # Construct the full URL by appending /run/predict
    full_url = f"{args.base_url}/run/predict"

    # Define a list of image filenames
    data_path = ".."  # Parent directory
    image_files = [os.path.join(data_path, x) for x in ["01.png", "02.jpg", "03.jpg", "04.jpg"] * 6]  # Multiply the list as needed

    # Use ThreadPoolExecutor to send requests in parallel
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as executor:
        futures = [executor.submit(send_request, image_file) for image_file in image_files]

    # Wait for all requests to complete
    for future in futures:
        future.result()

    print("All requests have been sent.")

Output log:

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
        image size: (225, 225, 3)
Inp: /tmp/gradio/5a41ccbaf9ac40175393b89cbc2ad5456a8b829c/image.png True True 2.0 0.7
        image size: (838, 726, 3)
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
Inp: /tmp/gradio/bc39e1d56173191170fa063d2b634a9422f7ed7d/image.png True True 2.0 0.7
        image size: (225, 225, 3)
        image size: (225, 400, 3)
        image size: (225, 400, 3)
        detect 1 faces
Inp: /tmp/gradio/dadb51e9d77dafcf29cc78e579d77194a0a53a3d/image.png True True 2.0 0.7
Inp: /tmp/gradio/dadb51e9d77dafcf29cc78e579d77194a0a53a3d/image.png True True 2.0 0.7
        detect 2 faces
        image size: (2048, 2048, 3)
        image size: (2048, 2048, 3)
Inp: /tmp/gradio/5a41ccbaf9ac40175393b89cbc2ad5456a8b829c/image.png True True 2.0 0.7
        detect 1 faces
        image size: (838, 726, 3)
        detect 1 faces
        detect 1 faces
Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Traceback (most recent call last):
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1434, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1335, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/components/image.py", line 320, in postprocess
    raise ValueError("Cannot process this value as an Image")
ValueError: Cannot process this value as an Image
Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Traceback (most recent call last):
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1434, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1335, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/components/image.py", line 320, in postprocess
    raise ValueError("Cannot process this value as an Image")
ValueError: Cannot process this value as an Image
...

@zetyquickly
Author

The following does the trick of joining the queue:

import argparse
import os
import asyncio
from gradio_client import Client

# Create a function to run the synchronous submit in an executor
async def submit_request(client, image_file, api_name):
    loop = asyncio.get_running_loop()
    input_data = [image_file, True, True, True, 2, 0.7]  # Replace with actual expected inputs
    # Run the synchronous function in the default executor (a threadpool)
    job = await loop.run_in_executor(None, lambda: client.submit(*input_data, api_name=api_name))
    return job

# No changes needed here as this is already an async function
async def check_job_status(job):
    while not job.done():
        status = job.status()
        print(f"Job Status: {status.code}. Queue position: {status.rank} / {status.queue_size}. Estimated time: {status.eta}")
        await asyncio.sleep(1)  # Use asyncio.sleep for non-blocking sleep
    return job.result()

# Define the main function as an async function
async def main(base_url, image_files):
    client = Client(base_url)
    jobs = []
    for image_file in image_files:
        job = await submit_request(client, image_file, "/predict")
        jobs.append(job)
    results = await asyncio.gather(*[check_job_status(job) for job in jobs])
    return results

# Use asyncio.run to run the main function
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Send multiple images to a Gradio app for prediction asynchronously.")
    parser.add_argument('--base_url', required=True, help="The base URL of the Gradio app")
    args = parser.parse_args()

    data_path = ".."  # Parent directory
    image_files = [os.path.join(data_path, x) for x in ["01.png", "02.jpg", "03.jpg", "04.jpg"] * 6]

    # Run the main function with asyncio.run
    results = asyncio.run(main(args.base_url, image_files))

    print("All requests have been processed.")
    for result in results:
        print(result)

But! The jobs are processed one by one in the app.

@abidlabs
Member

abidlabs commented Nov 9, 2023

But! The jobs are processed one by one in the app.

Just to confirm, you've set concurrency_limit=4 in Interface?

@zetyquickly
Author

Yes,

demo = gr.Interface(
    inference, 
    ...
    concurrency_limit=4
)

DEBUG = os.getenv('DEBUG') == '1'
demo.queue(max_size=None)
demo.launch(debug=DEBUG, show_api=True, share=True)

@zetyquickly
Author

zetyquickly commented Nov 10, 2023

Hi,

Let me follow up with some observations. The docs suggest that the Interface.Queue.concurrency_count option is deprecated and won't have an effect.
In reality, when the app is set up this way:

demo = gr.Interface(
    inference, 
    ...
    concurrency_limit=2
)

demo.queue(max_size=None, concurrency_count=2)
demo.launch(debug=DEBUG, show_api=True, share=True)

The concurrent execution script from the comment above then produces logs that better resemble concurrent execution of 2 jobs:

Loaded as API: http://127.0.0.1:7860/ ✔
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.STARTING. Queue position: None / None. Estimated time: None
Job Status: Status.PROCESSING. Queue position: None / None. Estimated time: None
Job Status: Status.PROCESSING. Queue position: None / None. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 0 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 1 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 3 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 2 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 4 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 5 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 6 / 7. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 7 / 8. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 8 / 9. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 9 / 10. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 10 / 11. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 11 / 12. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 12 / 13. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 13 / 14. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 14 / 15. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 15 / 16. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 16 / 17. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 17 / 18. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 19 / 20. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 18 / 19. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 20 / 21. Estimated time: None
Job Status: Status.IN_QUEUE. Queue position: 21 / 22. Estimated time: None

Meanwhile, the example app I use seems to fail to parallelize with this set of options. I see the following log from the app:

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Inp: /tmp/gradio/619ea5ad0c52a14cb6df0e5fdf6b5bfd82be9cdd/image.png True True 2.0 0.7
        image size: (225, 225, 3)
Inp: /tmp/gradio/5a41ccbaf9ac40175393b89cbc2ad5456a8b829c/image.png True True 2.0 0.7
        image size: (838, 726, 3)
        detect 1 faces
        detect 2 faces
Error Calculated padded input size per channel: (2 x 222). Kernel size: (3 x 3). Kernel size can't be greater than actual input size
Global exception OpenCV(4.8.1) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

Traceback (most recent call last):
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1434, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/blocks.py", line 1335, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "/opt/conda/envs/fooocus/lib/python3.10/site-packages/gradio/components/image.py", line 320, in postprocess
    raise ValueError("Cannot process this value as an Image")
ValueError: Cannot process this value as an Image

I understand that it might be an issue with the app itself, and I'm going to try the same "high load" experiment with the hello_world gradio example.

But it's still unclear why concurrency_count still actually changes the behaviour of the app on gradio==3.41.2.

@freddyaboulton
Collaborator

But it's still unclear why concurrency_count still actually changes the behaviour of the app on gradio==3.41.2.

concurrency_count was deprecated starting in version 4.0!
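
For completeness, a rough sketch of the 4.x-style configuration (parameter names as I understand them from the 4.x docs -- please verify against your installed version): the per-endpoint limit moves to concurrency_limit, the queue-wide default to default_concurrency_limit, and concurrency_count goes away.

import time
import gradio as gr

# Hypothetical stand-in function, just to make the sketch self-contained.
def slow_echo(text):
    time.sleep(2)
    return text

demo = gr.Interface(
    fn=slow_echo,
    inputs="text",
    outputs="text",
    concurrency_limit=4,  # 4.x: at most 4 requests processed at once for this endpoint
)

# 4.x: concurrency_count is gone; the queue-wide default is set here instead.
demo.queue(max_size=None, default_concurrency_limit=1)
demo.launch(show_api=True, share=True)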

@zetyquickly
Author

I know. Is this unexpected behavior expected?

@abidlabs
Member

Hi @zetyquickly, I'm kind of lost in this thread -- is the main issue that concurrency_count is still having some effect, whereas it shouldn't because it's deprecated? Or is the main issue something else?

@zetyquickly
Author

That concurrency_count is still having some effect doesn't really matter.

I was able to achieve proper queue management on the latest version at that point in time.

There was another issue: instance variables of the worker are shared somehow, and OpenCV accesses the same piece of memory from two concurrent coroutines.
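
For anyone hitting the same OpenCV crash: a minimal sketch of the usual workaround, using a hypothetical worker class rather than this app's actual code. Either keep per-request data in locals instead of shared instance attributes, or guard the shared object with a lock:

import threading

import cv2  # the crash above comes from cv2.resize receiving something that is no longer a numpy array

class Worker:
    def __init__(self, model):
        self.model = model             # shared, read-mostly object
        self._lock = threading.Lock()  # option 1: serialize access to shared state

    def process(self, image):
        # option 2 (preferred): everything derived from the request stays local,
        # never assigned to self.*, so concurrent requests can't overwrite each other
        resized = cv2.resize(image, (512, 512))
        with self._lock:
            # only the truly shared, non-reentrant part runs under the lock
            return self.model(resized)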

@abidlabs
Member

That concurrency_count is still having some effect doesn't really matter.

I was able to achieve proper queue management on the latest version at that point in time.

Hi @zetyquickly, in that case I'll close this issue.

There was another issue: instance variables of the worker are shared somehow, and OpenCV accesses the same piece of memory from two concurrent coroutines.

Would you be able to create a new issue for this please?

@zetyquickly
Author

That issue is also minor; I gave up on using gradio anyway.

Here is the major showstopper -> #6319 (comment)

@abidlabs
Member

Got it, thanks for the pointer. Sorry that the frustration caused you to stop using gradio -- we're working on getting these issues resolved as soon as we can, and actually just opened a PR for #6319.
