Various enhancements to the Gradio Ray Serve tutorial #50276
base: master
* adds missing pip requirements
* improves the overall narrative flow of the tutorial
* clarifies that there are 2 separate examples within the same page
* clarifies the difference between the two methods

Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
@zcin to review
Just some style nits.
## Quickstart: Deploy your Gradio app with Ray Serve
## Example 1: Scaling up your Gradio app with `GradioServer`

The first example summarizes text using the [T5 Small](https://huggingface.co/t5-small) model and uses [Hugging Face's Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) to access that model. It demonstrates one easy way to deploy Gradio apps onto Ray Serve: using the simple `GradioServer` wrapper. Later, in example 2, we will show how to use `GradioIngress` for more customized use-cases.
The first example summarizes text using the [T5 Small](https://huggingface.co/t5-small) model and uses [Hugging Face's Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) to access that model. It demonstrates one easy way to deploy Gradio apps onto Ray Serve: using the simple `GradioServer` wrapper. Later, in example 2, we will show how to use `GradioIngress` for more customized use-cases.
The first example summarizes text using the [T5 Small](https://huggingface.co/t5-small) model and uses [Hugging Face's Pipelines](https://huggingface.co/docs/transformers/main_classes/pipelines) to access that model. It demonstrates one easy way to deploy Gradio apps onto Ray Serve: using the simple `GradioServer` wrapper. Example 2 shows how to use `GradioIngress` for more customized use-cases.
Avoid first person and future tense when possible.
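For readers following the thread, the kind of builder function this paragraph describes could look roughly like the sketch below. It assumes `transformers` and `gradio` are installed. The names `extract_summary` and `gradio_summarizer_builder` are hypothetical and chosen for illustration; the tutorial's actual code is in the included `gradio-integration.py` file, which is not reproduced here.

```python
def extract_summary(summary_list):
    """Return the text of the first result from a summarization pipeline.

    Hugging Face summarization pipelines return a list of dicts,
    each carrying a "summary_text" key.
    """
    return summary_list[0]["summary_text"]


def gradio_summarizer_builder():
    """Hypothetical builder: a Gradio interface around the T5 Small summarizer.

    Imports are deferred so this module can be imported without loading
    the model; that is an illustrative choice, not a requirement.
    """
    import gradio as gr
    from transformers import pipeline

    summarizer = pipeline("summarization", model="t5-small")

    def model(text):
        # Run the pipeline and keep only the summary string.
        return extract_summary(summarizer(text))

    return gr.Interface(
        fn=model,
        inputs=[gr.Textbox(label="Input prompt")],
        outputs=[gr.Textbox(label="Model output")],
    )
```

Calling `gradio_summarizer_builder()` downloads the model on first use, so nothing heavy happens until the builder is invoked.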
```{literalinclude} ../doc_code/gradio-integration.py
:start-after: __doc_gradio_app_begin__
:end-before: __doc_gradio_app_end__
```

### Deploying Gradio Server
In order to deploy your Gradio app onto Ray Serve, you need to wrap your Gradio app in a Serve [deployment](serve-key-concepts-deployment). `GradioServer` acts as that wrapper. It serves your Gradio app remotely on Ray Serve so that it can process and respond to HTTP requests.
In order to deploy your Gradio app onto Ray Serve, you need to wrap your Gradio app in a Serve [deployment](serve-key-concepts-deployment). `GradioServer` acts as that wrapper. It serves your Gradio app remotely on Ray Serve so that it can process and respond to HTTP requests.
To deploy your Gradio app onto Ray Serve, you need to wrap the Gradio app in a Serve [deployment](serve-key-concepts-deployment). `GradioServer` acts as that wrapper. It serves your Gradio app remotely on Ray Serve so that it can process and respond to HTTP requests.
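As a rough illustration of the wrapping step this paragraph describes, the sketch below binds a builder function to `GradioServer`. It assumes `ray[serve]` and `gradio` are installed and that `GradioServer` is importable from `ray.serve.gradio_integrations`. The names `echo_builder` and `make_app`, and the `num_cpus` default, are hypothetical and not taken from the tutorial.

```python
def echo_builder():
    """Hypothetical builder: a trivial Gradio app that echoes its input."""
    import gradio as gr

    return gr.Interface(fn=lambda text: text, inputs="text", outputs="text")


def make_app(builder=echo_builder, num_cpus=1):
    """Wrap a Gradio builder function in the GradioServer deployment.

    The import is deferred so this sketch can be loaded without Ray;
    the CPU count is an assumed example value.
    """
    from ray.serve.gradio_integrations import GradioServer

    # GradioServer receives the builder itself and calls it inside the
    # deployment, so the Gradio app is constructed on the Serve side.
    return GradioServer.options(
        ray_actor_options={"num_cpus": num_cpus}
    ).bind(builder)
```

In a real script, the result would typically be assigned to a module-level name such as `app = make_app()` so that Serve can locate it by import path.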
@@ -59,7 +63,7 @@ Using either the Gradio app `io`, which the builder function constructed, or you
:end-before: __doc_app_end__
``` | |||

Finally, deploy your Gradio Server. Run the following in your terminal:
Finally, deploy your Gradio Server. Run the following in your terminal (assuming that you saved the file as `demo.py`):
Finally, deploy your Gradio Server. Run the following in your terminal (assuming that you saved the file as `demo.py`):
Finally, deploy your Gradio Server. Run the following in your terminal, assuming that you saved the file as `demo.py`:
Avoid excessive use of parentheses.
Suppose you want to run the following program.

1. Take two text generation models, [`gpt2`](https://huggingface.co/gpt2) and [`distilgpt2`](https://huggingface.co/distilgpt2).
2. Run the two models on the same input text, so that the generated text has a minimum length of 20 and maximum length of 100.
3. Display the outputs of both models using Gradio.

Let's compare an unparallelized approach using vanilla Gradio to a parallelized approach using Ray Serve.
Let's compare an unparallelized approach using vanilla Gradio to a parallelized approach using Ray Serve.
The following is a comparison of an unparallelized approach using vanilla Gradio to a parallelized approach using Ray Serve.
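The difference between the two approaches can be illustrated without any model at all. The sketch below is a standard-library stand-in, not the tutorial's Ray Serve code: the "models" are hypothetical functions that sleep briefly to simulate inference, and a thread pool plays the role that separate Serve deployments would play. It only shows why running both models on the same input concurrently can take about as long as the slower one, rather than the sum of both.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_fake_generator(name, delay_s=0.05):
    """Return a stand-in for a text generation model (hypothetical)."""

    def generate(text, min_length=20, max_length=100):
        time.sleep(delay_s)  # simulate inference latency
        padded = f"[{name}] {text}".ljust(min_length, ".")
        return padded[:max_length]

    return generate


MODELS = {
    "gpt2": make_fake_generator("gpt2"),
    "distilgpt2": make_fake_generator("distilgpt2"),
}


def run_sequential(text):
    """Unparallelized: call each model one after the other."""
    return [generate(text) for generate in MODELS.values()]


def run_parallel(text):
    """Parallelized: submit both calls at once, then gather in order."""
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = [pool.submit(generate, text) for generate in MODELS.values()]
        return [future.result() for future in futures]


def timed(fn, text):
    """Run fn(text) and return (outputs, elapsed seconds)."""
    start = time.perf_counter()
    outputs = fn(text)
    return outputs, time.perf_counter() - start


prompt = "My tail is short so"
seq_out, seq_s = timed(run_sequential, prompt)
par_out, par_s = timed(run_parallel, prompt)
print(f"sequential: {seq_s:.3f}s, parallel: {par_s:.3f}s")
```

Both functions return the same outputs in the same order; only the wall-clock time differs. With real models on Ray Serve, the parallelism would come from separate deployments rather than threads.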
Why are these changes needed?
Related issue number
N/A