forked from ray-project/ray
[serve] collect request metrics at handles (ray-project#42578)
Collect request metrics for autoscaling at handles instead of replicas. This allows metrics for queued requests to be taken into account, not just ongoing requests.

(3x) DeploymentHandle streaming throughput (ASYNC) (num_replicas=1, tokens_per_request=1000, batch_size=10)

| Collect on handles | Original |
| --- | --- |
| 12140.58 ± 431.94 tokens/s | 12365.42 ± 353.48 tokens/s |
| 12119.03 ± 255.58 tokens/s | 12395.43 ± 642.92 tokens/s |
| 12168.44 ± 653.06 tokens/s | 12365.76 ± 680.51 tokens/s |

(5x) HTTP streaming throughput (num_replicas=1, tokens_per_request=1000, batch_size=10, use_intermediate_deployment=False)

| Collect on handles | Original |
| --- | --- |
| 141118.93 ± 103940.74 tokens/s | 143454.65 ± 103615.47 tokens/s |
| 225063.66 ± 5347.35 tokens/s | 228244.37 ± 6209.89 tokens/s |
| 225684.1 ± 3262.97 tokens/s | 221354.73 ± 2928.82 tokens/s |
| 220755.65 ± 6837.1 tokens/s | 188224.32 ± 78546.09 tokens/s |
| 221404.26 ± 3427.73 tokens/s | 223172.25 ± 4064.79 tokens/s |

(4x) DeploymentHandle throughput (num_replicas=1, batch_size=100)

| Collect on handles | Original |
| --- | --- |
| 1766.19 ± 15.25 requests/s | 1819.54 ± 5.26 requests/s |
| 1760.92 ± 51.87 requests/s | 1762.08 ± 21.04 requests/s |
| 1796.22 ± 10.52 requests/s | 1750.08 ± 31.73 requests/s |
| 1788.1 ± 29.98 requests/s | 1779.63 ± 24.86 requests/s |

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
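To illustrate why collecting at handles can matter for autoscaling, here is a minimal sketch in plain Python. It is not Ray Serve's implementation: `HandleRequestMetrics`, its methods, `desired_replicas`, and the `target_per_replica` parameter are all hypothetical names chosen for this example. The idea it shows is that a handle can see requests that are still queued, while a replica can only report requests it is already processing.

```python
import math
from dataclasses import dataclass


@dataclass
class HandleRequestMetrics:
    """Hypothetical per-handle counters (not Ray Serve's actual class)."""

    queued: int = 0   # requests waiting at the handle for a replica
    ongoing: int = 0  # requests already dispatched to a replica

    def on_enqueue(self) -> None:
        self.queued += 1

    def on_dispatch(self) -> None:
        # A queued request has been assigned to a replica.
        self.queued -= 1
        self.ongoing += 1

    def on_complete(self) -> None:
        self.ongoing -= 1

    def total(self) -> int:
        # Queued plus ongoing: the load visible only from the handle side.
        return self.queued + self.ongoing


def desired_replicas(handles, target_per_replica: int) -> int:
    """Assumed scaling rule: total load divided by a per-replica target."""
    total = sum(h.total() for h in handles)
    return max(1, math.ceil(total / target_per_replica))
```

For example, if a handle has enqueued five requests and dispatched two of them, it reports three queued and two ongoing. With `target_per_replica=2`, this sketch asks for three replicas, whereas counting only the two ongoing requests would ask for one.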
Showing 9 changed files with 273 additions and 51 deletions.