[serve] Replace `uuid.uuid4()` with `getrandbits` #49537

edoakes · 2025-01-02T15:18:51Z

Why are these changes needed?

Reduces CPU overhead (particularly on the proxy). This is less cryptographically secure but should be OK for our use case.

App:

from ray import serve

@serve.deployment(
    max_ongoing_requests=100,
    num_replicas=16,
    ray_actor_options={"num_cpus": 0},
)
class A:
    def __call__(self):
        return b"hi"

app = A.bind()

Benchmark:

ab -n 10000 -c 100 http://127.0.0.1:8000/

Before (~780 qps):

Concurrency Level:      100
Time taken for tests:   12.747 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1910000 bytes
HTML transferred:       120000 bytes
Requests per second:    784.47 [#/sec] (mean)
Time per request:       127.475 [ms] (mean)
Time per request:       1.275 [ms] (mean, across all concurrent requests)
Transfer rate:          146.32 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0      21
Processing:     5  127  35.7    127     305
Waiting:        3  125  35.8    126     304
Total:          5  127  35.6    128     306

Percentage of the requests served within a certain time (ms)
  50%    128
  66%    138
  75%    147
  80%    153
  90%    170
  95%    188
  98%    210
  99%    224
 100%    306 (longest request)

After (~820 qps):

Concurrency Level:      100
Time taken for tests:   12.130 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1910000 bytes
HTML transferred:       120000 bytes
Requests per second:    824.44 [#/sec] (mean)
Time per request:       121.295 [ms] (mean)
Time per request:       1.213 [ms] (mean, across all concurrent requests)
Transfer rate:          153.78 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.5      0       4
Processing:     6  121  30.1    124     230
Waiting:        4  119  30.2    123     228
Total:          7  121  30.0    124     230

Percentage of the requests served within a certain time (ms)
  50%    124
  66%    132
  75%    138
  80%    144
  90%    157
  95%    167
  98%    181
  99%    189
 100%    230 (longest request)

Related issue number

Checks

I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

GeneDer

## Why are these changes needed? Reduces CPU overhead (particularly on the proxy). This is less cryptographically secure but should be OK for our use case. App: ```python from ray import serve @serve.deployment( max_ongoing_requests=100, num_replicas=16, ray_actor_options={"num_cpus": 0}, ) class A: def __call__(self): return b"hi" app = A.bind() ``` Benchmark: ``` ab -n 10000 -c 100 http://127.0.0.1:8000/ ``` Before (~780 qps): ``` Concurrency Level: 100 Time taken for tests: 12.747 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 784.47 [#/sec] (mean) Time per request: 127.475 [ms] (mean) Time per request: 1.275 [ms] (mean, across all concurrent requests) Transfer rate: 146.32 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.6 0 21 Processing: 5 127 35.7 127 305 Waiting: 3 125 35.8 126 304 Total: 5 127 35.6 128 306 Percentage of the requests served within a certain time (ms) 50% 128 66% 138 75% 147 80% 153 90% 170 95% 188 98% 210 99% 224 100% 306 (longest request) ``` After (~820 qps): ``` Concurrency Level: 100 Time taken for tests: 12.130 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 824.44 [#/sec] (mean) Time per request: 121.295 [ms] (mean) Time per request: 1.213 [ms] (mean, across all concurrent requests) Transfer rate: 153.78 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.5 0 4 Processing: 6 121 30.1 124 230 Waiting: 4 119 30.2 123 228 Total: 7 121 30.0 124 230 Percentage of the requests served within a certain time (ms) 50% 124 66% 132 75% 138 80% 144 90% 157 95% 167 98% 181 99% 189 100% 230 (longest request) ``` ## Related issue number  ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

## Why are these changes needed? Reduces CPU overhead (particularly on the proxy). This is less cryptographically secure but should be OK for our use case. App: ```python from ray import serve @serve.deployment( max_ongoing_requests=100, num_replicas=16, ray_actor_options={"num_cpus": 0}, ) class A: def __call__(self): return b"hi" app = A.bind() ``` Benchmark: ``` ab -n 10000 -c 100 http://127.0.0.1:8000/ ``` Before (~780 qps): ``` Concurrency Level: 100 Time taken for tests: 12.747 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 784.47 [#/sec] (mean) Time per request: 127.475 [ms] (mean) Time per request: 1.275 [ms] (mean, across all concurrent requests) Transfer rate: 146.32 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.6 0 21 Processing: 5 127 35.7 127 305 Waiting: 3 125 35.8 126 304 Total: 5 127 35.6 128 306 Percentage of the requests served within a certain time (ms) 50% 128 66% 138 75% 147 80% 153 90% 170 95% 188 98% 210 99% 224 100% 306 (longest request) ``` After (~820 qps): ``` Concurrency Level: 100 Time taken for tests: 12.130 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 824.44 [#/sec] (mean) Time per request: 121.295 [ms] (mean) Time per request: 1.213 [ms] (mean, across all concurrent requests) Transfer rate: 153.78 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.5 0 4 Processing: 6 121 30.1 124 230 Waiting: 4 119 30.2 123 228 Total: 7 121 30.0 124 230 Percentage of the requests served within a certain time (ms) 50% 124 66% 132 75% 138 80% 144 90% 157 95% 167 98% 181 99% 189 100% 230 (longest request) ``` ## Related issue number  ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

## Why are these changes needed? Reduces CPU overhead (particularly on the proxy). This is less cryptographically secure but should be OK for our use case. App: ```python from ray import serve @serve.deployment( max_ongoing_requests=100, num_replicas=16, ray_actor_options={"num_cpus": 0}, ) class A: def __call__(self): return b"hi" app = A.bind() ``` Benchmark: ``` ab -n 10000 -c 100 http://127.0.0.1:8000/ ``` Before (~780 qps): ``` Concurrency Level: 100 Time taken for tests: 12.747 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 784.47 [#/sec] (mean) Time per request: 127.475 [ms] (mean) Time per request: 1.275 [ms] (mean, across all concurrent requests) Transfer rate: 146.32 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.6 0 21 Processing: 5 127 35.7 127 305 Waiting: 3 125 35.8 126 304 Total: 5 127 35.6 128 306 Percentage of the requests served within a certain time (ms) 50% 128 66% 138 75% 147 80% 153 90% 170 95% 188 98% 210 99% 224 100% 306 (longest request) ``` After (~820 qps): ``` Concurrency Level: 100 Time taken for tests: 12.130 seconds Complete requests: 10000 Failed requests: 0 Total transferred: 1910000 bytes HTML transferred: 120000 bytes Requests per second: 824.44 [#/sec] (mean) Time per request: 121.295 [ms] (mean) Time per request: 1.213 [ms] (mean, across all concurrent requests) Transfer rate: 153.78 [Kbytes/sec] received Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.5 0 4 Processing: 6 121 30.1 124 230 Waiting: 4 119 30.2 123 228 Total: 7 121 30.0 124 230 Percentage of the requests served within a certain time (ms) 50% 124 66% 132 75% 138 80% 144 90% 157 95% 167 98% 181 99% 189 100% 230 (longest request) ``` ## Related issue number  ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Signed-off-by: Roshan Kathawate <roshankathawate@gmail.com>

edoakes added 2 commits January 2, 2025 09:11

replace uuid.uuid4()

f9eaf61

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

replace uuid.uuid4()

bb3cedf

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

edoakes requested a review from zcin January 2, 2025 15:21

r

388e403

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

edoakes added the go add ONLY when ready to merge, run all tests label Jan 2, 2025

edoakes requested a review from GeneDer January 2, 2025 17:08

edoakes enabled auto-merge (squash) January 2, 2025 17:08

GeneDer approved these changes Jan 2, 2025

View reviewed changes

edoakes merged commit d7ad9a5 into ray-project:master Jan 2, 2025
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[serve] Replace `uuid.uuid4()` with `getrandbits` #49537

[serve] Replace `uuid.uuid4()` with `getrandbits` #49537

edoakes commented Jan 2, 2025 •

edited

Loading

GeneDer left a comment

[serve] Replace uuid.uuid4() with getrandbits #49537

[serve] Replace uuid.uuid4() with getrandbits #49537

Conversation

edoakes commented Jan 2, 2025 • edited Loading

Why are these changes needed?

Related issue number

Checks

GeneDer left a comment

Choose a reason for hiding this comment

[serve] Replace `uuid.uuid4()` with `getrandbits` #49537

[serve] Replace `uuid.uuid4()` with `getrandbits` #49537

edoakes commented Jan 2, 2025 •

edited

Loading