Fix reviews
rickyma committed Apr 16, 2024
1 parent fe0fd42 commit 23e2267
Showing 1 changed file with 26 additions and 25 deletions.
51 changes: 26 additions & 25 deletions docs/benchmark_netty.md → docs/benchmark_netty_case_report.md
@@ -87,28 +87,28 @@ select SUM(IFNULL(CAST(ss_sold_time_sk AS DECIMAL(10, 2)), 0) + IFNULL(CAST(ss_i

Total: Read 10.7TiB, Write 6.4TiB

-| Concurrent Tasks | Type | Single Shuffle Server Write Speed | Single Shuffle Server Read Speed | Tasks Total Time | Netty(SSD) Performance Improvement | Notes |
-|------------------|---------------|-----------------------------------|----------------------------------|------------------|------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| 1400 | Netty(SSD) | 0.93GB/s | 1.56GB/s | 268.7h | - | |
-| | gRPC(SSD) | 0.75GB/s | 1.25GB/s | 330.4h | 18.67% | |
-| | Netty(HDD) | 0.24GB/s | 0.4GB/s | 1024.4h | 73.77% | |
-| | Spark ESS | 0.5GB/s | 0.82GB/s | 525.5h | 48.88% | |
-| | Vanilla Spark | - | - | __*Failed*__ | - | |
-| 2800 | Netty(SSD) | 1.02GB/s | 1.70GB/s | 450.7h | - | |
-| | gRPC(SSD) | 0.86GB/s | 1.44GB/s | 566.4h | 20.42% | |
-| | Netty(HDD) | 0.24GB/s | 0.4GB/s | 2009.9h | 77.6% | |
-| | Spark ESS | 0.5GB/s | 0.68GB/s | 672.3h | 32.96% | |
-| | Vanilla Spark | - | - | __*Failed*__ | - | |
-| 5600 | Netty(SSD) | 1.02GB/s | 1.70GB/s | 896.2h | - | |
-| | gRPC(SSD) | 0.80GB/s | 1.34GB/s | 1145.1h | 21.72% | |
-| | Netty(HDD) | 0.22GB/s | 0.36GB/s | 4671.3h | 80.8% | |
-| | Spark ESS | - | - | __*Failed*__ | - | |
-| | Vanilla Spark | - | - | __*Failed*__ | - | |
-| 11200 | Netty(SSD) | 0.86GB/s | 1.44GB/s | 1783.1h | - | |
-| | gRPC(SSD) | 0.62GB/s | 1.04GB/s | 2028.2h | 12.08% | At 11200 concurrency, gRPC requires reducing `rss.rpc.executor.size` to 200 to run tasks successfully. Shuffle Server memory usage and CPU load are higher in gRPC mode than in Netty mode. Not recommended. |
-| | Netty(HDD) | 0.20GB/s | 0.34GB/s | 8716.5h | 79.5% | |
-| | Spark ESS | - | - | __*Failed*__ | - | |
-| | Vanilla Spark | - | - | __*Failed*__ | - | |
+| Concurrent Tasks | Type | Single Shuffle Server Write Speed | Single Shuffle Server Read Speed | Tasks Total Time | Netty(SSD) Speedup | Netty(SSD) Total Task Time Reduction | Notes |
+|------------------|---------------|-----------------------------------|----------------------------------|------------------|--------------------|--------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 1400 | Netty(SSD) | 0.93GB/s | 1.56GB/s | 268.7h | - | - | |
+| | gRPC(SSD) | 0.75GB/s | 1.25GB/s | 330.4h | 123.02% | 18.67% | |
+| | Netty(HDD) | 0.24GB/s | 0.4GB/s | 1024.4h | 381.12% | 73.77% | |
+| | Spark ESS | 0.5GB/s | 0.82GB/s | 525.5h | 195.56% | 48.88% | |
+| | Vanilla Spark | - | - | __*Failed*__ | - | - | |
+| 2800 | Netty(SSD) | 1.02GB/s | 1.70GB/s | 450.7h | - | - | |
+| | gRPC(SSD) | 0.86GB/s | 1.44GB/s | 566.4h | 125.64% | 20.42% | |
+| | Netty(HDD) | 0.24GB/s | 0.4GB/s | 2009.9h | 445.83% | 77.6% | |
+| | Spark ESS | 0.5GB/s | 0.68GB/s | 672.3h | 149.19% | 32.96% | |
+| | Vanilla Spark | - | - | __*Failed*__ | - | - | |
+| 5600 | Netty(SSD) | 1.02GB/s | 1.70GB/s | 896.2h | - | - | |
+| | gRPC(SSD) | 0.80GB/s | 1.34GB/s | 1145.1h | 127.74% | 21.72% | |
+| | Netty(HDD) | 0.22GB/s | 0.36GB/s | 4671.3h | 520.98% | 80.8% | |
+| | Spark ESS | - | - | __*Failed*__ | - | - | |
+| | Vanilla Spark | - | - | __*Failed*__ | - | - | |
+| 11200 | Netty(SSD) | 0.86GB/s | 1.44GB/s | 1783.1h | - | - | |
+| | gRPC(SSD) | 0.62GB/s | 1.04GB/s | 2028.2h | 113.74% | 12.08% | At 11200 concurrency, gRPC requires reducing `rss.rpc.executor.size` to 200 to run tasks successfully. Shuffle Server memory usage and CPU load are higher in gRPC mode than in Netty mode. Not recommended. |
+| | Netty(HDD) | 0.20GB/s | 0.34GB/s | 8716.5h | 488.61% | 79.5% | |
+| | Spark ESS | - | - | __*Failed*__ | - | - | |
+| | Vanilla Spark | - | - | __*Failed*__ | - | - | |

Note:

@@ -123,10 +123,11 @@ Netty(SSD) Performance Improvement = (Tasks Total Time - Tasks Total Time( Netty

We can draw the following conclusions:

-1. At 1400 concurrency, Vanilla Spark is already unable to complete tasks successfully, and at 5600 concurrency, Spark
+1. At 1400 concurrency, Vanilla Spark is already incapable of successfully completing tasks, and at 5600 concurrency,
+Spark
ESS also fails to complete tasks. However, whether it is HDD or SSD, and whether it is gRPC mode or Netty mode,
Uniffle can all run normally. **Uniffle can significantly improve job stability in high-pressure scenarios**.
-2. When comparing using SSDs, **Netty mode brings about a 20% performance improvement compared to gRPC mode**.
-3. When comparing with Netty mode turned on, **SSD brings about an 80% performance improvement compared to HDD**.
+2. When comparing using SSDs, **Netty mode reduces total task time by about 20% compared to gRPC mode**.
+3. When comparing with Netty mode turned on, **SSD reduces total task time by about 80% compared to HDD**.
4. **Above 11200 concurrency, it is not recommended to use gRPC mode**, as gRPC mode will cause the machine's load
to be much higher than Netty mode, and the Shuffle Server's process will consume more memory on the machine.
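
The percentage columns can be checked against the Tasks Total Time column. The following is a minimal sketch, assuming the total task time reduction is `(T - T_netty_ssd) / T`, which matches the visible part of the truncated formula in the note and the 1400-concurrency rows. The function names are hypothetical, and the speedup ratio `T / T_netty_ssd` is an assumption: it reproduces the Speedup column only approximately, presumably because the published figures were computed from unrounded times.

```python
# Tasks Total Time (hours) at 1400 concurrency, copied from the table.
NETTY_SSD_HOURS = 268.7
OTHER_HOURS = {
    "gRPC(SSD)": 330.4,
    "Netty(HDD)": 1024.4,
    "Spark ESS": 525.5,
}


def time_reduction_pct(other_hours: float, netty_ssd_hours: float) -> float:
    """Share of the other setup's total task time saved by Netty(SSD), in percent.

    Assumed definition: (T - T_netty_ssd) / T.
    """
    return (other_hours - netty_ssd_hours) / other_hours * 100.0


def speedup_pct(other_hours: float, netty_ssd_hours: float) -> float:
    """Ratio of the other setup's total task time to Netty(SSD)'s, in percent.

    Assumed definition: T / T_netty_ssd (not stated in the visible text).
    """
    return other_hours / netty_ssd_hours * 100.0


if __name__ == "__main__":
    for name, hours in OTHER_HOURS.items():
        reduction = time_reduction_pct(hours, NETTY_SSD_HOURS)
        ratio = speedup_pct(hours, NETTY_SSD_HOURS)
        print(f"{name}: reduction {reduction:.2f}%, speedup {ratio:.2f}%")
```

The two figures describe the same measurement from different sides: a reduction of `r` percent corresponds to a speedup ratio of `100 / (100 - r)`, so the roughly 80% reduction for SSD over HDD is a roughly fivefold difference in total task time.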
