what is the difference between blaze+ESS and blaze+Celeborn? #852
Unanswered
howardli9175 asked this question in Q&A
Replies: 1 comment
Have you tested the following two setups?
I think the difference between ESS and Celeborn is this: with ESS, Spark writes shuffle data to local disk; with Celeborn, Spark writes shuffle data to a remote server. So the performance is really determined by the disk IOPS/throughput (for both ESS and Celeborn) and the network bandwidth.
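To make that reasoning concrete, here is a rough back-of-envelope sketch. It is not a measurement and not how Spark, Blaze, or Celeborn account for time internally; it only assumes that the slowest stage of the shuffle write path (local disk for ESS, or the slower of network and remote disk for Celeborn) sets a lower bound on write time. The function name and all numbers are hypothetical.

```python
def shuffle_write_seconds(shuffle_bytes, disk_bytes_per_s, network_bytes_per_s=None):
    """Lower-bound estimate of shuffle write time.

    Assumption: the write path is a pipeline and its slowest stage dominates.
    Pass network_bytes_per_s=None for a local-disk write (the ESS case).
    """
    if network_bytes_per_s is None:
        bottleneck = disk_bytes_per_s
    else:
        bottleneck = min(disk_bytes_per_s, network_bytes_per_s)
    return shuffle_bytes / bottleneck


# Hypothetical numbers, chosen only for illustration.
SHUFFLE_BYTES = 100 * 10**9      # 100 GB of shuffle data
DISK_THROUGHPUT = 2 * 10**9      # 2 GB/s disk, local or on the remote worker
NETWORK_BANDWIDTH = 125 * 10**7  # 1.25 GB/s, roughly a 10 Gbit/s link

ess_seconds = shuffle_write_seconds(SHUFFLE_BYTES, DISK_THROUGHPUT)
celeborn_seconds = shuffle_write_seconds(SHUFFLE_BYTES, DISK_THROUGHPUT, NETWORK_BANDWIDTH)

if __name__ == "__main__":
    print(f"ESS (local disk only):     {ess_seconds:.1f} s")
    print(f"Celeborn (network + disk): {celeborn_seconds:.1f} s")
```

With these made-up figures the remote path is network-bound, which is one way a blaze+Celeborn run could lose the gain seen with blaze+ESS. Real numbers for the cluster's disks and links would be needed to say whether that is what happened here.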
As I understand it, Celeborn is similar to ESS (external shuffle service); the only difference is that Celeborn moves the shuffle service out of the NodeManager into an external service.
However, in our TPC-DS 1000 GB test, we found that the query time of blaze+ESS is 20% lower than that of native Spark, whereas the query time of blaze+Celeborn is not significantly lower than that of native Spark and is even slightly higher.
The following is the test environment:
The following are the key configurations:
blaze+ESS:
spark.blaze.enable true
spark.sql.extensions org.apache.spark.sql.blaze.BlazeSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager
spark.memory.offHeap.enabled false
blaze+Celeborn:
spark.shuffle.manager org.apache.spark.sql.execution.blaze.shuffle.celeborn.BlazeCelebornShuffleManager
spark.celeborn.master.endpoints xxxxx:9097
spark.celeborn.client.spark.shuffle.writer hash
spark.celeborn.client.push.replicate.enabled false
spark.blaze.enable true
spark.sql.extensions org.apache.spark.sql.blaze.BlazeSparkSessionExtension
spark.memory.offHeap.enabled false
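For anyone trying to reproduce the comparison, a small sketch like the one below can confirm that the two runs differ only in the shuffle-related keys. It assumes the first four configuration lines above belong to the blaze+ESS run and the remaining seven to the blaze+Celeborn run; the keys and values are copied from the post (including the redacted `xxxxx:9097` endpoint), while the helper names `diff_configs` and `to_submit_args` are made up for illustration.

```python
# Settings that appear in both runs (copied from the post).
COMMON = {
    "spark.blaze.enable": "true",
    "spark.sql.extensions": "org.apache.spark.sql.blaze.BlazeSparkSessionExtension",
    "spark.memory.offHeap.enabled": "false",
}

# Assumed blaze+ESS run: common settings plus the Blaze shuffle manager.
ESS = {
    **COMMON,
    "spark.shuffle.manager":
        "org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager",
}

# Assumed blaze+Celeborn run: common settings plus the Celeborn-specific keys.
CELEBORN = {
    **COMMON,
    "spark.shuffle.manager":
        "org.apache.spark.sql.execution.blaze.shuffle.celeborn.BlazeCelebornShuffleManager",
    "spark.celeborn.master.endpoints": "xxxxx:9097",  # redacted in the post
    "spark.celeborn.client.spark.shuffle.writer": "hash",
    "spark.celeborn.client.push.replicate.enabled": "false",
}


def diff_configs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


def to_submit_args(conf):
    """Render a config dict as a list of spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    for key, (ess_value, celeborn_value) in diff_configs(ESS, CELEBORN).items():
        print(f"{key}\n  ESS:      {ess_value}\n  Celeborn: {celeborn_value}")
```

Keeping the shared settings in one place makes it harder for an unrelated difference between the two runs to slip in and skew the timing comparison.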