
feat: Add a spark.comet.exec.memoryPool configuration for experimenting with various DataFusion memory pool setups. #1021

Open

Kontinuation wants to merge 12 commits into main
Conversation

@Kontinuation (Member) commented Oct 16, 2024

Which issue does this PR close?

This PR relates to issues #996 and #1004.

Rationale for this change

This change supports investigating various approaches to simplifying memory-related configuration and reducing the memory required to run large queries. cc @andygrove

What changes are included in this PR?

This PR adds a spark.comet.exec.memoryPool configuration that makes it easy to run queries under various memory pool setups (a sketch of how these values might map onto DataFusion pools follows the list):

  • greedy: Each operator has its own GreedyMemoryPool. This matches the current behavior.
  • fair_spill: Each operator has its own FairSpillPool.
  • greedy_task_shared (default): All operators for the same task attempt share one GreedyMemoryPool.
  • fair_spill_task_shared: All operators for the same task attempt share one FairSpillPool.
  • greedy_global: All operators in the same executor instance share one GreedyMemoryPool.
  • fair_spill_global: All operators in the same executor instance share one FairSpillPool.
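
To make the scoping concrete, here is a minimal sketch (not code from this PR) of how the six values could be decoded on the JVM side into a pool scope plus a fair-spill flag before the native DataFusion pool is created. MemoryPoolScope and poolScopeFor are hypothetical names introduced only for illustration:

```scala
// Hypothetical sketch: decoding spark.comet.exec.memoryPool into a pool scope
// and a flag indicating whether the native side should use a FairSpillPool
// (true) or a GreedyMemoryPool (false).
object MemoryPoolConfig {

  sealed trait MemoryPoolScope
  case object PerOperator extends MemoryPoolScope // one pool per native operator
  case object PerTask extends MemoryPoolScope     // shared by all operators of a task attempt
  case object Global extends MemoryPoolScope      // shared by all tasks on the executor

  // Returns (scope, usesFairSpillPool). Fair pools divide the budget among
  // consumers so memory-intensive operators can spill instead of failing.
  def poolScopeFor(memoryPool: String): (MemoryPoolScope, Boolean) = memoryPool match {
    case "greedy"                 => (PerOperator, false)
    case "fair_spill"             => (PerOperator, true)
    case "greedy_task_shared"     => (PerTask, false)
    case "fair_spill_task_shared" => (PerTask, true)
    case "greedy_global"          => (Global, false)
    case "fair_spill_global"      => (Global, true)
    case other =>
      throw new IllegalArgumentException(
        s"Unsupported spark.comet.exec.memoryPool value: $other")
  }
}
```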

How are these changes tested?

TODO: add tests running in native memory management mode.

@codecov-commenter commented Oct 16, 2024

Codecov Report

Attention: Patch coverage is 77.27273% with 5 lines in your changes missing coverage. Please review.

Project coverage is 34.27%. Comparing base (591f45a) to head (bd5a0c7).
Report is 1 commit behind head on main.

Files with missing lines                                Patch %   Lines
...ain/scala/org/apache/comet/CometExecIterator.scala   73.68%    2 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1021      +/-   ##
============================================
- Coverage     34.30%   34.27%   -0.03%     
- Complexity      887      889       +2     
============================================
  Files           112      112              
  Lines         43429    43502      +73     
  Branches       9623     9615       -8     
============================================
+ Hits          14897    14912      +15     
- Misses        25473    25542      +69     
+ Partials       3059     3048      -11     


@andygrove (Member)

Thanks for filing this @Kontinuation. I will close my PRs.

Also, there is a suggestion in #1017 for always using unified memory.

@Kontinuation (Member, Author)

> Thanks for filing this @Kontinuation. I will close my PRs.
>
> Also, there is a suggestion in #1017 for always using unified memory.

I agree that using the unified memory manager is a better approach. Vanilla Spark operators and Comet operators are then governed by the same memory manager and all use off-heap memory, so vanilla Spark operators can free some memory when Comet operators are under pressure. I'll also put more work into improving unified memory management.

I think the native memory management approach may still be relevant when users don't want vanilla Spark to use off-heap memory. We can set the default value of spark.comet.exec.memoryPool to greedy_task_shared to make Comet memory overhead configuration more intuitive. If users want memory-intensive operators to spill properly, they can try fair_spill_task_shared (see the usage sketch below).
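
For reference, a minimal usage sketch of selecting the fair-spill, task-shared pool from a Spark application. The plugin class name and the companion configs shown here are assumptions for illustration, not taken from this PR:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object CometMemoryPoolExample {
  def main(args: Array[String]): Unit = {
    // Assumed setup: Comet plugin enabled with native execution, selecting the
    // fair_spill_task_shared memory pool introduced by this PR.
    val spark = SparkSession.builder()
      .appName("comet-memory-pool-example")
      .config("spark.plugins", "org.apache.spark.CometPlugin")
      .config("spark.comet.exec.enabled", "true")
      .config("spark.comet.exec.memoryPool", "fair_spill_task_shared")
      .getOrCreate()

    // A small aggregation just to exercise the execution path.
    spark.range(1000000L)
      .groupBy((col("id") % 10).as("bucket"))
      .count()
      .show()
  }
}
```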

@Kontinuation Kontinuation marked this pull request as ready for review October 18, 2024 04:30