[GLUTEN-6960][VL] Limit Velox untracked global memory manager's usage #6988

Merged (7 commits) on Aug 26, 2024

zhztheplayer (Member) commented Aug 23, 2024

Limit the usage of Velox's global memory manager to 0.75 * Spark overhead memory by default. The overhead memory is calculated in the same manner as in vanilla Spark.
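
As a rough illustration of the default described above, the arithmetic could look like the following. This is a minimal sketch, not the code from this PR: the class and method names are hypothetical, and the 0.10 factor and 384 MiB minimum are assumed from vanilla Spark's defaults for JVM executors when spark.executor.memoryOverhead is not set explicitly.

```java
import java.util.OptionalLong;

// Hypothetical helper, for illustration only; not part of Gluten or Spark.
final class GlobalMemoryLimitSketch {
    static final long MIB = 1024L * 1024L;
    // Assumed vanilla Spark defaults for JVM executors.
    static final double OVERHEAD_FACTOR = 0.10;
    static final long MIN_OVERHEAD_BYTES = 384L * MIB;
    // Ratio stated in the PR description.
    static final double GLOBAL_MANAGER_RATIO = 0.75;

    // An explicit spark.executor.memoryOverhead wins; otherwise fall back to
    // max(factor * executor memory, minimum overhead).
    static long overheadBytes(long executorMemoryBytes, OptionalLong explicitOverheadBytes) {
        if (explicitOverheadBytes.isPresent()) {
            return explicitOverheadBytes.getAsLong();
        }
        long scaled = (long) (executorMemoryBytes * OVERHEAD_FACTOR);
        return Math.max(scaled, MIN_OVERHEAD_BYTES);
    }

    // Default cap for the untracked global memory manager.
    static long globalLimitBytes(long executorMemoryBytes, OptionalLong explicitOverheadBytes) {
        long overhead = overheadBytes(executorMemoryBytes, explicitOverheadBytes);
        return (long) (overhead * GLOBAL_MANAGER_RATIO);
    }

    public static void main(String[] args) {
        long limit = globalLimitBytes(10240L * MIB, OptionalLong.empty());
        System.out.println("global manager limit: " + (limit / MIB) + " MiB");
    }
}
```

Under these assumptions, a small executor falls back to the 384 MiB minimum overhead, so the global manager's cap would be 0.75 of that.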

Velox's global memory manager is used for global allocations that don't belong to a specific query or task. These allocations are not tracked by Spark's task memory manager.

After the patch, spark.memory.offHeap.size and spark.executor.memoryOverhead (probably together with spark.executor.memoryOverheadFactor / spark.executor.minMemoryOverhead on newer Spark versions) will work together for the Velox backend's memory management:

  1. spark.memory.offHeap.size limits the majority of the memory allocations that happen during Velox query execution, e.g., hash tables, sort buffers, and shuffle buffers.
  2. spark.executor.memoryOverhead limits the memory allocations that are hard for Spark's task memory manager to track, or that don't belong to a specific query, e.g., allocations that happen during spilling, or possibly the global cache (unimplemented).

Edit: As the change fails a couple of CI tests, this PR sets the internal Velox global pool size limit to Long.MaxValue by default, bypassing the check for some time until the relevant Velox bugs are fixed.
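
To illustrate why a limit of Long.MaxValue effectively disables the check, here is a minimal sketch of a capacity check. The class, method, and parameter names are hypothetical, and the real check lives inside Velox and may be structured differently.

```java
// Hypothetical capacity check, for illustration only.
final class GlobalPoolLimitCheckSketch {
    // Returns true if the request fits under the limit. Written as
    // `requested <= limit - used` to avoid overflow in `used + requested`;
    // assumes 0 <= usedBytes <= limitBytes.
    static boolean fits(long usedBytes, long requestedBytes, long limitBytes) {
        return requestedBytes <= limitBytes - usedBytes;
    }

    public static void main(String[] args) {
        long used = 6L << 30;     // 6 GiB already allocated
        long request = 1L << 30;  // 1 GiB requested

        // With a finite limit equal to current usage, the request is rejected.
        System.out.println(fits(used, request, 6L << 30));
        // With Long.MAX_VALUE as the limit, any realistic request passes.
        System.out.println(fits(used, request, Long.MAX_VALUE));
    }
}
```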

github-actions bot added the CORE and VELOX labels Aug 23, 2024
#6960

Run Gluten Clickhouse CI

zhztheplayer commented Aug 26, 2024

The CH failure doesn't seem to be related. cc @zzcclp

https://opencicd.kyligence.com/job/gluten/job/gluten-ci/11749

zhztheplayer merged commit 603faba into apache:main Aug 26, 2024
49 checks passed