
SortMergeJoinExec fails to allocate memory but should spill instead #2452

@andygrove


Describe the bug

In some configurations/environments, I see queries fail due to memory pool requests being rejected, but I would expect Comet to spill to disk instead.

In one example, I am running TPC-H @ SF=1000 (1TB) in k8s. I am specifying spark.comet.exec.replaceSortMergeJoin=false to force the use of CometSortMergeJoinExec.

    --conf spark.executor.instances=4 \
    --conf spark.executor.cores=8 \
    --conf spark.executor.memory=8G \
    --conf spark.memory.offHeap.enabled=true \
    --conf spark.memory.offHeap.size=4g \

I allocated 4 GB of off-heap memory, which equates to 512 MB per core.

I saw memory requests fail with the memory pool limit at ~512MB.
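The per-core figure above can be checked with a short sketch. It assumes the off-heap pool is split evenly across executor cores, which is consistent with the ~512MB limit observed here but is not confirmed as Comet's actual policy. The helper names `parse_size_mb` and `per_core_mb` are hypothetical and exist only for illustration.

```python
def parse_size_mb(size: str) -> int:
    """Convert a Spark-style size string such as '4g' or '512m' to MiB."""
    units = {"m": 1, "g": 1024}
    value, unit = size[:-1], size[-1].lower()
    return int(value) * units[unit]


def per_core_mb(off_heap_size: str, executor_cores: int) -> float:
    # Assumption: the off-heap pool is divided evenly across executor cores.
    return parse_size_mb(off_heap_size) / executor_cores


# Values taken from the configuration above:
# spark.memory.offHeap.size=4g, spark.executor.cores=8
print(per_core_mb("4g", 8))
```

Under the same assumption, any change to `spark.memory.offHeap.size` or `spark.executor.cores` scales the per-core limit proportionally.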

I then doubled the off-heap memory to 8 GB and still see the same issue, although the pool limit is now ~1GB. I would expect spilling to kick in instead of the allocation failing.

org.apache.comet.CometNativeException: Additional allocation failed with top memory consumers (across reservations) as:
  ExternalSorter[107]#2991(can spill: true) consumed 1024.2 MB,
  ExternalSorterMerge[107]#2990(can spill: false) consumed 16.7 MB,
  GroupedHashAggregateStream[107] ()#2994(can spill: true) consumed 0.0 B,
  GroupedHashAggregateStream[107] ()#2995(can spill: true) consumed 0.0 B,
  ExternalSorterMerge[107]#2992(can spill: false) consumed 0.0 B,
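To make the consumer list easier to read, here is a minimal sketch that parses the lines above and totals memory by whether the consumer reports `can spill: true`. The regular expression and the unit table are assumptions based only on the message format shown in this issue. The function `parse_consumers` is a hypothetical helper, not part of Comet or DataFusion.

```python
import re

# Consumer lines copied from the exception message above.
ERROR_TEXT = """\
  ExternalSorter[107]#2991(can spill: true) consumed 1024.2 MB,
  ExternalSorterMerge[107]#2990(can spill: false) consumed 16.7 MB,
  GroupedHashAggregateStream[107] ()#2994(can spill: true) consumed 0.0 B,
  GroupedHashAggregateStream[107] ()#2995(can spill: true) consumed 0.0 B,
  ExternalSorterMerge[107]#2992(can spill: false) consumed 0.0 B,
"""

# Assumed line shape: Name[partition] ...#id(can spill: bool) consumed N UNIT
LINE_RE = re.compile(
    r"^\s*(?P<name>\S+?)\[\d+\].*?#(?P<id>\d+)"
    r"\(can spill: (?P<spill>true|false)\) "
    r"consumed (?P<amount>[\d.]+) (?P<unit>B|KB|MB|GB)"
)

# Conversion factors to MB (assumed binary units).
TO_MB = {"B": 1 / (1024 * 1024), "KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def parse_consumers(text: str) -> list[tuple[str, int, bool, float]]:
    """Return (name, reservation id, can_spill, consumed MB) per consumer line."""
    consumers = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like consumer entries
        consumers.append((
            match["name"],
            int(match["id"]),
            match["spill"] == "true",
            float(match["amount"]) * TO_MB[match["unit"]],
        ))
    return consumers


consumers = parse_consumers(ERROR_TEXT)
spillable_mb = sum(mb for _, _, can_spill, mb in consumers if can_spill)
unspillable_mb = sum(mb for _, _, can_spill, mb in consumers if not can_spill)
print(f"spillable: {spillable_mb:.1f} MB, not spillable: {unspillable_mb:.1f} MB")
```

For this message, nearly all of the reserved memory belongs to the `ExternalSorter` reservation that reports `can spill: true`, which is why spilling seems like the expected outcome here.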

I also see pods being killed due to OOM:

NAME                                                        READY   STATUS              RESTARTS   AGE
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-1   0/1     OOMKilled           0          11m
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-3   0/1     OOMKilled           0          11m
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-4   0/1     OOMKilled           0          11m
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-5   1/1     Running             0          4s
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-6   1/1     Running             0          4s
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-7   1/1     Running             0          3s
comet-benchmark-derived-from-tpch-a1133f997d91851b-exec-8   0/1     ContainerCreating   0          1s

I also see errors in the executor logs:

25/09/24 21:24:57 WARN ExecutionMemoryPool: Internal error: release called on 917504 bytes but task only has 0 bytes of memory from the off-heap execution pool
25/09/24 21:24:57 WARN ExecutionMemoryPool: Internal error: release called on 839664 bytes but task only has 0 bytes of memory from the off-heap execution pool
25/09/24 21:24:57 WARN ExecutionMemoryPool: Internal error: release called on 917504 bytes but task only has 0 bytes of memory from the off-heap execution pool

Some related Spark logging:

25/09/24 21:27:20 INFO TaskMemoryManager: Memory used in task 20632
25/09/24 21:27:20 INFO TaskMemoryManager: 1073741824 bytes of memory were used by task 20632 but are not associated with specific consumers
25/09/24 21:27:20 INFO TaskMemoryManager: 5677121232 bytes of memory are used for execution and 962961 bytes of memory are used for storage
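The byte counts in these log lines are easier to compare with the configured sizes once converted to GiB. A small sketch follows (the helper `to_gib` is hypothetical). Note that 1073741824 bytes is exactly 1 GiB, which appears to line up with the ~1GB pool limit seen after doubling the off-heap size, though the issue does not confirm that the two are the same allocation.

```python
def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 1024**3


# Values copied from the TaskMemoryManager log lines above.
unattributed_bytes = 1073741824  # "not associated with specific consumers"
execution_bytes = 5677121232     # "used for execution"

print(f"unattributed: {to_gib(unattributed_bytes):.2f} GiB")
print(f"execution:    {to_gib(execution_bytes):.2f} GiB")
```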

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
