
HashJoinStream memory tracking insufficient #7848

Closed
crepererum opened this issue Oct 17, 2023 · 8 comments · Fixed by #8020
Labels
bug Something isn't working

Comments

@crepererum
Contributor

Describe the bug

Using a hash join may lead to OOM kills / very large memory consumption even when a memory limit is set.

To Reproduce

No reproduction steps yet.

We have a flame graph from a prod environment though:

(flame graph image)

Expected behavior

The memory manager reports "out of memory" and the query fails.

Additional context

I think the code only tracks HashJoinStream::visited_left_side, but the memory used by build_equal_condition_join_indices is completely untracked.
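To illustrate the fix being suggested, here is a minimal sketch of reservation-style memory accounting, loosely modeled after DataFusion's `MemoryReservation::try_grow` but using a simplified, hypothetical `Reservation` type (not the actual DataFusion API). The point is that once an allocation is registered against the pool, exceeding the limit surfaces as a query error instead of an OOM kill:

```rust
/// Simplified, hypothetical stand-in for DataFusion's memory reservation:
/// tracks bytes used against a fixed limit and refuses to overcommit.
struct Reservation {
    limit: usize,
    used: usize,
}

impl Reservation {
    fn new(limit: usize) -> Self {
        Reservation { limit, used: 0 }
    }

    /// Try to reserve `bytes` more; return an error instead of
    /// silently exceeding the configured limit.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        if self.used + bytes > self.limit {
            Err(format!(
                "out of memory: requested {} bytes, {} of {} already used",
                bytes, self.used, self.limit
            ))
        } else {
            self.used += bytes;
            Ok(())
        }
    }
}

fn main() {
    let mut r = Reservation::new(1024);
    // Tracking a small structure (e.g. the visited-left-side bitmap) succeeds...
    assert!(r.try_grow(128).is_ok());
    // ...but a large working batch, once it is actually accounted for,
    // fails the reservation rather than exhausting process memory.
    assert!(r.try_grow(4096).is_err());
    println!("reservation enforced the limit");
}
```

Under this model, the bug report amounts to: the working batch produced during matching never calls anything like `try_grow`, so the limit is never enforced for it.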

@crepererum crepererum added the bug Something isn't working label Oct 17, 2023
@alamb
Contributor

alamb commented Oct 17, 2023

cc @korowa who I think did the initial work for Join memory tracking in case they have ideas.

@korowa
Contributor

korowa commented Oct 18, 2023

Can't remember anything "leaky" except for the build-side data and the visited build-side indices 🤔 -- I'll check it out and try to submit a PR.

@korowa
Contributor

korowa commented Oct 18, 2023

I was able to reproduce it with a cross-join-like case (all records on both the build and probe sides have the same key, and the build side is small enough to fit within the memory limit) -- it's true, the "working" batch is not tracked at all. I guess the most proper solution would be to produce partially joined batches (in the same fashion as MergeJoin), rather than accumulating the output in memory until the memory limit is violated.

Any thoughts on this option?

UPD: but still, it's worth having memory management that is as precise as possible -- I'll take a stab at additional memory tracking for the working batch, if there are no objections.
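The reproduction and the proposed fix above can be sketched as follows. This is a hypothetical illustration, not DataFusion code: when every probe row matches every build row, the output has build_rows × probe_rows pairs even though both inputs are small, so the sketch emits matched index pairs in fixed-size chunks, bounding the working set by the chunk size rather than by the total output size:

```rust
/// Hypothetical sketch of incremental emission: join two sides where
/// every row matches (the worst case korowa describes), yielding
/// matched (build_idx, probe_idx) pairs in chunks of at most `batch_size`
/// instead of one giant accumulated batch.
fn join_in_chunks(
    build_rows: usize,
    probe_rows: usize,
    batch_size: usize,
) -> Vec<Vec<(usize, usize)>> {
    let mut chunks = Vec::new();
    let mut current = Vec::with_capacity(batch_size);
    for p in 0..probe_rows {
        for b in 0..build_rows {
            current.push((b, p));
            if current.len() == batch_size {
                // Emit a bounded chunk; a real operator would hand this
                // downstream as a RecordBatch before continuing.
                chunks.push(std::mem::take(&mut current));
            }
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    // 1_000 build rows x 1_000 probe rows = 1_000_000 output pairs,
    // even though each input easily fits in a small memory limit.
    let chunks = join_in_chunks(1_000, 1_000, 8_192);
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    assert_eq!(total, 1_000_000);
    // Peak per-chunk size is bounded by batch_size, not by total output.
    assert!(chunks.iter().all(|c| c.len() <= 8_192));
    println!("emitted {} pairs in {} bounded chunks", total, chunks.len());
}
```

This is why accumulating the output before emitting anything can blow past the limit: the output size is quadratic in the matching rows, while each input individually passes the memory check.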

@alamb
Contributor

alamb commented Oct 18, 2023

@korowa I think adding support for the working batch sounds like a good idea, as long as it doesn't get too complex.

Thank you for looking into this

@alamb
Contributor

alamb commented Oct 30, 2023

@korowa is there any update on this? If not, I may try to work on it this week.

@korowa
Contributor

korowa commented Oct 30, 2023

Unfortunately, I started only a day ago on the first option (emitting batches from the hash join) -- so I hope there will be at least a draft during this week. But, of course, feel free to pick up this issue!

@alamb
Contributor

alamb commented Oct 30, 2023

Thanks @korowa -- I'll let you know

@alamb
Contributor

alamb commented Nov 13, 2023

Related ticket: #8130, which I believe is blocking this code.
