Skip to content

DRILL-4411: hash join should limit batch based on size and number of records#381

Closed
minji-kim wants to merge 1 commit intoapache:masterfrom
minji-kim:DRILL-4411
Closed

DRILL-4411: hash join should limit batch based on size and number of records#381
minji-kim wants to merge 1 commit intoapache:masterfrom
minji-kim:DRILL-4411

Conversation

@minji-kim
Copy link
Contributor

Right now, hash joins can run out of memory if records are large since the batch is limited only by size (of 4000). This patch implements a simple heuristic. If the allocator for the outputs become larger than 10 MB before outputing 4000 records (say 2000), then set the batch size limit to 2000 for the future batches.

private HashJoinBatch outgoingJoinBatch = null;

private static final int TARGET_RECORDS_PER_BATCH = 4000;
private int targetRecordsPerBatch = 4000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment here about when this value will be mutated as we are moving it away from immutability, especially since most of the other operators currently have this as an immutable value.

@minji-kim
Copy link
Contributor Author

I made it such that it doesn't adjust the batch size once, but keep the minimum batch size (in terms of number of records) to be at least 1.

}

container.clear();
outputAllocator.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outputAllocator can be null, right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it could be if we close before calling next()/buildSchema(). I will add a check for null.

…ber of records

Make batch size (in bytes and records) adjustments more than once

Minor change: check for null outputAllocator before closing
@ilooner
Copy link
Contributor

ilooner commented Jun 1, 2018

@minji-kim Thanks for putting the effort into making this change. Unfortunately a sophisticated version of this was recently completed in this PR #1227 and will be merged soon. Since the objectives of the two changes were the same I will close this PR. Sorry that your change did not receive any attention earlier.

@ilooner ilooner closed this Jun 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants