Commit
add comment
trueeyu committed Sep 8, 2021
1 parent 73dd93f commit 86c906d
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions be/src/exec/vectorized/hash_join_node.cpp
@@ -350,6 +350,13 @@ bool HashJoinNode::_has_null(const ColumnPtr& column) {
Status HashJoinNode::_build(RuntimeState* state) {
    {
        SCOPED_TIMER(_build_conjunct_evaluate_timer);
        // Currently, for simplicity, HashJoinNode uses a BigChunk: the Chunks
        // produced by the scan of the right table are spliced into one big Chunk.
        // In some scenarios, such as when the left and right tables are chosen incorrectly
        // or when a large table is joined, a BinaryColumn in the Chunk can exceed the range
        // of uint32_t, which causes wrong data to be output.
        // Until a better solution is available, guard against this case here.
        // Once a better solution is available, the BigChunk mechanism can be removed.
        if (_ht.get_build_chunk()->exceed_capacity_limit()) {
            return Status::InternalError("Total size of single column exceed the limit of hash join");
        }
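
To illustrate why the guard above is needed, here is a minimal sketch of the failure mode the comment describes. It assumes a string column that addresses its bytes through uint32_t offsets. `ToyBinaryColumn`, `append_len`, and `check_build_column` are hypothetical names made up for this illustration, and the `exceed_capacity_limit()` shown here is only an analogue of the real method, whose implementation is not part of this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for a binary/string column: row i would span
// bytes [offsets[i], offsets[i + 1]) of a single shared buffer.
struct ToyBinaryColumn {
    std::vector<uint32_t> offsets{0};
    uint64_t total_bytes = 0;  // tracked in 64 bits so the check itself cannot wrap

    // Record a row of `len` bytes (the payload is omitted to keep the sketch cheap).
    void append_len(uint64_t len) {
        total_bytes += len;
        // Once total_bytes passes 4 GiB this cast wraps around, so the
        // stored offset no longer points at the real end of the row.
        offsets.push_back(static_cast<uint32_t>(total_bytes));
    }

    // Analogue of the check in the diff: true once the uint32_t offsets
    // can no longer address every byte in the column.
    bool exceed_capacity_limit() const {
        return total_bytes > std::numeric_limits<uint32_t>::max();
    }
};

// Returns an empty string on success, or an error message.
// A plain string stands in for the Status type used in the real code.
std::string check_build_column(const ToyBinaryColumn& col) {
    if (col.exceed_capacity_limit()) {
        return "Total size of single column exceed the limit of hash join";
    }
    return "";
}
```

In this sketch, appending two rows of 3,000,000,000 bytes each leaves the second offset smaller than the first, because the running total wraps around the uint32_t range. Reading rows through such offsets would return wrong data, so failing the build step early is the safer behavior.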