Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6763] Optimize logic to select the tiles with the fewest rows #4125

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

xiaochen-zhou
Copy link
Contributor

Our current logic using a PriorityQueue to select the tile with the fewest rows when multiple tiles are available. However, there are several potential issues:

  1. The PriorityQueue stores all satisfiable tiles. When the number of tiles is large, maintaining the heap structure during element insertion has a time complexity of O(log n), which also increases memory usage.
  2. The initial size of the PriorityQueue is difficult to estimate and is currently set to 1, causing frequent resizing of the PriorityQueue.

We can optimize the code by keeping only a single bestCandidate tile to improve performance.

Copy link
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no tests - is there any code calling this function?
If yes, then it's perhaps fine - the behavior is not supposed to change.

@xiaochen-zhou
Copy link
Contributor Author

xiaochen-zhou commented Jan 7, 2025

There are no tests - is there any code calling this function? If yes, then it's perhaps fine - the behavior is not supposed to change.

RelOptLattice#getAggregate() calling this function.

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Jan 7, 2025
@mihaibudiu
Copy link
Contributor

you can probably squash the commits

@xiaochen-zhou
Copy link
Contributor Author

you can probably squash the commits

Done.

Copy link

sonarqubecloud bot commented Jan 8, 2025

@mihaibudiu
Copy link
Contributor

There are some comments in Jira from @julianhyde, so I will wait to merge this.

@xiaochen-zhou
Copy link
Contributor Author

There are some comments in Jira from @julianhyde, so I will wait to merge this.

OK.

@mihaibudiu
Copy link
Contributor

What is the conclusion from the JIRA discussion? Can we merge this?

@xiaochen-zhou
Copy link
Contributor Author

What is the conclusion from the JIRA discussion? Can we merge this?

I recently attempted to add a benchmark by referring to CALCITE-4994 and CALCITE-3827.

@mihaibudiu
Copy link
Contributor

So should we wait for benchmark results?

@xiaochen-zhou
Copy link
Contributor Author

So should we wait for benchmark results?

Yes.

}
}
}
if (!queue.isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we have a

if (bestCandidate != null)
  return bestCandidate;

at this place in the patch to maintain the previous logic?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants