-
Couldn't load subscription status.
- Fork 1.7k
Consolidate batch_size configuration in ExecutionConfig, RuntimeConfig and PhysicalPlanConfig
#1562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… to `FileScanConfig` to make it clearer
| ), | ||
| PartitionMethod::RoundRobin(batch_size) => { | ||
| Partitioning::RoundRobinBatch(batch_size as usize) | ||
| PartitionMethod::RoundRobin(partition_count) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batch_size here is wrong, it's partition count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice find
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh god, good find, 4096 partitions is not going to be fun.
|
|
||
| for batch_size in test_batch_sizes.iter() { | ||
| let rr_repartition = Partitioning::RoundRobinBatch(*batch_size); | ||
| for partition_count in test_partition_counts.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| let runtime = Arc::new(RuntimeEnv::default()); | ||
| let projection = Some(vec![0, 1, 2, 3]); | ||
| let exec = get_exec("aggregate_test_100.csv", &projection, 1024, Some(1)).await?; | ||
| let exec = get_exec("aggregate_test_100.csv", &projection, Some(1)).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this PR, most of the magic number 1024 is removed from the project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea of putting batch_size on the RuntimeConfig is 🏅 👍
| parquet_pruning: bool, | ||
| /// Runtime configurations such as memory threshold and local disk for spill | ||
| pub runtime_config: RuntimeConfig, | ||
| pub runtime: RuntimeConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to runtime since I think the word config is redundant. Consider the usage exec_config.runtime_config.batch_size for example. Everything in exec_config is config.
|
Thanks @yjshen -- I'll check this out over the weekend.
I found #1526 (comment) -- sorry |
ExecutionConfig, RuntimeConfig and PhysicalPlanConfigbatch_size configuration in ExecutionConfig, RuntimeConfig and PhysicalPlanConfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ), | ||
| PartitionMethod::RoundRobin(batch_size) => { | ||
| Partitioning::RoundRobinBatch(batch_size as usize) | ||
| PartitionMethod::RoundRobin(partition_count) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice find
| let runtime = Arc::new(RuntimeEnv::default()); | ||
| let projection = Some(vec![0, 1, 2, 3]); | ||
| let exec = get_exec("aggregate_test_100.csv", &projection, 1024, Some(1)).await?; | ||
| let exec = get_exec("aggregate_test_100.csv", &projection, Some(1)).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea of putting batch_size on the RuntimeConfig is 🏅 👍
|
Thanks @alamb , the main reason I'm not moving I was trying to consolidate more of the configs from the three |
No, |
Which issue does this PR close?
Closes #1565 .
Rationale for this change
batch_sizeis only execution runtime related, should consolidate all intoRuntimeConfig.PhysicalPlanConfigtoFileScanConfigto make it clear.What changes are included in this PR?
Are there any user-facing changes?