|
Current state is a very crude first draft written by Claude AI |
|
@alamb I finally found some time to get through what the bot produced. I think this is now in a good enough shape for a first review. |
Thank you -- I will try to review it over the next few days |
|
After reading the excellent consecutive repartitioning post I think there might be some more polishing work to do on this one. |
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
alamb
left a comment
First of all, thank you @pepijnve -- this is great and a really strong piece.
I left a bunch of polish suggestions, any/all of which I am more than happy to help implement.
Another optimization that I think would fit well into this blog would be the optimization for constant tables @rluvaton added in apache/datafusion#18183 (will be released in DataFusion 52)
|
With the impending release of DataFusion 52.0.0, I am hoping we can publish the blog early next week (Jan 12, 13) so that we can then refer to it in the DataFusion 52 release blog. I think this one is looking pretty good; though I had been dreaming about more diagrams, I don't think they are required. Do you think this is ready to go from your perspective, @pepijnve? Would you mind if I took a pass through to clean up some formatting (like the title)?
|
|
I've been meaning to add those diagrams and do another round of editing. Not much progress due to the holiday break and other priorities. |
Me too! No worries. I'll check back in a few days -- anything I can do to help? |
|
@alamb I've gone through the article entirely again. Would be good to get a fresh pair of eyes to review it again; I've been looking at this thing for too long. |
|
@rluvaton I've added a section on your hash table work in here as well. Any feedback on that would also be welcome. Let me know if you would like me to add something to the acknowledgments section for this part (see https://datafusion.apache.org/blog/2025/12/15/avoid-consecutive-repartitions/#acknowledgements for an example). |
alamb
left a comment
Minor copy editing for CASE blog
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
|
@alamb all comments processed. TYVM for the editorial work. After staring at the same paragraph for too long I start to miss these kinds of details. |
|
I think this blog is great. I'll maybe do a call for some more review and then we can plan to publish it later this week or early next week |
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
|
I am thinking we should plan to publish this next Monday, Feb 2. Is that ok with you @pepijnve ? |
All good for me |
|
I updated the blog date to today, 2026-02-02 |
|
And the blog is live: https://datafusion.apache.org/blog/2026/02/02/datafusion_case/ |
|
🎉 @alamb in the meantime I got some feedback from our tech writer that I still need to process. Are we still ok with making edits to the blog post, or is it set in stone? I don't think there are any substantial content changes, just some small stylistic things. |
I think updating the blog with stylistic changes is a great idea -- just make a new PR and we can merge it / update the post |
|
Logged in #142 |

Covers the work done as part of apache/datafusion#18075