Blog post about CASE optimization #122

Merged
alamb merged 31 commits into apache:main from pepijnve:case
Feb 2, 2026

@pepijnve
Contributor

Covers the work done as part of apache/datafusion#18075

@pepijnve
Contributor Author

Current state is a very crude first draft written by Claude AI

@pepijnve pepijnve marked this pull request as ready for review December 20, 2025 11:10
@pepijnve
Contributor Author

@alamb I finally found some time to get through what the bot produced. I think this is now in a good enough shape for a first review.

@alamb
Contributor

alamb commented Dec 20, 2025

> @alamb I finally found some time to get through what the bot produced. I think this is now in a good enough shape for a first review.

Thank you -- I will try and review it over the next few days

@pepijnve
Contributor Author

After reading the excellent consecutive repartitioning post I think there might be some more polishing work to do on this one.

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Contributor

@alamb alamb left a comment

First of all, thank you @pepijnve -- this is great and a really strong piece.

I left a bunch of polish suggestions, any/all of which I am more than happy to help implement.

Another optimization that I think would fit well into this blog would be the optimization for constant tables @rluvaton added in apache/datafusion#18183 (will be released in DataFusion 52)

@alamb alamb changed the title Add blog post regarding CASE work Blog post about CASE optimization Jan 10, 2026
@alamb
Contributor

alamb commented Jan 10, 2026

With the impending release of DataFusion 52.0.0, I am hoping we can publish the blog early next week (Jan 12, 13) so that we can then refer to it in the DataFusion 52 release blog.

I think this one is looking pretty good. I had been dreaming about more diagrams, but I don't think they are required.

Do you think this is ready to go from your perspective @pepijnve? Would you mind if I took a pass through to clean up some formatting (like the title)?

[Image: Screenshot 2026-01-10 at 2 57 44 PM]

@pepijnve
Contributor Author

I've been meaning to add those diagrams and do another round of editing. Not much progress due to the holiday break and other priorities.

@alamb
Contributor

alamb commented Jan 11, 2026

> I've been meaning to add those diagrams and do another round of editing. Not much progress due to the holiday break and other priorities.

Me too! No worries. I'll check back in a few days -- anything I can do to help?

@pepijnve
Contributor Author

@alamb I've gone through the article entirely again. Would be good to get a fresh pair of eyes to review it again; I've been looking at this thing for too long.

@pepijnve pepijnve requested a review from alamb January 26, 2026 12:46
@pepijnve
Contributor Author

@rluvaton I've added a section on your hash table work in here as well. Any feedback on that would also be welcome. Let me know if you would like me to add something to the acknowledgments section for this part (see https://datafusion.apache.org/blog/2025/12/15/avoid-consecutive-repartitions/#acknowledgements for an example).

Contributor

@alamb alamb left a comment

Looks great to me @pepijnve

I have some small style suggestions, and some minor copy edits for your review as well

Overall I think this is really great

Also, thanks @rluvaton for both the contribution to the code and the writeup

pepijnve and others added 9 commits January 27, 2026 15:47
Minor copy editing for CASE blog
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@pepijnve
Contributor Author

@alamb all comments processed. TYVM for the editorial work. After staring at the same paragraph for too long I start to miss these kinds of details.

@alamb
Contributor

alamb commented Jan 27, 2026

I think this blog is great. I'll maybe do a call for some more review and then we can plan to publish it later this week or early next week

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@alamb
Contributor

alamb commented Jan 30, 2026

I am thinking we should plan to publish this next Monday, Feb 2. Is that ok with you @pepijnve ?

@pepijnve
Contributor Author

> Is that ok with you @pepijnve ?

All good for me

@alamb
Contributor

alamb commented Feb 2, 2026

I updated the blog date to today, 2026-02-02

@alamb alamb merged commit ce4dfc5 into apache:main Feb 2, 2026
1 check passed
@alamb
Contributor

alamb commented Feb 2, 2026

And the blog is live: https://datafusion.apache.org/blog/2026/02/02/datafusion_case/

Thank you @pepijnve and @rluvaton ✍️ ❤️

@pepijnve
Contributor Author

pepijnve commented Feb 2, 2026

🎉

@alamb in the meantime I got some feedback from our tech writer. Still need to process those. Are we ok with making edits to the blog post still or is it set in stone? I don't think there are any substantial content changes, just some small stylistic things.

@alamb
Contributor

alamb commented Feb 2, 2026

> 🎉
>
> @alamb in the meantime I got some feedback from our tech writer. Still need to process those. Are we ok with making edits to the blog post still or is it set in stone? I don't think there are any substantial content changes, just some small stylistic things.

I think updating the blog with stylistic changes is a great idea -- just make a new PR and we can merge it / update the post

@pepijnve
Contributor Author

pepijnve commented Feb 2, 2026

Logged in #142
