Add blogpost on flox heuristics #695

Draft
wants to merge 25 commits into
base: main
Update
dcherian committed Aug 1, 2024
commit 1c19f7fd486b67ffe120f73e799bcb07a4e068a4
3 changes: 1 addition & 2 deletions src/posts/flox-smart/index.md
@@ -150,8 +150,7 @@ Given the above `C`, flox will choose `"cohorts"` for chunk sizes (1, 2, 3, 4, 6

Cool, isn't it?!

-Importantly this inference is fast — [400ms for the US county](https://flox.readthedocs.io/en/latest/implementation.html#example-spatial-grouping) GroupBy problem in our [previous post](https://xarray.dev/blog/flox) where the group labels are a 2GB 2D array!
-But we have not tried with bigger problems (example: GroupBy(100,000 watersheds) in the US).
+Importantly this inference is fast — [~250ms for the US county](https://flox.readthedocs.io/en/latest/implementation.html#example-spatial-grouping) GroupBy problem in our [previous post](https://xarray.dev/blog/flox) where approximately 3000 groups are distributed over 2500 chunks; and ~1.25s for grouping by US watersheds ~87000 groups across 640 chunks.

## What's next?

Your conclusion is a bit rambly, and I think some of the content (especially call-outs to any other groupby nerds who can help) could be in a footnote.

I think you should end with a succinct one-line conclusion that emphasises that most of this complexity can now be safely ignored by users.

