restore topk pre-filtering of batches and make sort query fuzzer less sensitive to expected non determinism #16501

alamb · 2025-06-22T13:06:38Z

Which issue does this PR close?

Closes SortQueryFuzzer found a failing case on main #16452

Rationale for this change

@AdamGS removed some of the code in Temporarily fix bug in dynamic top-k optimization #16465.
However, the test kept failing.
@adriangb fixed what we think is the real issue in re-enable sort_query_fuzzer_runner #16491.

What changes are included in this PR?

Let's restore the removed code in Temporarily fix bug in dynamic top-k optimization #16465

Are these changes tested?

Are there any user-facing changes?

…)" This reverts commit 5ca4ff0.

adriangb · 2025-06-22T13:42:01Z

We have the test in https://github.com/apache/datafusion/pull/16465/files#diff-f38cac7a9ac55c93d71632c96d6d2afa219cfb07351125a349099c86df859446 which seems to be passing. I'm running a local 1200 run iteration to confirm.

adriangb · 2025-06-22T13:51:12Z

Sadly the 1200 run still reports failures 😭

I feel like @AdamGS's original intuition that it's something about sort stability with nulls is correct. I'll see if I can find a fix...

alamb · 2025-06-22T14:03:15Z

Thanks @adriangb !

adriangb · 2025-06-22T17:17:12Z

So from my investigation what I think is happening is that #15770 fundamentally converted the TopK operation from being isolated per partition to having shared state via the dynamic filter. This causes some non-determinism with test runs since partitions can interact. I think this doesn't cause actual issues with queries, but the tests are picking it up. But I'm not 100% sure about that. @Dandandan and I were already talking about having a shared TopK heap between partitions, I think that would resolve the issue. But otherwise more investigation is needed.

FWIW the TopK dynamic filters still work without this code - it's just using the filter to filter rows in the TopK operator itself that doesn't work.

This is all I had time for today. I think more work is needed before we can merge this PR in the current state.

alamb · 2025-06-23T15:00:31Z

This causes some non-determinism with test runs since partitions can interact. I think this doesn't cause actual issues with queries, but the tests are picking it up.

This sounds like we need to update the tests to be deterministic somehow (or ignore results that are not deterministic).

AdamGS · 2025-06-23T15:24:42Z

Would love to give a hand with that, I have some thoughts I can try and put into a preliminary PR.
It also seems like DataFusion is going to have more of this shared state that's sensitive to how events interleave, and it might be worth making a larger effort to enable (more) deterministic simulation.

adriangb · 2025-06-23T15:31:30Z

Thank you @AdamGS! It would be super helpful if we could first determine if the test is being overly sensitive to non-determinism (query results are exactly the same across runs and correct but test still fails) or if the issue is actually reflecting incorrect query results or non-deterministic query results (e.g. the query is correct according to the sort order but the actual order of rows is different across runs).

datafusion/physical-plan/src/topk/mod.rs

adriangb · 2025-06-27T17:21:34Z

Thank you @AdamGS! It would be super helpful if we could first determine if the test is being overly sensitive to non-determinism (query results are exactly the same across runs and correct but test still fails) or if the issue is actually reflecting incorrect query results or non-deterministic query results (e.g. the query is correct according to the sort order but the actual order of rows is different across runs).

Wondering if you've had a chance to look into this? I am trying to decide how much time I should allocate to this next week and make sure I don't overlap your work. Thanks!

AdamGS · 2025-06-27T17:39:02Z

Had a pretty busy week and this is mostly extra-curricular work for me, so I didn't get anything done. I'm planning to spend some time over the weekend digging into it. If I get anywhere worthwhile I'll send it your way Sunday evening/Monday morning?

adriangb · 2025-06-27T17:43:49Z

Amazing great!

adriangb · 2025-06-29T15:20:46Z

I've pushed a series of commits that simplify the test down to a trivial example:

u64	u32
1	1
2	1

SELECT * FROM t ORDER BY u32 LIMIT 1

Now it's clear what is happening: previously each partition had ORDER BY u32 LIMIT 1 applied internally and then they got combined globally, so results were always deterministic.
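For illustration, the isolated per-partition behavior described above can be modeled roughly as follows. This is a simplified sketch and not DataFusion's actual code: `Row`, `partition_top_k`, and `isolated_top_k` are hypothetical names, and the model assumes that ties in the global step are resolved in fixed partition order.

```rust
/// A row of the two-column example table: (u64, u32).
type Row = (u64, u32);

/// Apply `ORDER BY u32 LIMIT k` to a single partition in isolation.
fn partition_top_k(partition: &[Row], k: usize) -> Vec<Row> {
    let mut rows = partition.to_vec();
    // `sort_by_key` is a stable sort, so tied rows keep their input order.
    rows.sort_by_key(|&(_, key)| key);
    rows.truncate(k);
    rows
}

/// Combine the per-partition results and apply the limit again globally.
/// Partitions are always visited in index order (an assumption of this
/// model), so ties are resolved the same way on every run.
fn isolated_top_k(partitions: &[Vec<Row>], k: usize) -> Vec<Row> {
    let merged: Vec<Row> = partitions
        .iter()
        .flat_map(|p| partition_top_k(p, k))
        .collect();
    partition_top_k(&merged, k)
}
```

With the two example rows placed in separate partitions, `isolated_top_k(&partitions, 1)` returns the row from the first partition every time, because no partition can influence what another partition keeps.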

So nothing to do with nulls, nulls just happened to make it much more likely that the sort values would be the same when using random data.

With the addition of pre-filtering rows in the TopK operator, the filter is updated across partitions, so as soon as one partition sees the value for u32 and has filled up its limit, the other partitions will drop their data on the floor. But which partition sees the value first and "wins" is non-deterministic, so which value for u64 comes out is also non-deterministic. This isn't necessarily a bad thing; I believe many other database systems work like this (probably for similar reasons), but we do need to do something about it because we can't have failing CI. Even if we had a shared TopK heap I think we'd have the same issue, because which row ends up in the TopK heap would still be non-deterministic.
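A rough model of this race, for readers who want to see it concretely: the sketch below replaces real thread scheduling with an explicit `arrival_order` argument. Everything here (`shared_filter_top_1`, the `threshold` variable, the strict `<` comparison) is an assumption made for illustration and is not taken from the TopK implementation.

```rust
/// A row of the two-column example table: (u64, u32).
type Row = (u64, u32);

/// Simulate `ORDER BY u32 LIMIT 1` where all partitions share one threshold.
/// `arrival_order` lists partition indices in the order their rows are
/// processed; it stands in for thread scheduling in this model.
fn shared_filter_top_1(partitions: &[Vec<Row>], arrival_order: &[usize]) -> Option<Row> {
    // State shared by every partition: the best sort key seen so far.
    let mut threshold: Option<u32> = None;
    let mut winner: Option<Row> = None;
    for &idx in arrival_order {
        for &(value, key) in &partitions[idx] {
            // Pre-filter: only rows strictly better than the threshold pass.
            let passes = match threshold {
                None => true,
                Some(t) => key < t,
            };
            if passes {
                threshold = Some(key);
                winner = Some((value, key));
            }
            // Rows that tie with the threshold are dropped here.
        }
    }
    winner
}
```

Running the model on the two example rows with `arrival_order` set to `[0, 1]` and then `[1, 0]` yields different u64 values but the same u32 value, which matches the observation that both answers are valid results for the query.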

Options I can see:

Modify the fuzzer to account for the fact that this non-determinism is okay.
Make the filters be updated per-partition, which probably loses some performance (I think not too much?).

Dandandan · 2025-06-29T16:41:21Z

Amazing, you found it!

Modify the fuzzer to account for the fact that this non-determinism is okay.

I think this is the way to go because the engine is correct in this case (sort doesn't care about partition/scan order, should just output top n rows in database).

adriangb · 2025-06-29T17:09:36Z

Sounds good, but how do we go about that? It's a pretty big change from "check results match row by row" to "check results are semantically correct".

Dandandan · 2025-06-29T17:37:44Z

I think a relatively simple solution might be to check only that the columns from the generated sort expressions have equal values (i.e. allow unstable sorting).
FYI: 2010YOUY01 what do you think?
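As a sketch of what comparing only the sort-expression columns could look like, assuming the same two-column example table with u32 as the only sort column (the function names are hypothetical and not part of the fuzzer):

```rust
/// A row of the two-column example table: (u64, u32).
type Row = (u64, u32);

/// Project each row onto the sort column (`u32` in the example).
fn sort_keys(rows: &[Row]) -> Vec<u32> {
    rows.iter().map(|&(_, key)| key).collect()
}

/// Two results are treated as matching when their sort-key sequences are
/// equal, regardless of the values in the non-sort columns.
fn results_match_on_sort_keys(a: &[Row], b: &[Row]) -> bool {
    sort_keys(a) == sort_keys(b)
}
```

Under this comparison the results `(1, 1)` and `(2, 1)` match, while `(1, 1)` and `(1, 2)` do not.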

alamb · 2025-07-01T14:15:41Z

Modify the fuzzer to account for the fact that this non-determinism is okay.

I agree this is the right way to go

adriangb · 2025-07-01T15:54:14Z

Since there seems to be agreement on the path forward I pushed 514ab74 which I think achieves the goal by simply changing SELECT * to SELECT <same columns we're doing ordering by>. Then we can continue to assert that the batches are equal, etc. I considered a more complex system where we keep track of the ordering columns and use those in assertions but it would require a more extensive refactor. I do think if there was an easy way to verify that the output data was correctly ordered (e.g. implementing a naive hand crafted sort that is inefficient but easy to verify for correctness) that would be nice, but it seems orthogonal to this PR.
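The "naive hand crafted sort" idea mentioned above might look something like the following. It is only a sketch under the assumptions of the two-column example (a single ascending u32 sort key, no nulls), and `is_valid_top_k` is a hypothetical helper, not something that exists in the fuzzer.

```rust
/// A row of the two-column example table: (u64, u32).
type Row = (u64, u32);

/// Check that `output` is an acceptable answer to
/// `SELECT * FROM t ORDER BY u32 LIMIT k` over `input`.
fn is_valid_top_k(input: &[Row], output: &[Row], k: usize) -> bool {
    // Reference answer on the sort column: fully sort, then keep the first k.
    let mut expected_keys: Vec<u32> = input.iter().map(|&(_, key)| key).collect();
    expected_keys.sort();
    expected_keys.truncate(k);

    // Because `expected_keys` is sorted, equality also proves that the
    // output is correctly ordered and has the right length.
    let output_keys: Vec<u32> = output.iter().map(|&(_, key)| key).collect();

    // Every output row must occur in the input at least as often as it
    // occurs in the output (quadratic, but easy to verify by eye).
    let rows_exist = output.iter().all(|row| {
        let in_output = output.iter().filter(|r| *r == row).count();
        let in_input = input.iter().filter(|r| *r == row).count();
        in_output <= in_input
    });

    output_keys == expected_keys && rows_exist
}
```

A check like this accepts either `(1, 1)` or `(2, 1)` for the minimal repro, and rejects rows with the wrong sort key or rows that never appeared in the input.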

adriangb · 2025-07-01T15:56:07Z

@Dandandan @alamb @AdamGS can you review and verify you agree with the proposed change to the tests? Would love to merge this and close this chapter in preparation for the next release 😄

adriangb · 2025-07-01T16:24:00Z

Since we seem to have reached a clear repro and conclusion, and the best way to know whether the fuzzer is still flaky is to run it a lot, I went ahead and merged this so it gets shaken out in CI before the next release.

alamb · 2025-07-01T20:41:31Z

Amazing -- thank you @adriangb and @Dandandan

alamb added 2 commits June 22, 2025 09:03

Revert "Temporarily fix bug in dynamic top-k optimization (apache#16465…

9e28c17

…)" This reverts commit 5ca4ff0.

restore

02b6fad

github-actions bot added the physical-plan Changes to the physical-plan crate label Jun 22, 2025

alamb changed the title ~~Alamb/revert fix~~ Restore topk filtering tests Jun 22, 2025

This was referenced Jun 22, 2025

Temporarily fix bug in dynamic top-k optimization #16465

Merged

SortQueryFuzzer found a failing case on main #16452

Closed

alamb commented Jun 26, 2025

datafusion/physical-plan/src/topk/mod.rs

adriangb mentioned this pull request Jun 27, 2025

Only update TopK dynamic filters if the new ones are more selective #16433

Open

adriangb added 4 commits June 29, 2025 09:03

tweak test runner to allow more partitions

0dda78b

simplify test

b990214

simplify test

cd758b1

simplify test

694f455

github-actions bot added the core Core DataFusion crate label Jun 29, 2025

adriangb added 3 commits June 29, 2025 09:35

fmt

b9e1508

simplify test even more

9760194

Revert test runner changes

da2aefb

remove NULLS FIRST

cbf5d09

alamb mentioned this pull request Jul 1, 2025

Release DataFusion 49.0.0 (July 2025) #16235

Open

34 tasks

only pull out selected columns

514ab74

adriangb marked this pull request as ready for review July 1, 2025 15:52

fmt

423716a

adriangb requested a review from Dandandan July 1, 2025 15:55

Dandandan approved these changes Jul 1, 2025

adriangb changed the title ~~Restore topk filtering tests~~ restore topk pre-filtering of batches and make sort query fuzzer less sensitive to expected non determinism Jul 1, 2025

adriangb merged commit 9bb309c into apache:main Jul 1, 2025
28 checks passed
