Skip to content

Conversation

@viirya
Copy link
Member

@viirya viirya commented Sep 12, 2023

Which issue does this PR close?

Closes #7537,
Closes #7523

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Sep 12, 2023
);

let result = collect(sort_exec.clone(), task_ctx).await?;
let result = collect(sort_exec.clone(), task_ctx.clone()).await?;
Copy link
Member Author

@viirya viirya Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

task_ctx is simply moved into task_ctx. Before the stream future is resolved, it could be dropped early which makes DiskManager dropped too. That is why the spill file existed before proceeding into read_spill_as_stream but was deleted intermittently before reading.

@viirya
Copy link
Member Author

viirya commented Sep 12, 2023

cc @alamb

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viirya -- I agree this solves the problem for this test. Thank you 🙏

However, it does seem like a real problem that the ExternalSorter can have its temporary files deleted out from under it. I filed #7546 to track that.

See also #7538 (comment) for more discussion

@alamb alamb changed the title Fix flaky test_sort_fetch_memory_calculation Fix flaky test_sort_fetch_memory_calculation test Sep 13, 2023
@alamb alamb merged commit 4fac0e1 into apache:main Sep 13, 2023
@viirya
Copy link
Member Author

viirya commented Sep 13, 2023

Thanks @alamb.

Yea, I noticed the issue. I think for DataFusion itself usage, e.g., through DataFrame API, the TaskContext is kept in the session so normally the temporary dirs won't be deleted too early.

Only problematic cases are when you construct physical query plan (like we did internally) like this test. I think most users are not in this cases. So I didn't put a fix immediately for this but propose a quick fix like this.

And yes, I saw #7538 was proposed last night. I will take a look too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reading spilled file rarely fails Intermittent Windows CI failure in "physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation"

2 participants