Reading spilled file rarely fails #7537

Closed
sarutak opened this issue Sep 13, 2023 · 2 comments · Fixed by #7538 or #7534
Labels
bug Something isn't working

Comments

@sarutak
Member

sarutak commented Sep 13, 2023

Describe the bug

I noticed that the test sort::tests::test_sort_fetch_memory_calculation fails intermittently (roughly once in a hundred runs).

[2023-09-13T05:07:29Z ERROR datafusion::physical_plan::sorts::sort] Failure while reading spill file: NamedTempFile("/tmp/.tmp9wH6tP/.tmptarqTV"). Error: IO error: No such file or directory (os error 2)
test physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation ... FAILED

failures:

---- physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation stdout ----
Error: Execution("Spawned Task error: IO error: No such file or directory (os error 2)")


failures:
    physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 1702 filtered out; finished in 1.04s

This looks like a race condition: the spill file no longer exists by the time the spawned task tries to read it.

To Reproduce

You can reproduce this by running the test in a loop until it fails:

for i in {1..500}; do
  if ! cargo test sort::tests::test_sort_fetch_memory &> /tmp/err.out; then
    cat /tmp/err.out; break;
  fi
done

Alternatively, insert a sleep into sort.rs as follows to make the failure much more likely:

diff --git a/datafusion/core/src/physical_plan/sorts/sort.rs b/datafusion/core/src/physical_plan/sorts/sort.rs
index 82badb7d8..3feedfd71 100644
--- a/datafusion/core/src/physical_plan/sorts/sort.rs
+++ b/datafusion/core/src/physical_plan/sorts/sort.rs
@@ -616,6 +616,7 @@ fn read_spill_as_stream(
     let sender = builder.tx();
 
     builder.spawn_blocking(move || {
+        std::thread::sleep(std::time::Duration::from_secs(1));
         let result = read_spill(sender, path.path());
         if let Err(e) = &result {
             error!("Failure while reading spill file: {:?}. Error: {}", path, e);
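The effect of the inserted sleep can be illustrated outside DataFusion. The following is a minimal sketch using only the standard library, under the assumption (not confirmed in this issue) that the temporary directory holding the spill file is removed while the blocking reader has not yet opened the file. `ScratchDir` and `read_spill_delayed` are hypothetical names invented for this illustration; they are not DataFusion or `tempfile` APIs.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a temporary spill directory:
/// the directory and its contents are deleted when the value is dropped.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    fn new(name: &str) -> io::Result<Self> {
        // The process id keeps concurrent runs from sharing a directory.
        let path = std::env::temp_dir().join(format!("{}-{}", name, std::process::id()));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Writes a file into a scratch directory, then reads it from a background
/// thread after `delay` (mirroring the sleep added in the diff above).
/// With `drop_early == true` the directory is dropped before the reader
/// wakes up, which is the assumed cause of the failure.
fn read_spill_delayed(name: &str, drop_early: bool, delay: Duration) -> io::Result<Vec<u8>> {
    let dir = ScratchDir::new(name)?;
    let file = dir.path.join("spill.bin");
    fs::write(&file, b"batch")?;

    let reader = thread::spawn(move || {
        thread::sleep(delay);
        fs::read(&file)
    });

    if drop_early {
        // Owner goes away first: the reader finds nothing on disk.
        drop(dir);
        reader.join().expect("reader thread panicked")
    } else {
        // Owner outlives the reader: the read succeeds.
        let result = reader.join().expect("reader thread panicked");
        drop(dir);
        result
    }
}
```

Calling `read_spill_delayed("demo", true, Duration::from_millis(200))` is expected to return a "not found" I/O error on Unix-like systems, while passing `false` returns the file contents. If the real bug follows this pattern, keeping the owner of the temporary directory alive until the reader finishes would avoid the error.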

Expected behavior

No response

Additional context

No response

@sarutak added the bug label Sep 13, 2023
@viirya
Member

viirya commented Sep 13, 2023

Duplicate of #7523

@sarutak
Member Author

sarutak commented Sep 13, 2023

@viirya Thank you for letting me know this is a duplicate.
