Description
Describe the bug
I noticed that the test sort::tests::test_sort_fetch_memory_calculation
fails intermittently (roughly once in a hundred runs).
[2023-09-13T05:07:29Z ERROR datafusion::physical_plan::sorts::sort] Failure while reading spill file: NamedTempFile("/tmp/.tmp9wH6tP/.tmptarqTV"). Error: IO error: No such file or directory (os error 2)
test physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation ... FAILED
failures:
---- physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation stdout ----
Error: Execution("Spawned Task error: IO error: No such file or directory (os error 2)")
failures:
physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 1702 filtered out; finished in 1.04s
This looks like a race condition: the spill file no longer exists by the time the spawned task tries to read it.
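To make the suspected failure mode concrete, here is a minimal standalone sketch using only the Rust standard library. It is not DataFusion code and does not claim to be the confirmed root cause. It only shows how a reader running on another thread can hit "No such file or directory" when the directory holding its file is cleaned up before the read starts. The names make_spill_file and read_spill_with_cleanup are hypothetical, and a channel stands in for the delayed start of the blocking task so that the ordering is deterministic.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: creates a unique scratch directory with one "spill" file in it.
/// Returns (directory, file path).
fn make_spill_file(tag: &str) -> (PathBuf, PathBuf) {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    let dir = std::env::temp_dir().join(format!(
        "spill-race-{}-{}-{}",
        tag,
        std::process::id(),
        nanos
    ));
    fs::create_dir_all(&dir).unwrap();
    let file = dir.join("spill.bin");
    fs::write(&file, b"sorted batch bytes").unwrap();
    (dir, file)
}

/// Reads the spill file on a separate thread and returns the I/O error kind
/// the reader observed, or None if the read succeeded.
/// If `cleanup_first` is true, the directory is removed before the reader runs.
fn read_spill_with_cleanup(cleanup_first: bool) -> Option<ErrorKind> {
    let (dir, file) = make_spill_file(if cleanup_first { "early" } else { "late" });
    let (go_tx, go_rx) = mpsc::channel::<()>();

    let reader = thread::spawn(move || {
        // Block until signalled; this models a task that starts late.
        go_rx.recv().unwrap();
        fs::read(&file).map(|_| ()).map_err(|e| e.kind())
    });

    if cleanup_first {
        // The owner of the directory goes away before the reader starts.
        fs::remove_dir_all(&dir).unwrap();
        go_tx.send(()).unwrap();
        reader.join().unwrap().err()
    } else {
        // The reader finishes first, then the directory is cleaned up.
        go_tx.send(()).unwrap();
        let outcome = reader.join().unwrap().err();
        fs::remove_dir_all(&dir).unwrap();
        outcome
    }
}

fn main() {
    println!("cleanup before read: {:?}", read_spill_with_cleanup(true));
    println!("cleanup after read:  {:?}", read_spill_with_cleanup(false));
}
```

In this sketch the read fails with ErrorKind::NotFound only when cleanup happens first. If the real bug has the same shape, the fix would be to keep whatever owns the temporary directory alive until the reading task has finished, but that is an assumption to verify against sort.rs.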
To Reproduce
You can reproduce this by running the test in a loop until it fails, using this script.
for i in {1..500}; do
  if ! cargo test sort::tests::test_sort_fetch_memory &> /tmp/err.out; then
    cat /tmp/err.out; break;
  fi
done
Alternatively, insert a sleep into sort.rs
as follows.
diff --git a/datafusion/core/src/physical_plan/sorts/sort.rs b/datafusion/core/src/physical_plan/sorts/sort.rs
index 82badb7d8..3feedfd71 100644
--- a/datafusion/core/src/physical_plan/sorts/sort.rs
+++ b/datafusion/core/src/physical_plan/sorts/sort.rs
@@ -616,6 +616,7 @@ fn read_spill_as_stream(
     let sender = builder.tx();
     builder.spawn_blocking(move || {
+        std::thread::sleep(std::time::Duration::from_secs(1));
         let result = read_spill(sender, path.path());
         if let Err(e) = &result {
             error!("Failure while reading spill file: {:?}. Error: {}", path, e);
Expected behavior
No response
Additional context
No response