-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a race condition issue on reading spilled file #7538
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @sarutak . Very excellent debugging ❤️
I think this PR will result in DataFusion leaving many temporary directories around, which is likely not ideal for many users.
I am not sure if you have seen the fix for the test from @viirya in #7534
I have another solution but it breaks the API compatibility of DiskManager::create_tmp_file, which returns a pair of Arc and NamedTempFile to ensure TempDir lives long enough.
I think this is the solution that we should pursue as it is the easiest to use long term. Perhaps we can change the API to return something like
/// A named temporary file and the directory it lives in, which may be cleaned up on drop
struct RefCountedTempFile {
/// directory in which temporary files are created
temp_dir: Arc<TempDir>,
/// the temporary file
tempfile: NamedTempFile,
}
impl RefCountedTempFile {
pub fn path(&self) -> &Path {..}
pub fn inner(&self) -> &NamedTempFile {...}
...
}
impl DiskManager {
pub fn create_tmp_file(
&self,
request_description: &str
) -> Result<RefCountedTempFile, DataFusionError>
...
}
Another potential solution I thought of was to pass a reference to the DiskManager to read_spill_as_stream
to prevent the file from being deleted. However, that seems like a hard API to use and get right as all callers would have to know temp files were only scoped to the lifetime of the DiskManager
/// If `None` an error will be returned (configured not to spill) | ||
local_dirs: Mutex<Option<Vec<TempDir>>>, | ||
local_dirs: Mutex<Option<Vec<PathBuf>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in understanding that the implication here is that the directories in local_dirs
will not be removed (though all the contents will be removed, when the NamedTempFile
are dropped)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This solution uses directories specified, or returned by std::env::temp_dir()
, which is usually /tmp
on Linux.
So, these directories are not deleted when the NamedTempFile
are dropped.
Thank you for the comment @alamb .
I've seen that PR but I don't think it's just an issue of the test.
This is what I implied. If we can change the API, I'll propose the second solution.
Yes, I also thought of this solution. But I thought it's too much. |
Suggested next actions:
|
@alamb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
Ok(()) | ||
}) | ||
/// A named temporary file and the directory it lives in, which may be clean up on drop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// A named temporary file and the directory it lives in, which may be clean up on drop | |
/// A wrapper around a [`NamedTemporaryFile`] that also contains a reference | |
/// to the temporary directory it is in. The file is cleaned up on drop. |
/// A named temporary file and the directory it lives in, which may be clean up on drop | ||
#[derive(Debug)] | ||
pub struct RefCountedTempFile { | ||
/// directory in which temporary files are created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// directory in which temporary files are created | |
/// directory in which temporary files are created (Arc is held to ensure | |
/// it is not cleaned up prior to the NamedTempFile) |
} | ||
} | ||
|
||
/// A wrapper around a [`NamedTemporaryFile`] that also contains a reference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's NamedTempFile
, not NamedTemporaryFile
. I'll fix it.
Yea, as I commented in #7534 and @alamb's description in #7546, for DataFusion users this is not likely an issue except for the cases like the test that you construct physical query plan (like we did internally also) and execute it. So my idea was to propose a simplest fix to the test. As this doesn't look like a urgent bug to fix, I was thinking to find some time later to propose another fix to avoid |
} | ||
} | ||
|
||
/// A wrapper around a [`NamedTempFile`] that also contains a reference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// A wrapper around a [`NamedTempFile`] that also contains a reference | |
/// A wrapper around a [`NamedTempFile`] that also contains a reference to its parent temporary directory |
/// directory in which temporary files are created (Arc is held to ensure | ||
/// it is not cleaned up prior to the NamedTempFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// directory in which temporary files are created (Arc is held to ensure | |
/// it is not cleaned up prior to the NamedTempFile) | |
/// The reference to the directory in which temporary files are created to ensure | |
/// it is not cleaned up prior to the NamedTempFile |
Yeah, thank you for recovering CI quickly! |
Thanks again @sarutak |
Which issue does this PR close?
Closes #7537,
Closes #7523
Closes #7546
Rationale for this change
This issue seems a potential race condition issue.
To improve the stability, this issue need to be fixed.
What changes are included in this PR?
This issue happens when the parent directory of spilled files are deleted before being read.
The root cause is
TempDir
of the parent directory can be dropped if this async block exits before the reading task starts like as follows.sorter
and it's members are dropped includingDiskManager::local_dirs
.TempFile
inlocal_dirs
are dropped then the corresponding temporary directories are deleted recursively.To avoid breaking API compatibility, this change proposes to not useTempDir
and let the DiskManager directly creates temporary spilled file inlocal_dirs
.UPDATE: After the discussion, I prefer the second solution below.
I have another solution but it breaks the API compatibility of
DiskManager::create_tmp_file
, which returns a pair ofArc<TempDir>
andNamedTempFile
to ensureTempDir
lives long enough.Spilled files are represented asNamedTempFile
and temporary files are deleted whenNamedTempFile::drop
is called.So, I think not using
TempDir
is not a problem.Are these changes tested?
Added new test.
Are there any user-facing changes?
No.