-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/fs: Add RemoveAll function #261
Conversation
I still want to determine if long path support is needed here. However the code should be reasonable to review as it exists already. |
Looks like some overzealous linter errors. May need to tune those down. |
222373c
to
c8e6070
Compare
New RemoveAll function intended to be a replacement for os.RemoveAll usage and be tailored to Windows scenarios. Most importantly, this function works with []uint16 instead of string. This allows it to properly delete Windows filesystem trees that contain files with invalid UTF-16 paths. Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
// [1] This inaccurate version is still usable, as the name and alternate name fields are one character short, | ||
// but the original value is guaranteed to be null terminated, so you just lose the null terminator in the worst case. | ||
// However, this is still annoying to deal with, as you either must find the null in the array, or treat the full array | ||
// as a null terminated string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this discussion right above findData
, since it explains why we arent using windows.Win32finddata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted it here as a reference for why we define our own syscalls. I can look at what I could put in the other spot as well though.
// However, we don't attempt to handle every dynamic case, for instance: | ||
// - If a file has FILE_ATTRIBUTE_READONLY re-set on it after we clear it, we will fail. | ||
// - If an entry is changed from file to directory after we query its attributes, we will fail. | ||
func RemoveAll(path []uint16) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add an alias or new type UTF16Str
for []uint16, so in the future we can do our own validation and add helper functions (like filepath join and co) to work on windows path strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered adding a type but wasn't sure how much utility it would have, or what a good name would be. UTF16Str
is exactly what we don't want since the whole point of this function is it can work with a name that is invalid UTF-16. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think more about what a good name would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe OSString
, to steal Rust's thunder?
} | ||
|
||
func join(parent, child []uint16) []uint16 { | ||
return append(append(parent, '\\'), child...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would cause duplicate \
's if parent already ends with one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me think about if there's a path where we would expect to need to handle that here. Maybe just the case where the user passes in a path that ends in a separator.
|
||
var ( | ||
// Mockable for testing. | ||
removeDirectory = windows.RemoveDirectory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never liked the pattern, and it feels un-go-like.
Is there any way around needing it?
The Go os repo has this testlog
call which logs what is being opened/accessed.
We can add something similar where we use a buffer/list or items being deleted, and then search through that to make sure what we want is deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any way to avoid it here, and don't think the cost is too high. We don't just need to log operations done, but actually need to inject a failure case into a Windows API call (look at TestRemoveAllShouldFailWhenSymlinkDeletionFails
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but rather than allowing the removeDirectory
function to be modified, we can do something like
type testHook func(p []uint16)
var removeDirectoryTestHook testHook
func removeDirectory(path []uint16) error {
if removeDirectoryTestHook != nil {
removeDirectoryTestHook(path)
}
windows.RemoveDirectory
}
And then TestRemoveAllShouldFailWhenSymlinkDeletionFails
becomes
var linkDeleteAttempted bool
removeDirectoryTestHook = func(path *uint16) {
if _, name := filepath.Split(windows.UTF16PtrToString(path)); !linkDeleteAttempted && name == "link" {
linkDeleteAttempted = true
}
}
if err := RemoveAll(rootU16); err != nil {
// ...
}
if linkDeleteAttempted {
t.Fatal("link deleted")
}
if err := func() error { | ||
var fd findData | ||
// findFirstFileExW allows us to avoid the shortname lookup. See comment at top of file for details. | ||
pattern := join(path, []uint16{'*'}) | ||
find, err := findFirstFileExW(terminate(pattern), findExInfoBasic, &fd, findExSearchNameMatch, nil, 0) | ||
if err == windows.ERROR_FILE_NOT_FOUND { | ||
// No children is weird, because we should always have "." and "..". | ||
// If we do hit this, just continue to the next deletion attempt. | ||
return nil | ||
} else if err != nil { | ||
return pathErr("FindFirstFileEx", pattern, err) | ||
} | ||
defer windows.FindClose(find) | ||
for { | ||
var child []uint16 | ||
child, err = truncAtNull(fd.name[:]) | ||
if err != nil { | ||
return err | ||
} | ||
if !equal(child, []uint16{'.'}) && !equal(child, []uint16{'.', '.'}) { // Ignore "." and ".." | ||
if err := removeAll(join(path, child), fd.attributes, fd.reserved0); err != nil { | ||
return err | ||
} | ||
} | ||
err = findNextFileW(find, &fd) | ||
if err == windows.ERROR_NO_MORE_FILES { | ||
break | ||
} else if err != nil { | ||
return pathErr("FindNextFile", path, err) | ||
} | ||
} | ||
return nil | ||
}(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be moved out to a separate function?
Ideally we can wrap findFirstFileExW
and findNextFileW
into a walkFiles(f func(findData) error)) error
, since that allows the defer to still work, and I am sure we'll need it again in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure here. I think it can be easier to grok the overall flow when it's kept together in a single function, and didn't want to introduce more abstractions like a walk function when the existing code seems fairly straightforward. However the complexity did grow over time, so maybe it would make sense.
For the record, Go 1.21 will obviate the need for this this by using WTF-8 to carry filenames on Windows, ensuring round-trip-safe |
Yeah, I think it probably makes sense to just wait for go1.21 at this point. Glad that change made it in. :) |
New RemoveAll function intended to be a replacement for os.RemoveAll usage and be tailored to Windows scenarios.
Most importantly, this function works with []uint16 instead of string. This allows it to properly delete Windows filesystem trees that contain files with invalid UTF-16 paths.
Signed-off-by: Kevin Parsons kevpar@microsoft.com