-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
occasional file i/o "access is denied" errors during pcap ingest in windows CI with brim #613
Comments
Happened here https://github.com/brimsec/brim/runs/637035303?check_suite_focus=true . Confirmed with screen shot. The zq hash is e7b866d which is an experimental branch. No changes were made to zq itself on this branch. The parent hash from master is e40934a |
thanks for pointer @mikesbrown - I did pull the archive for that test run. The logs have very little data to help with the issue, I may need to add some to help triage. |
https://github.com/brimsec/brim/actions/runs/95611919 Looks like this can now manifest with a user-facing message like this:
In
This was zq hash 335d284. One thing I noticed is that |
I believe this is because the rename operation occurs while a search is active on the space. At the filesystem level, it means the rename ( The current "best" way to get Posix like semantics for opens not interfering with renames is to essentially copy all of the code from stdlib involved in file opens, with the added ability to specify FILE_SHARE_DELETE. You can see an example of that in moby/moby (basis of Docker): Alternatively, we could protect against this above the filesystem, with reader/writer protection against overwriting the all.zng file. @mattnibs is currently working on revamping the storage interface to spaces, which will centralize access to the file, where we could implement the protection scheme. However, since this will likely not be the last time we expect Windows file access to have these semantics, I lean towards taking the copy-all-os-Open approach, and making sure that all file opens in zq go through that code going forward. |
Address the occasional rename related failures on Windows by using the FILE_SHARE_DELETE flag when opening files on Windows. As noted in #613, this unfortunately requires copying substantial code from the os package, as there's no way to use that flag with the existing os.Open api.
We've seen another instance of this after #741 was committed. |
Just an update: I've been trying to reproduce this with a brim/zq branch with more logging information, and I can't get to repro on that test branch. Since the new mingw Zeek artifact for windows, it also seems like the frequency of occurrence has gone down, but not decreased. Here's a recent example of it: |
A few enhancements in our fs package: add a utility function to use when atomically replacing the contents for a file. This is based on a function NewAtomicFileWriter in the https://github.com/moby/moby repo; I'll add an acknowledgements entry for the (apache licensed) code if this goes in. add a fs version of 'ioutil.ReadFile' that uses fs.Open to get the rename semantics we want on Windows. This also updates the filestore code to use these & other calls from package fs. There is a possibility that this resolves #613 , though that's been impossible to repro on demand. Part of brimdata/zui#883 .
Just FYI: this bug repro'd very sporadically, so even though I think #925 addresses it, it's hard to be sure. |
…y alfred-landrum This is an auto-generated commit with a zq dependency update. The zq PR brimdata/super#925, authored by @alfred-landrum, has been merged. add and use a file replacement utility function A few enhancements in our fs package: * add a utility function to use when atomically replacing the contents for a file. This is based on a function `NewAtomicFileWriter` in the https://github.com/moby/moby repo; I'll add an acknowledgements entry for the (apache licensed) code if this goes in. * add a fs version of 'ioutil.ReadFile' that uses fs.Open to get the rename semantics we want on Windows. This also updates the filestore code to use these & other calls from package fs. There is a possibility that this resolves brimdata/super#613 , though that's been impossible to repro on demand. Part of #883 .
Several tests for Brim / zq integration involve uploading a pcap through Brim and then doing something with it. In one run, there were i/o errors that prevented ingest from succeeding.
One was
The other was
The bug is filed here as the error is happening in
zqd
and being propagated to Brim.The file being ingested is sample.pcap from brim.git.
The logs of the run are available here.
brim was at brimdata/zui@1890b07 . zq was at tag v0.11.1. Note that windows integration testing is quite new; we don't have many past data points. But so far this is the only known instance.
The text was updated successfully, but these errors were encountered: