feat: add explicit durability flush after unmount (BLA-3202) #60
Add blockdev --flushbufs operation on guest side after unmounting the sticky
disk to ensure data durability before Ceph RBD snapshots are taken.
Changes:
- Add getDeviceFromMount() to extract device path from mount point
- Add flushBlockDevice() that runs blockdev --flushbufs with stats logging
- Log I/O stats from /sys/block/{device}/stat before and after flush
- Add ENABLE_DURABILITY_FLUSH env var for feature flag (defaults to enabled)
- Handle errors gracefully - log warnings but don't fail the cleanup flow
Co-Authored-By: maru@blacksmith.sh <adityamaru@gmail.com>
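The before/after stats logging described in the commit message could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `parseBlockStat` and `diffBlockStats` are hypothetical names, and only the first 11 columns of `/sys/block/{device}/stat` are interpreted (newer kernels append more).

```typescript
// Hypothetical helpers; names and shapes are assumptions, not taken from the PR.
// First 11 columns of /sys/block/<dev>/stat, in kernel-documented order.
const STAT_FIELDS = [
  "readIos", "readMerges", "readSectors", "readTicks",
  "writeIos", "writeMerges", "writeSectors", "writeTicks",
  "inFlight", "ioTicks", "timeInQueue",
] as const;

type StatField = (typeof STAT_FIELDS)[number];
type BlockStats = Record<StatField, number>;

// Parse the single whitespace-separated line read from the stat file.
// Returns null when the content is too short or not numeric.
function parseBlockStat(raw: string): BlockStats | null {
  const cols = raw.trim().split(/\s+/).map(Number);
  if (cols.length < STAT_FIELDS.length || cols.some(Number.isNaN)) return null;
  const stats = {} as BlockStats;
  STAT_FIELDS.forEach((name, i) => {
    stats[name] = cols[i] ?? 0;
  });
  return stats;
}

// Field-wise difference (after minus before). Most fields are cumulative
// counters; inFlight is a gauge, so its delta can be negative.
function diffBlockStats(before: BlockStats, after: BlockStats): BlockStats {
  const delta = {} as BlockStats;
  for (const name of STAT_FIELDS) delta[name] = after[name] - before[name];
  return delta;
}
```

Reading the file once before and once after the flush and logging the delta would show, for example, how many write I/Os completed during the flush window.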
@devin fix the verify build output CI check
The verify-build CI check is failing due to non-deterministic source map generation by ncc. Options to fix this:
Would you like me to implement one of these solutions?
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Changed all flush failure cases from core.info to core.warning
- Added warning for stderr output even when exit code is 0
- This makes it easier to identify when the flush command fails
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
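The log-level policy in this commit (every failure is a warning, and stderr output is a warning even on exit code 0) might look something like the sketch below. The `FlushOutcome` shape, the `describeFlushOutcome` name, and the message strings are all assumptions for illustration; in the action each entry would presumably be routed to core.info or core.warning.

```typescript
// Hypothetical types and messages; not the PR's actual code.
type FlushOutcome = { exitCode: number | null; stderr: string; timedOut: boolean };
type LogLine = { level: "info" | "warning"; message: string };

// Map the result of running the flush command to log lines.
function describeFlushOutcome(o: FlushOutcome): LogLine[] {
  const stderr = o.stderr.trim();
  if (o.timedOut) {
    return [{ level: "warning", message: "blockdev --flushbufs timed out" }];
  }
  if (o.exitCode !== 0) {
    const detail = stderr ? `: ${stderr}` : "";
    return [{
      level: "warning",
      message: `blockdev --flushbufs failed with exit code ${o.exitCode}${detail}`,
    }];
  }
  const lines: LogLine[] = [{ level: "info", message: "blockdev --flushbufs completed" }];
  // A zero exit code with stderr output is still surfaced as a warning.
  if (stderr) {
    lines.push({ level: "warning", message: `blockdev --flushbufs wrote to stderr: ${stderr}` });
  }
  return lines;
}
```

Keeping the classification separate from the command execution makes the policy easy to unit-test without running a privileged command.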
Logs SOURCE, FSTYPE, and OPTIONS from findmnt (or full mount line) to help debug mount-related issues.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
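As a rough sketch of the debug logging above: assuming the action captures the output of something like `findmnt -n -o SOURCE,FSTYPE,OPTIONS <mountpoint>` (the exact invocation is an assumption), the three columns could be split out and formatted as below, with the raw mount line as the fallback. `parseFindmntLine` and `formatMountInfo` are hypothetical names.

```typescript
// Hypothetical helpers; the exact findmnt invocation in the PR is not shown,
// so the three-column input format here is an assumption.
type MountInfo = { source: string; fstype: string; options: string };

// Split one headerless findmnt line into SOURCE, FSTYPE, and OPTIONS.
function parseFindmntLine(line: string): MountInfo | null {
  const parts = line.trim().split(/\s+/);
  if (parts.length < 3) return null;
  const [source = "", fstype = "", ...rest] = parts;
  return { source, fstype, options: rest.join(" ") };
}

// Build the debug message, falling back to the full mount line when the
// findmnt output could not be parsed.
function formatMountInfo(info: MountInfo | null, rawMountLine: string): string {
  return info
    ? `SOURCE=${info.source} FSTYPE=${info.fstype} OPTIONS=${info.options}`
    : `mount line: ${rawMountLine}`;
}
```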
Summary
Adds a blockdev --flushbufs operation on the guest side after unmounting the sticky disk to ensure data durability before Ceph RBD snapshots are taken. This is the client-side portion of BLA-3202 for the setup-docker-builder action.
Changes:
- Add getDeviceFromMount() to extract device path from mount point (uses findmnt with fallback to parsing mount output)
- Add flushBlockDevice() that runs sudo blockdev --flushbufs with a 30-second timeout
- Log I/O stats from /sys/block/{device}/stat before and after flush for observability
- Add ENABLE_DURABILITY_FLUSH env var feature flag (defaults to enabled, set to "false" to disable)
Review & Testing Checklist for Human
- The getDeviceFromMount function uses findmnt -n -o SOURCE with fallback to regex parsing of mount output. Confirm this works correctly in the actual Blacksmith VM environment (device format may vary, e.g., /dev/rbd0 vs /dev/vdb).
- Check that guest flush duration: log entries appear in workflow output with before/after stats.
- Set ENABLE_DURABILITY_FLUSH=false to confirm flush is skipped.
Recommended test plan:
- Check the workflow logs for guest flush duration: entries
- Run with the ENABLE_DURABILITY_FLUSH=false env var to confirm the feature flag works
Notes
This is part of BLA-3202 which spans multiple repositories. Companion PRs exist for:
Link to Devin run: https://app.devin.ai/sessions/611301f918674712b016558ddc2fba0e
Requested by: @adityamaru
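For reviewers checking device detection, the fallback path (regex parsing of mount output) and the feature flag could be approximated as below. This is a sketch under assumptions: `deviceFromMountOutput` and `isDurabilityFlushEnabled` are hypothetical names, the regex targets the usual Linux `SOURCE on TARGET type FSTYPE (OPTIONS)` line format, and the flag comparison assumes the exact lowercase string "false".

```typescript
// Hypothetical helpers; not the PR's actual getDeviceFromMount().
// Find the source device for a mount point in `mount` output. When several
// entries share the mount point, the last (top-most) one is returned.
function deviceFromMountOutput(mountOutput: string, mountPoint: string): string | null {
  let found: string | null = null;
  for (const line of mountOutput.split("\n")) {
    const m = /^(\S+) on (.+) type \S+ \(.*\)$/.exec(line.trim());
    if (m && m[2] === mountPoint) found = m[1] ?? null;
  }
  return found;
}

// The flag defaults to enabled; only the exact string "false" disables it.
function isDurabilityFlushEnabled(env: Record<string, string | undefined>): boolean {
  return env["ENABLE_DURABILITY_FLUSH"] !== "false";
}
```

In the action, the second function would presumably be called with process.env, and the first with the captured stdout of mount when findmnt is unavailable or fails.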
Note
Medium Risk
Touches post-action cleanup around unmounting and adds privileged blockdev execution; while best-effort and time-bounded, mis-detected devices or unexpected environments could cause skipped flushes or noisy warnings.
Overview
Improves sticky-disk durability by determining the underlying device for /var/lib/buildkit during post-cleanup and issuing a best-effort sudo blockdev --flushbufs immediately after umount (while the device remains mapped).
Adds getDeviceFromMount() (prefers findmnt, falls back to parsing mount) and flushBlockDevice() with a short timeout plus before/after /sys/block/.../stat logging for observability; flush failures/timeouts are logged as warnings and do not fail the action.
Written by Cursor Bugbot for commit 576eaa6. This will update automatically on new commits.
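The mis-detected-device risk noted above partly comes down to how a device path is mapped to its sysfs stat file. A conservative sketch, assuming whole-disk paths directly under /dev such as /dev/vdb or /dev/rbd0 (`sysfsStatPath` is a hypothetical name, and partitions or device-mapper paths are deliberately not handled):

```typescript
// Hypothetical helper; not the PR's actual code. Only plain /dev/<name>
// paths are accepted. Anything else (for example /dev/mapper/...) yields
// null so the caller can skip stats logging instead of guessing.
function sysfsStatPath(devicePath: string): string | null {
  const m = /^\/dev\/([A-Za-z0-9_-]+)$/.exec(devicePath);
  return m ? `/sys/block/${m[1]}/stat` : null;
}
```

Returning null rather than throwing fits the best-effort behavior described in the overview: a device that cannot be mapped only costs the observability logs, not the cleanup flow.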