-
Notifications
You must be signed in to change notification settings - Fork 255
btrfs-progs: new --inode-flags option #987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
adam900710
wants to merge
11
commits into
kdave:devel
Choose a base branch
from
adam900710:mkfs_nocow
base: devel
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[BUG] When a device replace failed, e.g. try to replace a device on a RO mounted btrfs, the error message is incorrectly broken into two lines: [adam@btrfs-vm ~]$ sudo btrfs replace start -fB 1 /dev/test/scratch3 /mnt/btrfs/ Performing full device TRIM /dev/mapper/test-scratch3 (10.00GiB) ... ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/btrfs/": Read-only file system [adam@btrfs-vm ~]$ Note the newline after the "Read-only file system" error message. [CAUSE] Inside cmd_replace_start(), if the ioctl failed we need to handle the error messages different depeneding on start_args.result. If the result is not BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_RESULT we will append extra info to the error message. But the initial error message is using error(), which implies a newline. This results the above incorrectly splitted error message. [FIX] Instead of manually appending an extra reason to the existing error message, just do different output depending on the start_args.result in the first place. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com>
It's a long known problem that direct IO can lead to data checksum mismatches if the user space is also modifying its buffer during writeback. Although it's fixed in the recent v6.15 release (and backported to older kernels), we still need a user friendly way to fix those problems. This patch introduce the dryrun version of "btrfs rescue fix-data-checksum", which reports the logical bytenr and corrupted mirrors. Signed-off-by: Qu Wenruo <wqu@suse.com>
[BUG] There is a seldomly utilized function, btrfs_find_item(), which has no document and is not behaving correctly. Inside backref.c, iterate_inode_refs() and btrfs_ref_to_path() both utilize this function, to find the parent inode. However btrfs_find_item() will never return 0 if @ioff is passed as 0 for such usage, result early failure for all kinds of inode iteration functions. [CAUSE] Both functions pass 0 as the @ioff parameter initially, e.g: We have the following fs tree root: item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 3 transid 9 size 6 nbytes 16384 block group 0 mode 40755 links 1 uid 0 gid 0 rdev 0 sequence 1 flags 0x0(none) item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12 index 0 namelen 2 name: .. item 2 key (256 DIR_ITEM 2507850652) itemoff 16078 itemsize 33 location key (257 INODE_ITEM 0) type FILE transid 9 data_len 0 name_len 3 name: foo item 3 key (256 DIR_INDEX 2) itemoff 16045 itemsize 33 location key (257 INODE_ITEM 0) type FILE transid 9 data_len 0 name_len 3 name: foo item 4 key (257 INODE_ITEM 0) itemoff 15885 itemsize 160 generation 9 transid 9 size 16384 nbytes 16384 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 4 flags 0x0(none) item 5 key (257 INODE_REF 256) itemoff 15872 itemsize 13 index 2 namelen 3 name: foo item 6 key (257 EXTENT_DATA 0) itemoff 15819 itemsize 53 generation 9 type 1 (regular) extent data disk byte 13631488 nr 16384 extent data offset 0 nr 16384 ram 16384 extent compression 0 (none) Then we call paths_from_inode() with: - @Inum = 257 - ipath = {.fs_root = 5} Then we got the following sequence: iterate_irefs(257, fs_root, inode_to_path) |- iterate_inode_refs() |- inode_ref_info() |- btrfs_find_item(257, 0, fs_root) | Returned 1, with @found_key updated to | (257, INODE_REF, 256). This makes iterate_irefs() exit immediately, but obviously that btrfs_find_item() call is to find any INODE_REF, not to find the exact match. [FIX] If btrfs_find_item() found an item matching the objectid and type, then it should return 0 other than 1. Fix it and keep the behavior the same across btrfs-progs and the kernel. Since we're here, also add some comments explaining the function. Signed-off-by: Qu Wenruo <wqu@suse.com>
Previously "btrfs rescue fix-data-checksum" only show affected logical bytenr, which is not helpful to determine which files are affected. Enhance the output by also outputting the affected subvolumes (in numeric form), and the file paths inside that subvolume. The example looks like this: logical=13631488 corrtuped mirrors=1 affected files: (subvolume 5)/foo (subvolume 5)/dir/bar logical=13635584 corrtuped mirrors=1 affected files: (subvolume 5)/foo (subvolume 5)/dir/bar Although the end result is still not perfect, it's still much easier to find out which files are affected. Signed-off-by: Qu Wenruo <wqu@suse.com>
This mode will ask user for how to fix each block. User input can match the first letter or the whole action name to specify given action, the input is verified case insensitive. If no user input is provided, the default action is to ignore the corrupted block. If the input matches no action, a warning is outputted and user must retry until a valid input is provided. Signed-off-by: Qu Wenruo <wqu@suse.com>
This adds a new group of action in the interactive mode to fix a csum mismatch. The output looks like this: logical=13631488 corrtuped mirrors=1 affected files: (subvolume 5)/foo (subvolume 5)/dir/bar <<I>>gnore/<1>:1 Csum item for logical 13631488 updated using data from mirror 1 In the interactive mode, the update-csum-item behavior is outputted as all available mirror numbers. Considering all the existing (and future) action should starts with an alphabet, it's pretty easy to distinguish mirror number from other actions. The update-csum-item action itself is pretty straight-forward, just read out the data from specified mirror, then calculate a new checksum, and update the corresponding csum item in csum tree. Signed-off-by: Qu Wenruo <wqu@suse.com>
This option allows "btrfs rescue fix-data-checksum" to use the specified mirror number to update checksum item for all corrupted mirrors. If the specified number is larger than the max number of mirrors, the real mirror number will be `num % (num_mirrors + 1)`. Signed-off-by: Qu Wenruo <wqu@suse.com>
Inside btrfs-progs (mostly 'mkfs/rootdir.c') new inodes are created in a different way than the kernel: - A new orphan inode item is created Without knowing the parent inode, thus it will always have the default flags (0). - Link the new inode to a parent Meanwhile inside the kernel, we know exactly the parent inode at new inode creation time, and can inherit the inode flags (NODATACOW/NODATASUM/COMPRESS/etc). Address the missing ability by: - Inherit the parent inode flags when linking an orphan inode The function btrfs_add_link() is called when linking an inode. It can be called to creating the initial link if it's a new and orphan inode. It can also be called to creating extra hard links. If the inode is already orphan, we know it's newly created and should inherit the inode flag from the parent. With this new ability, it will be much easier to implement new per-inode flags (like NODATACOW/NODATASUM) and get them properly passed down. Signed-off-by: Qu Wenruo <wqu@suse.com>
NODATACOW or NODATASUM Currently mkfs.btrfs --rootdir is implying data checksum, but soon we will support per-inode NODATACOW|NODATASUM flags. To support per-inode NODATACOW|NODATASUM flags: - Avoid compression if the inode has either NODATACOW|NODATASUM flag - Do not generate data checksum if the inode has either NODATACOW|NODATASUM flag. Both behaviors are the following the kernel ones. Signed-off-by: Qu Wenruo <wqu@suse.com>
This new option allows end users to specify certain per-inode flags for specified file/directory inside rootdir. And mkfs will follow the kernel behavior by inheriting the inode flag from the parent. For example: rootdir |- file1 |- file2 |- dir1/ | |- file3 |- subv/ << will be created as a subvolume using --subvol option |- dir2/ | |- file4 |- file5 When `mkfs.btrfs --rootdir rootdir --subvol subv --inode-flags nodatacow:dir1 --inode-flags nodatacow:subv", then the following files and directory will have *nodatacow* flag set: - dir1 - file3 - subv - dir2 - file4 - file5 For now only two flags are supported: - nodatacow Disable data COW, implies *nodatasum* for regular files - nodatasum Disable data checksum only. This also works with --compress option, and files with nodatasum or nodatacow flag will skip compression. Signed-off-by: Qu Wenruo <wqu@suse.com>
The simple test will create a layout like the following: rootdir |- file1 |- file2 |- subv/ << Regular subvolume | |- file3 |- nocow_subv/ << NODATACOW subvolume | |- file4 |- nocow_dir/ << NODATACOW directory | |- dir2 | | |- file5 | |- file6 |- nocow_file1 << NODATACOW file Any files under NODATACOW subvolume/directory should also be NODATACOW. The explicitly specified single file should also be NODATACOW. Signed-off-by: Qu Wenruo <wqu@suse.com>
@@ -1686,6 +1764,7 @@ | |||
|
|||
g_trans = trans; | |||
g_subvols = subvols; | |||
g_inode_flags_list = inode_flags_list; |
Check warning
Code scanning / CodeQL
Local variable address stored in non-local memory Warning
A stack address which arrived via a may be assigned to a non-local variable.
parameter
Error loading related location
Loading
This was referenced May 23, 2025
ef43ce6
to
3eff852
Compare
Something missing is the help entry: OPTLINE("--inode-flags FLAGS:PATH", "specify that path to have inode flags, other than the default one, can be specified multiple times"),
OPTLINE("", "FLAGS is one of:"),
OPTLINE("", "- nodatacow - disable data CoW, implies nodatasum for regular files"),
OPTLINE("", "- nodatasum - disable data checksum only"), |
Thanks a lot, a new fix is sent. And if David is fine with the fix, it can be folded into the original patch (which is still only in devel branch). |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The new --inode-flags option allows us to specify certain btrfs specific
flags to each inode.
Currently we only support nodatacow and nodatasum.
But in the future compression flag can also be added, allowing more
accurate per-file compression.
Furthermore child inodes will inherit the flag from their parents,
meaning one only needs to specify the flag to the parent directory, then
all children files/directories will have the flag.
This new option also works well with --subvol, although one has to
note that, the inode flag inheritance does not cross subvolume boundary
(the same as the kernel).
Finally, nodatacow and nodatasum will disable compression, just like the
kernel.
I hope this will helps projects like
mkosi
andyocto
to build theirimages with more btrfs specific flags.
The only downside for now is the memory usage, both
--subvol
and--inode-flags
will take over 8K for each specification.So one should not specify them too many times.