Description
Describe the feature you would like to see added to OpenZFS
As referenced in the manual, nop-write is enabled only when a cryptographically strong algorithm is used for block hashing.
The rationale is sound: with a weak algorithm there is a higher probability of a hash collision, so a matching hash is not a strong indicator that the underlying data is the same.
What I'm proposing is to extend the module configuration with something like zfs_nopwrite_enabled=2 (a new value of 2, similar to MRUinL2), so that when a weak hash algorithm is used (like fletcher4, which is the default) the hash is checked and, if it matches, the actual data is compared; if both hash and data are the same, the write becomes a nop-write.
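To make the proposed decision concrete, here is a minimal Python sketch of the logic, not OpenZFS code. The mode constants, `can_nopwrite`, and `read_old_block` are hypothetical names, and `toy_weak_checksum` is a deliberately simple stand-in (a byte sum), not real fletcher4. The real nop-write path has further conditions that are not modelled here.

```python
import hashlib

# Hypothetical mode values: 2 is the value proposed in this issue.
NOPWRITE_OFF = 0
NOPWRITE_STRONG_ONLY = 1   # behaviour as described above
NOPWRITE_VERIFY_WEAK = 2   # proposed: weak hash match + data compare


def toy_weak_checksum(data: bytes) -> bytes:
    """Toy stand-in for a weak checksum (NOT real fletcher4): a byte sum."""
    return (sum(data) & 0xFFFFFFFF).to_bytes(4, "little")


def sha256_checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# name -> (checksum function, treated as collision-resistant?)
CHECKSUMS = {
    "toy-weak": (toy_weak_checksum, False),
    "sha256": (sha256_checksum, True),
}


def can_nopwrite(mode, algorithm, old_checksum, new_data, read_old_block):
    """Return True if the write of new_data could be skipped.

    old_checksum is the checksum already stored for the existing block.
    read_old_block is a callable returning the existing block's data; it is
    only called when a weak checksum matches and verification is requested,
    which models the extra read the verify step would cost.
    """
    func, strong = CHECKSUMS[algorithm]
    if mode == NOPWRITE_OFF:
        return False
    if func(new_data) != old_checksum:
        return False          # different hash: data definitely changed
    if strong:
        return True           # strong hash match is trusted as-is
    if mode == NOPWRITE_VERIFY_WEAK:
        return read_old_block() == new_data  # weak match: compare the data
    return False              # weak hash without verify: always write
```

With the toy checksum, `b"ab"` and `b"ba"` collide, so under mode 2 an overwrite of `b"ab"` with `b"ab"` is skipped while an overwrite with `b"ba"` is still written, because the data compare catches the collision.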
How will this feature improve OpenZFS?
This will allow space savings (and better write performance) if the same file is copied over itself multiple times while snapshots are enabled (as many users have). The scope of this is different from BRT.
Currently you can get this behaviour only if either Dedup or a cryptographically strong hash is used, and both can become memory and compute expensive.
Additional context
As far as I understand, this would be a non-breaking change and should be fully backward and forward compatible.
As for the verify part, this already happens in the Dedup pipeline if '<hash_algorithm>,verify' is used.
In my humble opinion this could bring major space savings and write performance gains (in the specific case of files being overwritten, of course) with a trivial and non-breaking implementation.
As for how to configure it, another option is to be symmetric with how Dedup is configured, with ,verify appended to the chosen algorithm (checksum=<hash_algorithm>,verify), making it a per-dataset parameter like dedup=<hash_algorithm>,verify. This also implies that you could force data verification on strong hash algorithms, as Dedup can do, but I don't know whether a new per-dataset parameter would change the on-disk format.
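As an illustration of the per-dataset alternative, this is a small Python sketch of how a value such as `fletcher4,verify` could be split into an algorithm and a verify flag, mirroring the shape of the existing `dedup=<hash_algorithm>,verify` syntax. The function name `parse_checksum_property` is hypothetical, and accepting `,verify` on `checksum` is the proposal itself, not current behaviour.

```python
from typing import Tuple


def parse_checksum_property(value: str) -> Tuple[str, bool]:
    """Split '<algorithm>[,verify]' into (algorithm, verify).

    Hypothetical helper: models the proposed checksum=<alg>,verify syntax.
    """
    algorithm, sep, suffix = value.partition(",")
    if not algorithm:
        raise ValueError("missing algorithm name")
    if sep and suffix != "verify":
        # Anything after the comma other than 'verify' is rejected.
        raise ValueError(f"unknown suffix: {suffix!r}")
    return algorithm, bool(sep)
```

Under this sketch, `fletcher4` alone keeps verification off, while `sha256,verify` would request a data compare even for a strong hash, which is the symmetry with Dedup mentioned above.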
And thank you for all the hard work you have put into this project over so many years.