@aversecat
Contributor

@aversecat aversecat commented Oct 16, 2024

This branch, based on main, contains the parallel restore library, its tests, and the check code.

The check code has been reduced to our validation tool, which only identifies inconsistencies in an unmounted scoutfs meta device. All error injection and repair code has been omitted. The check code is used to validate the output of the parallel restore test cases.

  • - tests need to pass, but check fails. There is either a block reading problem in scoutfs check or an insertion error during restore.
  • - quota print outputs incorrect quota rules during test

New issues from @chaowang-versity:

  • - parallel_restore binary fails with more than 10 threads
  • - hardlinks are incorrect: after parallel_restore, hardlinked items have a hardlink count of 1 when that number should be greater than 1.
  • - util.h should be exported, because format.h uses DIV_ROUND_UP which is defined in it.
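
For context on the last item: `DIV_ROUND_UP` is conventionally a ceiling-division macro. The sketch below assumes the common kernel-style definition; the actual definition in scoutfs `util.h` may differ, and `blocks_needed()` is a hypothetical helper, not scoutfs API. It illustrates why a header that uses the macro cannot be built by an outside consumer unless the defining header is shipped alongside it.

```c
#include <stdint.h>

/*
 * Assumed definition, modeled on the common kernel-style macro.
 * The real one in scoutfs util.h may differ.
 */
#ifndef DIV_ROUND_UP
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#endif

/* Hypothetical helper: number of fixed-size blocks needed to hold `bytes`. */
static inline uint64_t blocks_needed(uint64_t bytes, uint64_t block_size)
{
	return DIV_ROUND_UP(bytes, block_size);
}
```

Without the macro in scope, any inline function or constant in an exported header that expands `DIV_ROUND_UP` fails to compile in the consumer's translation unit.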

@aversecat aversecat added the enhancement New feature or request label Oct 16, 2024
@aversecat aversecat force-pushed the auke/restore_and_check branch from 67bbd3c to ae4b55a Compare October 17, 2024 20:37
@aversecat aversecat force-pushed the auke/restore_and_check branch from ae4b55a to a2364f2 Compare December 17, 2024 18:09
@aversecat aversecat changed the title from "Restore and Check." to "**WIP** Restore and Check." Dec 17, 2024
@aversecat aversecat force-pushed the auke/restore_and_check branch from a2364f2 to fac2fec Compare December 19, 2024 23:47
@aversecat
Contributor Author

pushed el7 fixes into this.

@chaowang-versity chaowang-versity self-requested a review January 15, 2025 17:39
@zabbo
Collaborator

zabbo commented Jan 21, 2025

Let's pull out the supporting commits that are fine on their own and don't really have much to do with parallel restore. We can land them in their own PR and then focus on fixing up the parallel restore commits. I think the following should do it:
bb2003c Fix printing alloc list block extents
0d156e0 Import a few more functions to our list.h
3f3c2d8 Add userspace NSEC_PER_SEC
9cbe041 Add bloom filter index calc for userspace utils
fcdefb7 Add srch_encode_entry() for userspace utils
b3a8380 Add put_unaligned_leXX() for userspace
ad8a0f7 Add fls64() alias for userspace flsll()
9524397 Promote userspace btree block initialization
cd43c39 Add userspace version of our mode to type
29ca2ad Add userspace version of our dirent name hash
0020eac Add lk rbtree wrapper
1f7f40c Add test_bit to utils bitmap

@aversecat
Contributor Author

aversecat commented Jan 22, 2025 via email

@aversecat aversecat force-pushed the auke/restore_and_check branch from 92c433c to edacd17 Compare February 10, 2025 17:56
@aversecat
Contributor Author

rebased onto main

@aversecat
Contributor Author

This is failing tests on scoutfs df output. We'll need to scrub it or always force meta/data sizing, otherwise it'll always diff.

@zabbo
Collaborator

zabbo commented Feb 18, 2025

> This is failing tests on scoutfs df output. We'll need to scrub it or always force meta/data sizing, otherwise it'll always diff.

Yeah, the test shouldn't be letting the raw df output through to be compared.

@aversecat aversecat force-pushed the auke/restore_and_check branch from edacd17 to 0f9c9f0 Compare March 7, 2025 05:52
@aversecat
Contributor Author

rebased onto main. added nlink fix. Still needs df test output scrub fix.

zabbo and others added 14 commits May 27, 2025 14:26
Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
As I was committing the initial check command I had only partially
completed a rename of the function that checks the metadata allocators.

Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Zach Brown <zab@versity.com>
Generally as we call block_get() we should validate that if the block
has a hdr, at a minimum the crc is correct and the magic value is
the expected value passed, and the fsid matches the superblock. This
function implements just that. Returns -EINVAL, up to the caller to
report a problem() and handle the outcome. For now the code just hard
fails, which incidentally makes it fail the clobber-repair.sh tests
I wrote.

Signed-off-by: Auke Kok <auke.kok@versity.com>
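
The header validation described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `struct blk_hdr`, its field layout, and `check_blk_hdr()` are hypothetical and do not reflect the scoutfs on-disk format or its actual function names, and the choice of CRC-32C is an assumption as well.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout for illustration only. */
struct blk_hdr {
	uint32_t crc;	/* CRC-32C of every byte after this field */
	uint32_t magic;
	uint64_t fsid;
};

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Return 0 if the crc, magic, and fsid all match, -EINVAL otherwise.
 * Reporting the problem is left to the caller.
 */
static int check_blk_hdr(const void *blk, size_t size,
			 uint32_t magic, uint64_t fsid)
{
	const struct blk_hdr *hdr = blk;
	const uint8_t *body = (const uint8_t *)blk + sizeof(hdr->crc);

	if (size < sizeof(*hdr))
		return -EINVAL;
	if (hdr->crc != crc32c(body, size - sizeof(hdr->crc)))
		return -EINVAL;
	if (hdr->magic != magic || hdr->fsid != fsid)
		return -EINVAL;
	return 0;
}
```

Checking the crc first means a corrupted block is rejected before its magic and fsid fields are trusted.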
Adds basic man page content for the `check` subcommand.

Signed-off-by: Auke Kok <auke.kok@versity.com>
We check superblock magic, crc, flags. The data device superblock is
checked, but a little less thoroughly.  We check whether the device is
still mounted, since that would make checking invalid to begin with.
Quorum blocks are validated to have sane contents.

We add a global problem counter so we can trivially measure and
report whether any problem was found at all, instead of iterating
over all the problems and checking each individual count.

We pick the standard exit code values from `fsck` and mirror their
intended behavior. As a result, `fsck.scoutfs` can now be
trivially created by making it a wrapper around `scoutfs check`.

Signed-off-by: Auke Kok <auke.kok@versity.com>
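
A global problem counter mapped onto fsck-style exit codes might look like the sketch below. The numeric values are the ones documented in fsck(8); the names `note_problem()`, `check_exit_code()`, and the `CHECK_EXIT_*` constants are hypothetical, and the assumption that a check-only tool reports found problems as "left uncorrected" is mine rather than something the commit states.

```c
#include <stdint.h>

/* Exit code values as documented in fsck(8); the names are illustrative. */
enum {
	CHECK_EXIT_OK          = 0,	/* no errors */
	CHECK_EXIT_UNCORRECTED = 4,	/* filesystem errors left uncorrected */
	CHECK_EXIT_OPERATIONAL = 8,	/* operational error */
};

/* One global counter instead of summing every per-problem count. */
static uint64_t nr_problems;

/* Hypothetical hook called wherever a problem is reported. */
static void note_problem(void)
{
	nr_problems++;
}

/*
 * Pick the process exit code: an operational failure takes precedence,
 * then any found problem, otherwise success.
 */
static int check_exit_code(int op_err)
{
	if (op_err)
		return CHECK_EXIT_OPERATIONAL;
	if (nr_problems)
		return CHECK_EXIT_UNCORRECTED;
	return CHECK_EXIT_OK;
}
```

Because the values match what `fsck` callers already expect, a wrapper only needs to pass the exit status through unchanged.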
Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
This is the benchmark binary that bulk creates filesystem items and
xattrs, and is heavily threaded to gauge the performance of the library.
The test script invokes it to validate some basic constraints.

Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
This tool copies a source tree (whether it's scoutfs or not)
into an offline scoutfs meta device. It takes only those two parameters
and does a single-process walk of the tree to restore all items
while preserving as much of the metadata as possible.

Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
The hardlink count of files was previously hard coded to 1. We want to
properly restore hard linked files because it saves space and time.

The test binary restore_copy exposed this missed case before and is updated
to make use of it.

Signed-off-by: Auke Kok <auke.kok@versity.com>
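
One way a tree walk can arrive at a correct link count instead of a hard coded 1 is to count directory entries per source inode number. The sketch below is only an illustration of that idea, not the approach taken in the commit above; `struct link_count` and `count_link()` are hypothetical, and the linear scan is chosen for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-inode tally built up during the tree walk. */
struct link_count {
	uint64_t ino;
	uint32_t nlink;
};

/*
 * Record one directory entry that references `ino`. Returns the link
 * count seen so far for that inode, or 0 if the table is full.
 */
static uint32_t count_link(struct link_count *tbl, size_t cap,
			   size_t *used, uint64_t ino)
{
	size_t i;

	for (i = 0; i < *used; i++) {
		if (tbl[i].ino == ino)
			return ++tbl[i].nlink;
	}
	if (*used == cap)
		return 0;

	tbl[*used].ino = ino;
	tbl[*used].nlink = 1;
	(*used)++;
	return 1;
}
```

A return value greater than 1 tells the walker that the inode was already seen, so only a new link needs to be added rather than a second copy of the inode.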
When initializing the key for the quota we were originally given
the address of the pointer to the rule. That is fixed here.
There is also a test case verifying that we are able to perform
operations such as rule deletion and adding a rule to the restored
filesystem.

Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
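
The class of bug described in the commit above, using the address of a pointer where the pointed-to data was intended, can be shown with a generic sketch. None of this is the actual scoutfs code: `struct quota_rule`, `struct quota_key`, and both init functions are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rule and key with identical layouts, for illustration. */
struct quota_rule {
	uint64_t limit;
	uint64_t flags;
};

struct quota_key {
	uint64_t limit;
	uint64_t flags;
};

/* Buggy: `&rule` is the address of the local pointer variable, so the
 * key is filled with the bytes of the pointer, not the rule. */
static void key_init_buggy(struct quota_key *key, const struct quota_rule *rule)
{
	memset(key, 0, sizeof(*key));
	memcpy(key, &rule, sizeof(rule));
}

/* Fixed: copy from the rule that the pointer refers to. */
static void key_init_fixed(struct quota_key *key, const struct quota_rule *rule)
{
	memset(key, 0, sizeof(*key));
	memcpy(key, rule, sizeof(*rule));
}
```

The buggy version compiles cleanly because `memcpy()` takes `void *` arguments, which is why this kind of mistake tends to surface only as wrong output, such as incorrect rules being printed.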
The test scripts for restore_copy and parallel_restore were diffing
because of the scoutfs df output. This happened because the fields
other than Used depend on the size of the disk used. This patch
fixes this by limiting the output to only the type and space used.

Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
We didn't migrate the extra data from inodes on folders before,
which is a gap in testing. Make sure to test with a nested restored
folder to test that inheritance isn't in the way.

Signed-off-by: Auke Kok <auke.kok@versity.com>
While doing this I noticed we attempt to restore data/meta_seq but
that goes nowhere, it's just ignored.

Signed-off-by: Auke Kok <auke.kok@versity.com>
While we are filling blocks, the final block may not have enough items
to properly fill that block. Here we add a check that stops filling
the block if we have fewer than the minimum number of items.

Signed-off-by: Hunter Shaffer <hunter.shaffer@versity.com>
@aversecat aversecat force-pushed the auke/restore_and_check branch from baa18bd to 7b121d9 Compare May 27, 2025 21:28