Checks for problems in Accumulo #4957

Open
wants to merge 2 commits into
base: 3.1

kevinrr888
Member

This PR:

  • Moves existing checks (checkTablets and the fate check for dangling locks) into the appropriate new admin check command
  • Adds new checks
  • Adds new tests in AdminCheckIT
  • SYSTEM_CONFIG now checks for
    • valid locked table/namespace ids (the locked table/namespaces exist)
    • locked table/namespaces are associated with a fate op
  • ROOT_METADATA now checks for
    • offline tablets
    • missing "columns"
    • invalid "columns"
  • ROOT_TABLE now checks for
    • offline tablets
    • holes between tablets of the metadata table, a non-null prev end row on the first tablet, or a non-null end row on the last tablet
    • missing columns
    • invalid columns
  • METADATA_TABLE now checks for
    • offline tablets
    • holes between tablets of user tables (and scanref), a non-null prev end row on the first tablet, or a non-null end row on the last tablet
    • missing columns
    • invalid columns
  • SYSTEM_FILES now checks for
    • missing system files
  • USER_FILES now checks for
    • missing user files
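
To make the "no holes" tablet check concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes the tablets of a single table are already sorted by end row, with the null end row last. `TabletHoleCheckSketch`, `TabletRange`, and `findProblems` are hypothetical names used only for illustration; they are not Accumulo classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; none of these names come from Accumulo.
final class TabletHoleCheckSketch {

  // Stand-in for a tablet's row range: (prevEndRow, endRow]. null means unbounded.
  static final class TabletRange {
    final String prevEndRow;
    final String endRow;

    TabletRange(String prevEndRow, String endRow) {
      this.prevEndRow = prevEndRow;
      this.endRow = endRow;
    }
  }

  // Returns a description of each problem found; an empty list means the check passed.
  // Assumes the tablets belong to one table and are sorted by end row (null end row last).
  static List<String> findProblems(List<TabletRange> tablets) {
    List<String> problems = new ArrayList<>();
    if (tablets.isEmpty()) {
      problems.add("no tablets found");
      return problems;
    }
    // The first tablet must start at the beginning of the table.
    if (tablets.get(0).prevEndRow != null) {
      problems.add("first tablet has a non-null prev end row");
    }
    // The last tablet must extend to the end of the table.
    if (tablets.get(tablets.size() - 1).endRow != null) {
      problems.add("last tablet has a non-null end row");
    }
    // Each tablet must begin exactly where the previous one ended.
    for (int i = 1; i < tablets.size(); i++) {
      String expected = tablets.get(i - 1).endRow;
      String actual = tablets.get(i).prevEndRow;
      if (!Objects.equals(expected, actual)) {
        problems.add("hole or overlap between tablets " + (i - 1) + " and " + i);
      }
    }
    return problems;
  }
}
```

For example, the ranges `(null, "g"]` and `("g", null]` pass this sketch, while `(null, "g"]` followed by `("m", null]` is reported as a hole between tablets 0 and 1.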

Quite a few checks still need to be added (those mentioned in #4687, and probably more). This is a first PR for these checks; more will be added in follow-on PRs. Tests covering the FAILING cases of these checks are also still to do.

Part of #4892

@kevinrr888 kevinrr888 self-assigned this Oct 8, 2024
@kevinrr888 kevinrr888 added this to the 3.1.0 milestone Oct 8, 2024