Backport "HBASE-29029 Refactor BackupHFileCleaner + fix test (#6533)" to branch-3 #7101

Merged
ndimiduk merged 1 commit into apache:branch-3 on Jun 16, 2025

Conversation

ndimiduk
Member

The TestBackupHFileCleaner test was broken, as it used a different API to register bulk loads than the API that is actually used to register bulk loads during backups. The test also incorrectly closed the FS of the HBaseTestingUtil, causing this test to block for about 5 minutes during shutdown.

Both the test and BackupHFileCleaner itself were overly convoluted and are cleaned up. Methods in BackupSystemTable that could lead to incorrect use have been removed or deprecated (to fix their use case in HBASE-28715).
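The shutdown problem described above, a test closing a resource that its harness still owns, can be illustrated in plain Java. This is only a sketch under stated assumptions: `Handle` and `Harness` are hypothetical stand-ins for the file system and the testing utility, not HBase or Hadoop classes, and the real symptom was a slow shutdown rather than a boolean result.

```java
final class SharedHandleSketch {

  /** Hypothetical resource; stands in for a file system client. */
  static final class Handle implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
      return open;
    }

    @Override
    public void close() {
      open = false;
    }
  }

  /** Hypothetical test harness that owns the handle and needs it during shutdown. */
  static final class Harness {
    private final Handle fs = new Handle();

    Handle getFileSystem() {
      return fs;
    }

    /** Returns true if shutdown found the handle still usable. */
    boolean shutdown() {
      return fs.isOpen();
    }
  }

  /** Buggy pattern: try-with-resources closes a handle the test only borrowed. */
  static boolean shutdownAfterBuggyTest() {
    Harness harness = new Harness();
    try (Handle fs = harness.getFileSystem()) {
      fs.isOpen(); // the test body would use the handle here
    }
    return harness.shutdown();
  }

  /** Fixed pattern: use the borrowed handle and leave closing to its owner. */
  static boolean shutdownAfterFixedTest() {
    Harness harness = new Harness();
    Handle fs = harness.getFileSystem();
    fs.isOpen(); // the test body would use the handle here
    return harness.shutdown();
  }

  public static void main(String[] args) {
    System.out.println("buggy test, clean shutdown: " + shutdownAfterBuggyTest()); // false
    System.out.println("fixed test, clean shutdown: " + shutdownAfterFixedTest()); // true
  }
}
```

The design point is ownership: whoever created the resource closes it, and a test that merely borrows it should not wrap it in try-with-resources.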

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
ndimiduk added the backport label (This PR is a back port of some issue or issues already committed to master) on Jun 13, 2025
@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 31s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ branch-3 Compile Tests _ |
| +1 💚 | mvninstall | 3m 17s | | branch-3 passed |
| +1 💚 | compile | 0m 29s | | branch-3 passed |
| +1 💚 | checkstyle | 0m 11s | | branch-3 passed |
| +1 💚 | spotbugs | 0m 30s | | branch-3 passed |
| +1 💚 | spotless | 0m 50s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 3m 0s | | the patch passed |
| +1 💚 | compile | 0m 28s | | the patch passed |
| +1 💚 | javac | 0m 28s | | hbase-backup generated 0 new + 124 unchanged - 1 fixed = 124 total (was 125) |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 9s | /results-checkstyle-hbase-backup.txt | hbase-backup: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 💚 | xmllint | 0m 0s | | No new issues. |
| +1 💚 | spotbugs | 0m 34s | | the patch passed |
| +1 💚 | hadoopcheck | 12m 1s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 43s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 9s | | The patch does not generate ASF License warnings. |
| | | 30m 35s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7101/2/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #7101 |
| Optional Tests | dupname asflicense javac codespell detsecrets xmllint hadoopcheck spotless compile spotbugs checkstyle hbaseanti |
| uname | Linux f99c62274367 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | branch-3 / 209b5f5 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7101/2/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 xmllint=20913 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 27s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ branch-3 Compile Tests _ |
| +1 💚 | mvninstall | 2m 51s | | branch-3 passed |
| +1 💚 | compile | 0m 19s | | branch-3 passed |
| +1 💚 | javadoc | 0m 13s | | branch-3 passed |
| +1 💚 | shadedjars | 5m 50s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 2m 57s | | the patch passed |
| +1 💚 | compile | 0m 18s | | the patch passed |
| +1 💚 | javac | 0m 18s | | the patch passed |
| +1 💚 | javadoc | 0m 13s | | the patch passed |
| +1 💚 | shadedjars | 5m 47s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 13m 25s | /patch-unit-hbase-backup.txt | hbase-backup in the patch failed. |
| | | 33m 23s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7101/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #7101 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 339269d80a02 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | branch-3 / 209b5f5 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7101/2/testReport/ |
| Max. process+thread count | 1017 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7101/2/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@ndimiduk
Member Author

@DieterDP-ng the new test seems to reliably fail on branch-3. I wonder if there was some drift between the last CI run and when I merged master, so maybe it's broken there as well. Mind taking a look? I'll dig a bit on Monday.

@DieterDP-ng
Contributor

> @DieterDP-ng the new test seems to reliably fail on branch-3. I wonder if there was some drift between the last CI run and when I merged master, so maybe it's broken there as well. Mind taking a look? I'll dig a bit on Monday.

It also fails on master. So far I've tracked it down to the buffered mutator hanging on close. It doesn't seem to be related to the backup code, though: the following test case, which uses no backup classes, also blocks:

  private static final HBaseTestingUtil TEST_UTIL = new HBaseTestingUtil();

  @BeforeClass
  public static void setUpBeforeClass() throws Exception {
    // A single-node mini cluster is enough to reproduce the hang.
    TEST_UTIL.startMiniCluster(1);
  }

  @AfterClass
  public static void tearDownAfterClass() throws Exception {
    TEST_UTIL.shutdownMiniCluster();
  }

  @Test
  public void testGetDeletableFiles() throws IOException {
    // Create a plain table with one column family; no backup code is involved.
    TableName table = TableName.valueOf("table1");
    TableDescriptor desc = TableDescriptorBuilder.newBuilder(table)
      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("f")).build();
    Admin ha = TEST_UTIL.getAdmin();
    ha.createTable(desc);

    // Leaving the try block closes the mutator, which flushes the buffered Put.
    // This close() call is where the test blocks.
    try (BufferedMutator bufferedMutator = TEST_UTIL.getConnection().getBufferedMutator(table)) {
      Put put = new Put(Bytes.toBytes("row1"));
      put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value"));

      List<Put> puts = List.of(put);
      bufferedMutator.mutate(puts);
    }
  }

@DieterDP-ng
Contributor

Found the root cause. It's unrelated to the changes in this PR, but the new test case brings it to light.

It's a deadlock situation:

  • Thread running the test is blocked in BufferedMutatorOverAsyncBufferedMutator#close (in #internalFlush), while holding the lock on the buffered mutator instance.
  • The thread that executed the mutations is trying to resolve the futures in BufferedMutatorOverAsyncBufferedMutator#mutate, but cannot because of the held lock.
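The two bullets above describe a lock-ordering deadlock that can be modelled in plain Java. This is a minimal sketch under stated assumptions, not the HBase implementation: `SyncWrapper`, `onDoneLocked`, `onDoneLockFree`, and the latch are hypothetical names, and `close` waits with a timeout (where the real code would block indefinitely) so that the example terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class CloseDeadlockSketch {

  /** Hypothetical synchronous wrapper whose close() waits for async work while holding its lock. */
  static final class SyncWrapper {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Lets the worker start its callback only once close() already holds the lock,
    // which makes the interleaving deterministic for this demonstration.
    private final CountDownLatch closeHoldsLock = new CountDownLatch(1);
    private final boolean callbackNeedsLock;
    private CompletableFuture<Void> pending = CompletableFuture.completedFuture(null);

    SyncWrapper(boolean callbackNeedsLock) {
      this.callbackNeedsLock = callbackNeedsLock;
    }

    // Completion bookkeeping that takes the wrapper's monitor (the problematic variant).
    private synchronized void onDoneLocked() { }

    // Completion bookkeeping that needs no lock (the safe variant).
    private void onDoneLockFree() { }

    synchronized void mutate() {
      pending = CompletableFuture.runAsync(() -> {
        try {
          closeHoldsLock.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        if (callbackNeedsLock) {
          onDoneLocked(); // blocks for as long as close() holds the monitor
        } else {
          onDoneLockFree();
        }
      }, worker);
    }

    /** Returns true if pending work finished, false if the wait timed out. */
    synchronized boolean close(long timeoutMillis) {
      closeHoldsLock.countDown();
      try {
        pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        return false; // without the timeout this would be a permanent deadlock
      } catch (InterruptedException | ExecutionException e) {
        throw new IllegalStateException(e);
      } finally {
        worker.shutdownNow();
      }
    }
  }

  static boolean closeCompletes(boolean callbackNeedsLock) {
    SyncWrapper wrapper = new SyncWrapper(callbackNeedsLock);
    wrapper.mutate();
    return wrapper.close(1000);
  }

  public static void main(String[] args) {
    System.out.println("lock-taking callback, close completed: " + closeCompletes(true));  // false
    System.out.println("lock-free callback, close completed: " + closeCompletes(false));   // true
  }
}
```

In this model the hang disappears either when the completion callback does not need the wrapper's monitor or when `close` does not hold the monitor while waiting. Which of these applies to the actual fix is tracked in HBASE-29397 and is not shown here.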

@DieterDP-ng
Contributor

@Apache9 This looks like it was introduced by your change in HBASE-29394.

@DieterDP-ng
Contributor

Aha, the issue is already known and logged as HBASE-29397.

ndimiduk merged commit fb60ebf into apache:branch-3 on Jun 16, 2025
1 check failed