Skip to content

HBASE-27781 Fix case of action counter assertion error in handling of batch operation timeout exceeded #6144

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

droudnitsky
Copy link
Contributor

@droudnitsky droudnitsky commented Aug 8, 2024

https://issues.apache.org/jira/browse/HBASE-27781

In AsyncFutureRequestImpl we fail fast when operation timeout is exceeded during location resolution here. In that handling, we loop all actions and set them as failed. The problem is, some number of actions may already finished when we get to this spot. So the actionsInProgress would have been decremented for those already, and now we're going to decrement by all actions. This causes an assertion error since we go negative here

My understanding of AsynRequestFutureImpl is that the only actions passed to groupAndSendMultiAction which may have been completed already and we already decremented action counter for by the time we get to the operation timeout exceeded handling in that method are actions which were failed locally in groupAndSendMultiAction due to location error - the findAllLocationsOrFail method sets the action error and decrements the action counter on location resolution failure - we just need to avoid setting the error again on actions where location resolution failed when we get to this loop to avoid decrementing twice for those actions.

For replica calls, manageLocationError/manageError is responsible for result/error setting and preventing double action counter decrementing.

Test case reproduces the assertion error when one action in the batch fails on location error and then operation timeout is exceeded.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky
Copy link
Contributor Author

Two hbase-server tests that are not relevant to my changes are timing out :
precommit checks / yetus jdk8 Hadoop2 checks / (?) – org.apache.hadoop.hbase.replication.TestReplicationMetricsforUI
12m 29s
precommit checks / yetus jdk17 hadoop3 checks / (?) – org.apache.hadoop.hbase.wal.TestWALSplitCompressed
12m 39s

pushed empty commit to retrigger jenkins tests

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky droudnitsky force-pushed the branch-2-HBASE-27781- branch from da48212 to a483bfe Compare August 10, 2024 18:06
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky
Copy link
Contributor Author

droudnitsky commented Aug 11, 2024

In the case of AsyncRequestFutureImpl with null results/no result tracking , I cannot see a way of checking if a given action was completed through an external method like isActionComplete, I think my current approach would have some issues in certain scenarios when results are null, need to understand AsyncRequestFutureImpl, maybe we can keep track of the actions failed locally in groupAndSendMultiAction as we loop over currentActions and avoid failing a done action a second time there if operation timeout is exceeded.

@droudnitsky droudnitsky marked this pull request as draft August 12, 2024 01:03
@droudnitsky
Copy link
Contributor Author

I think keeping track of the actions failed locally in groupAndSendMultiAction as we loop over currentActions and avoiding failing an already locally failed action a second time if operation timeout is exceeded there should work for both null results and with replica actions. Working on those changes.

@droudnitsky droudnitsky marked this pull request as ready for review September 30, 2024 14:16
@droudnitsky
Copy link
Contributor Author

@bbeaudreault may I kindly ask you to review this fix if you have the time

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky droudnitsky force-pushed the branch-2-HBASE-27781- branch from b61e51e to 2838c81 Compare October 8, 2024 17:47
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky droudnitsky force-pushed the branch-2-HBASE-27781- branch from 2838c81 to e379b25 Compare January 7, 2025 10:31
@Apache-HBase

This comment has been minimized.

@droudnitsky droudnitsky force-pushed the branch-2-HBASE-27781- branch 2 times, most recently from e379b25 to 225e68c Compare April 17, 2025 20:44
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@droudnitsky droudnitsky force-pushed the branch-2-HBASE-27781- branch from 225e68c to 6caae7a Compare April 18, 2025 12:36
@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ branch-2 Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for branch
+1 💚 mvninstall 3m 10s branch-2 passed
+1 💚 compile 3m 45s branch-2 passed
+1 💚 checkstyle 0m 54s branch-2 passed
+1 💚 spotbugs 2m 24s branch-2 passed
+1 💚 spotless 0m 46s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 3m 3s the patch passed
+1 💚 compile 3m 41s the patch passed
+1 💚 javac 3m 41s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 16s hbase-client: The patch generated 0 new + 11 unchanged - 1 fixed = 11 total (was 12)
+1 💚 checkstyle 0m 38s The patch passed checkstyle in hbase-server
+1 💚 spotbugs 2m 44s the patch passed
+1 💚 hadoopcheck 16m 53s Patch does not cause any errors with Hadoop 2.10.2 or 3.3.6 3.4.0.
+1 💚 spotless 0m 44s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 19s The patch does not generate ASF License warnings.
42m 29s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6144
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 652fdf648b9f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / 6caae7a
Default Java Eclipse Adoptium-11.0.23+9
Max. process+thread count 76 (vs. ulimit of 30000)
modules C: hbase-client hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 31m 26s Docker mode activated.
-0 ⚠️ yetus 0m 15s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2 Compile Tests _
+0 🆗 mvndep 6m 15s Maven dependency ordering for branch
+1 💚 mvninstall 15m 30s branch-2 passed
+1 💚 compile 5m 3s branch-2 passed
+1 💚 javadoc 1m 50s branch-2 passed
+1 💚 shadedjars 10m 36s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 59s Maven dependency ordering for patch
+1 💚 mvninstall 5m 43s the patch passed
+1 💚 compile 1m 45s the patch passed
+1 💚 javac 1m 45s the patch passed
+1 💚 javadoc 1m 20s the patch passed
+1 💚 shadedjars 8m 18s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 9m 33s hbase-client in the patch passed.
-1 ❌ unit 24m 34s /patch-unit-hbase-server.txt hbase-server in the patch failed.
125m 4s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6144
Optional Tests javac javadoc unit compile shadedjars
uname Linux bd5b6e4af360 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / 6caae7a
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/testReport/
Max. process+thread count 1865 (vs. ulimit of 30000)
modules C: hbase-client hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 9m 41s Docker mode activated.
-0 ⚠️ yetus 0m 8s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2 Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for branch
+1 💚 mvninstall 4m 5s branch-2 passed
+1 💚 compile 1m 37s branch-2 passed
+1 💚 javadoc 1m 10s branch-2 passed
+1 💚 shadedjars 8m 51s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 31s Maven dependency ordering for patch
+1 💚 mvninstall 4m 56s the patch passed
+1 💚 compile 2m 1s the patch passed
+1 💚 javac 2m 1s the patch passed
+1 💚 javadoc 1m 9s the patch passed
+1 💚 shadedjars 7m 53s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 9m 5s hbase-client in the patch passed.
-1 ❌ unit 270m 50s /patch-unit-hbase-server.txt hbase-server in the patch failed.
327m 37s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile
GITHUB PR #6144
Optional Tests javac javadoc unit compile shadedjars
uname Linux 137e8f2db618 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / 6caae7a
Default Java Temurin-1.8.0_412-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/testReport/
Max. process+thread count 4266 (vs. ulimit of 30000)
modules C: hbase-client hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 21m 8s Docker mode activated.
-0 ⚠️ yetus 0m 6s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ branch-2 Compile Tests _
+0 🆗 mvndep 1m 36s Maven dependency ordering for branch
+1 💚 mvninstall 6m 36s branch-2 passed
+1 💚 compile 3m 37s branch-2 passed
+1 💚 javadoc 1m 48s branch-2 passed
+1 💚 shadedjars 13m 2s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 29s Maven dependency ordering for patch
+1 💚 mvninstall 4m 32s the patch passed
+1 💚 compile 1m 13s the patch passed
+1 💚 javac 1m 13s the patch passed
+1 💚 javadoc 0m 56s the patch passed
+1 💚 shadedjars 6m 39s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 8m 20s hbase-client in the patch passed.
-1 ❌ unit 248m 30s /patch-unit-hbase-server.txt hbase-server in the patch failed.
326m 50s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #6144
Optional Tests javac javadoc unit compile shadedjars
uname Linux 51b51c30f5f9 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / 6caae7a
Default Java Eclipse Adoptium-11.0.23+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/testReport/
Max. process+thread count 4391 (vs. ulimit of 30000)
modules C: hbase-client hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6144/9/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@droudnitsky
Copy link
Contributor Author

Going to open a new PR with clean history in attempt to be more review friendly

@droudnitsky droudnitsky closed this Jun 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants