Skip to content

HDFS-17054. Erasure coding: optimize checkReplicaOnStorage method to avoid regarding all replicas on one datanode as corrupt repeatly. #5760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: trunk
Choose a base branch
from

Conversation

hfutatzhanghb
Copy link
Contributor

@hfutatzhanghb hfutatzhanghb commented Jun 19, 2023

Description of PR

Please see HDFS-17054

How to test

TestBlockManager.java
TestDecommissionWithStriped.java
TestNodeCount.java
TestReconstructStripedBlocks.java
TestRedudantBlocks.java
TestOverReplicatedBlocks.java
TestProcessCorruptBlocks.java

…avoid regarding all replicas on one datanode as corrupt repeatly.
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 9s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 1m 13s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 1m 15s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 15s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 38s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 34s trunk passed
+1 💚 shadedclient 27m 5s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 12s the patch passed
+1 💚 compile 1m 17s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 1m 17s the patch passed
+1 💚 compile 1m 8s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 1m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 1s the patch passed
+1 💚 mvnsite 1m 14s the patch passed
+1 💚 javadoc 0m 57s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 28s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 21s the patch passed
+1 💚 shadedclient 26m 41s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 243m 42s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
361m 6s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
hadoop.hdfs.server.namenode.ha.TestObserverNode
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/3/artifact/out/Dockerfile
GITHUB PR #5760
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux d7088be04a4a 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 664454c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/3/testReport/
Max. process+thread count 2511 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 18m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 23s trunk passed
+1 💚 compile 1m 24s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 1m 16s trunk passed
+1 💚 mvnsite 1m 22s trunk passed
+1 💚 javadoc 1m 14s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 36s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 32s trunk passed
+1 💚 shadedclient 26m 43s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 12s the patch passed
+1 💚 compile 1m 16s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 1m 16s the patch passed
+1 💚 compile 1m 8s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 1m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 2s the patch passed
+1 💚 mvnsite 1m 14s the patch passed
+1 💚 javadoc 0m 58s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 27s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 20s the patch passed
+1 💚 shadedclient 26m 40s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 239m 46s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
374m 42s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.server.namenode.ha.TestObserverNode
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/2/artifact/out/Dockerfile
GITHUB PR #5760
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux f5e8c8a28c58 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 664454c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/2/testReport/
Max. process+thread count 2145 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5760/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hfutatzhanghb
Copy link
Contributor Author

The failed unit tests were all passed in my local.

if (nodesCorrupt != null && nodesCorrupt.contains(node) &&
(haveComputedAsCorrupted == null || !haveComputedAsCorrupted.contains(node))) {
if (haveComputedAsCorrupted != null) {
haveComputedAsCorrupted.add(node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your code correctly, if the same block group has two internal blocks on the same datanode, then you will only calculate one. IMO, the current implementation of CorruptReplicasMap does not record which specific internal block on the datanode was corrupt, how could you confirm that there is only one internal block corrupt?

@zhangshuyan0
Copy link
Contributor

This PR may need to add a new UT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants