YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode() #416

Open: wants to merge 1 commit into base: trunk

@gg7 gg7 commented Sep 11, 2018

I encountered this issue while running 3.1.0:

2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_000055 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)

2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)

I'm guessing a better fix would be to synchronise the removal of applications, but this simple patch should be an improvement IMO.
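
To illustrate what "synchronising the removal of applications" could look like, here is a minimal sketch. It is not the scheduler's code: `AppRegistrySketch`, `withApp`, and the string-valued map are hypothetical stand-ins, and the only assumption taken from the report above is that an application can disappear while another thread is still looking it up. The lookup and the use of the result happen under the same lock as removal, so the application cannot vanish halfway through the action.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of an application registry; not a Hadoop class.
final class AppRegistrySketch {
  private final Object lock = new Object();
  // app id -> queue name (a stand-in for the real per-application state)
  private final Map<String, String> queueByApp = new HashMap<>();

  void add(String appId, String queue) {
    synchronized (lock) {
      queueByApp.put(appId, queue);
    }
  }

  void remove(String appId) {
    synchronized (lock) {
      queueByApp.remove(appId);
    }
  }

  // Runs `action` on the application's state while holding the lock.
  // Returns `fallback` if the application has already been removed.
  <T> T withApp(String appId, Function<String, T> action, T fallback) {
    synchronized (lock) {
      String queue = queueByApp.get(appId);
      return queue == null ? fallback : action.apply(queue);
    }
  }

  public static void main(String[] args) {
    AppRegistrySketch registry = new AppRegistrySketch();
    registry.add("app_1", "root.default");
    System.out.println(registry.withApp("app_1", String::length, -1));
    registry.remove("app_1");
    System.out.println(registry.withApp("app_1", String::length, -1));
  }
}
```

Even in this model the caller still has to handle the "already removed" case, since locking only makes lookup and use atomic. That is why a null check remains useful whichever fix is chosen.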

Signed-off-by: George G <git@gg7.io>
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 36 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1054 trunk passed
+1 compile 48 trunk passed
+1 checkstyle 36 trunk passed
+1 mvnsite 50 trunk passed
+1 shadedclient 754 branch has no errors when building and testing our client artifacts.
+1 javadoc 29 trunk passed
0 spotbugs 97 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 96 trunk passed
_ Patch Compile Tests _
+1 mvninstall 42 the patch passed
+1 compile 42 the patch passed
+1 javac 42 the patch passed
-0 checkstyle 27 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 45 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 701 patch has no errors when building and testing our client artifacts.
+1 javadoc 29 the patch passed
+1 findbugs 101 the patch passed
_ Other Tests _
-1 unit 4760 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 25 The patch does not generate ASF License warnings.
7929
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.rmapp.TestApplicationLifetimeMonitor
Subsystem Report/Notes
Docker Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/1/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 63935efd06ab 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / cd967c7
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/1/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/1/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/1/testReport/
Max. process+thread count 903 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/1/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 39 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1077 trunk passed
+1 compile 47 trunk passed
+1 checkstyle 30 trunk passed
+1 mvnsite 48 trunk passed
+1 shadedclient 692 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 94 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 93 trunk passed
_ Patch Compile Tests _
+1 mvninstall 44 the patch passed
+1 compile 39 the patch passed
+1 javac 39 the patch passed
-0 checkstyle 30 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 45 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 709 patch has no errors when building and testing our client artifacts.
+1 javadoc 28 the patch passed
+1 findbugs 102 the patch passed
_ Other Tests _
-1 unit 4733 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 29 The patch does not generate ASF License warnings.
7861
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/2/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 62c6a5e3a6b5 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / c7c7a88
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/2/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/2/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/2/testReport/
Max. process+thread count 891 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/2/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 36 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1275 trunk passed
+1 compile 49 trunk passed
+1 checkstyle 33 trunk passed
+1 mvnsite 53 trunk passed
+1 shadedclient 851 branch has no errors when building and testing our client artifacts.
+1 javadoc 30 trunk passed
0 spotbugs 103 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 101 trunk passed
_ Patch Compile Tests _
+1 mvninstall 47 the patch passed
+1 compile 45 the patch passed
+1 javac 45 the patch passed
-0 checkstyle 29 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 49 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 813 patch has no errors when building and testing our client artifacts.
+1 javadoc 30 the patch passed
+1 findbugs 117 the patch passed
_ Other Tests _
-1 unit 4932 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 28 The patch does not generate ASF License warnings.
8570
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/3/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux c8a60409235b 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / c2d00c8
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/3/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/3/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/3/testReport/
Max. process+thread count 915 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/3/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 47 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1325 trunk passed
+1 compile 47 trunk passed
+1 checkstyle 35 trunk passed
+1 mvnsite 51 trunk passed
+1 shadedclient 734 branch has no errors when building and testing our client artifacts.
+1 javadoc 29 trunk passed
0 spotbugs 104 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 102 trunk passed
_ Patch Compile Tests _
+1 mvninstall 46 the patch passed
+1 compile 42 the patch passed
+1 javac 42 the patch passed
-0 checkstyle 30 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 46 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 742 patch has no errors when building and testing our client artifacts.
+1 javadoc 32 the patch passed
+1 findbugs 122 the patch passed
_ Other Tests _
-1 unit 4945 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 37 The patch does not generate ASF License warnings.
8461
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation
hadoop.yarn.server.resourcemanager.TestRMRestart
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/4/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux a19e2a1cadcf 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 00b5a27
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/4/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/4/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/4/testReport/
Max. process+thread count 5287 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/4/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 46 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1169 trunk passed
+1 compile 37 trunk passed
+1 checkstyle 31 trunk passed
+1 mvnsite 41 trunk passed
+1 shadedclient 771 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 93 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 90 trunk passed
_ Patch Compile Tests _
+1 mvninstall 42 the patch passed
+1 compile 35 the patch passed
+1 javac 35 the patch passed
-0 checkstyle 25 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 36 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 689 patch has no errors when building and testing our client artifacts.
+1 javadoc 26 the patch passed
+1 findbugs 98 the patch passed
_ Other Tests _
-1 unit 4787 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 24 The patch does not generate ASF License warnings.
8025
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/5/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 54c4b342db83 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / e356e4f
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/5/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/5/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/5/testReport/
Max. process+thread count 925 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/5/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 43 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1049 trunk passed
+1 compile 39 trunk passed
+1 checkstyle 29 trunk passed
+1 mvnsite 40 trunk passed
+1 shadedclient 676 branch has no errors when building and testing our client artifacts.
+1 javadoc 26 trunk passed
0 spotbugs 89 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 86 trunk passed
_ Patch Compile Tests _
+1 mvninstall 39 the patch passed
+1 compile 34 the patch passed
+1 javac 34 the patch passed
-0 checkstyle 25 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 38 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 686 patch has no errors when building and testing our client artifacts.
+1 javadoc 24 the patch passed
+1 findbugs 94 the patch passed
_ Other Tests _
-1 unit 4790 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 25 The patch does not generate ASF License warnings.
7796
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/6/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux fc3be7dcc82e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 094d736
Default Java 1.8.0_222
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/6/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/6/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/6/testReport/
Max. process+thread count 871 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/6/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 78 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1215 trunk passed
+1 compile 42 trunk passed
+1 checkstyle 33 trunk passed
+1 mvnsite 45 trunk passed
+1 shadedclient 881 branch has no errors when building and testing our client artifacts.
+1 javadoc 29 trunk passed
0 spotbugs 112 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 110 trunk passed
_ Patch Compile Tests _
+1 mvninstall 55 the patch passed
+1 compile 46 the patch passed
+1 javac 46 the patch passed
-0 checkstyle 35 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 52 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 898 patch has no errors when building and testing our client artifacts.
+1 javadoc 38 the patch passed
+1 findbugs 118 the patch passed
_ Other Tests _
-1 unit 6116 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 28 The patch does not generate ASF License warnings.
9892
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/7/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux e48271684d61 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 69ddb36
Default Java 1.8.0_222
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/7/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/7/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/7/testReport/
Max. process+thread count 889 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/7/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 74 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1253 trunk passed
+1 compile 42 trunk passed
+1 checkstyle 32 trunk passed
+1 mvnsite 46 trunk passed
+1 shadedclient 840 branch has no errors when building and testing our client artifacts.
+1 javadoc 29 trunk passed
0 spotbugs 100 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 97 trunk passed
_ Patch Compile Tests _
+1 mvninstall 45 the patch passed
+1 compile 39 the patch passed
+1 javac 39 the patch passed
-0 checkstyle 29 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 44 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 870 patch has no errors when building and testing our client artifacts.
+1 javadoc 30 the patch passed
+1 findbugs 100 the patch passed
_ Other Tests _
-1 unit 5280 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 30 The patch does not generate ASF License warnings.
8937
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/8/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 3d2845e96db4 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 3329257
Default Java 1.8.0_222
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/8/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/8/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/8/testReport/
Max. process+thread count 811 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/8/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 43 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1125 trunk passed
+1 compile 47 trunk passed
+1 checkstyle 40 trunk passed
+1 mvnsite 55 trunk passed
+1 shadedclient 761 branch has no errors when building and testing our client artifacts.
+1 javadoc 34 trunk passed
0 spotbugs 101 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 99 trunk passed
_ Patch Compile Tests _
+1 mvninstall 47 the patch passed
+1 compile 41 the patch passed
+1 javac 41 the patch passed
-0 checkstyle 32 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4)
+1 mvnsite 46 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 787 patch has no errors when building and testing our client artifacts.
+1 javadoc 30 the patch passed
+1 findbugs 106 the patch passed
_ Other Tests _
-1 unit 4939 hadoop-yarn-server-resourcemanager in the patch failed.
+1 asflicense 29 The patch does not generate ASF License warnings.
8319
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.TestRMAdminService
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-416/9/artifact/out/Dockerfile
GITHUB PR #416
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 59f04492dc2a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 915cbc9
Default Java 1.8.0_222
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-416/9/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-416/9/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-416/9/testReport/
Max. process+thread count 843 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-416/9/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

    if (app == null) {
      // e.g. "INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_000096 completed with event FINISHED, but corresponding RMContainer doesn't exist."
      LOG.warn("app == null, giving up in identifyContainersToPreemptOnNode()");
      return null;

Should we just continue instead of returning null since we might still be able to find preemptable containers on this node?
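
The difference between the two options can be shown with a small model. This is a sketch under stated assumptions, not the actual `FSPreemptionThread` logic: `PreemptionSketch`, `Container`, the `preemptableByApp` map, and the memory-only accounting are all hypothetical. A missing map entry plays the role of `app == null`; the loop skips that container with `continue` and keeps scanning the node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; the names and types are not taken from Hadoop.
final class PreemptionSketch {

  static final class Container {
    final String id;
    final String appId;
    final int memoryMb;

    Container(String id, String appId, int memoryMb) {
      this.id = id;
      this.appId = appId;
      this.memoryMb = memoryMb;
    }
  }

  // Collects container ids until at least `neededMb` would be freed.
  // Returns null if the node cannot cover the request.
  static List<String> identify(List<Container> containersOnNode,
                               Map<String, Boolean> preemptableByApp,
                               int neededMb) {
    List<String> chosen = new ArrayList<>();
    int freedMb = 0;
    for (Container c : containersOnNode) {
      Boolean preemptable = preemptableByApp.get(c.appId);
      if (preemptable == null) {
        // The application is no longer tracked: skip this container,
        // but keep looking at the rest of the node.
        continue;
      }
      if (!preemptable) {
        continue;
      }
      chosen.add(c.id);
      freedMb += c.memoryMb;
      if (freedMb >= neededMb) {
        return chosen;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Map<String, Boolean> apps = new HashMap<>();
    apps.put("live", true);
    List<Container> node = Arrays.asList(
        new Container("c1", "gone", 1024),   // its app was already removed
        new Container("c2", "live", 2048));
    System.out.println(identify(node, apps, 2048));
  }
}
```

With `return null` in place of the first `continue`, the stale container `c1` would end the scan before `c2` was ever considered, which is the concern raised in the comment above.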

shanthoosh added a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
When ZooKeeper session failures occur in a stream processor, it leaves the group (the zkClient is closed) and joins the group again.

The last step in that shutdown sequence is zkClient.close(). In some scenarios, it throws the following exception:

    org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
    at org.I0Itec.zkclient.ZkClient.close(ZkClient.java:1278)
    at org.apache.samza.zk.ZkControllerImpl.stop(ZkControllerImpl.java:92)
    at org.apache.samza.zk.ZkJobCoordinator.stop(ZkJobCoordinator.java:141)

The existing implementation does not handle this exception, thereby killing the stream processor. The following code path triggers it:

`StreamProcessor.stop -> ZkJobCoordinator.stop() ->  zkController.stop() -> zkUtils.close`

This exception causes the integration test to fail occasionally and can cause the LocalApplicationRunner.waitForFinish call to block indefinitely (since the success of this callback event updates the latch state required for waitForFinish to end).
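
A minimal sketch of the kind of handling described above, using hypothetical stand-ins rather than the real Samza or zkclient types: `ShutdownSketch`, `Client`, and `InterruptedCloseException` are invented for illustration, and the referenced commit's actual change is not shown on this page. The idea is that a failed `close()` is logged instead of propagated, and the latch that a waiter depends on is released in every case.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical model; none of these names come from Samza or zkclient.
final class ShutdownSketch {

  // Stand-in for an unchecked exception thrown when close() is interrupted.
  static final class InterruptedCloseException extends RuntimeException {
    InterruptedCloseException(Throwable cause) {
      super(cause);
    }
  }

  // Stand-in for the client handle that is closed last during shutdown.
  interface Client {
    void close();
  }

  private final Client client;
  private final CountDownLatch stopped = new CountDownLatch(1);

  ShutdownSketch(Client client) {
    this.client = client;
  }

  // Closes the client. Returns true on a clean close, false if close()
  // was interrupted. The latch is released either way, so a thread
  // waiting for shutdown to finish is not left blocked.
  boolean stop() {
    try {
      client.close();
      return true;
    } catch (InterruptedCloseException e) {
      System.err.println("close() was interrupted; continuing shutdown: " + e);
      return false;
    } finally {
      stopped.countDown();
    }
  }

  long pendingCount() {
    return stopped.getCount();
  }

  public static void main(String[] args) {
    ShutdownSketch sketch = new ShutdownSketch(() -> {
      throw new InterruptedCloseException(new InterruptedException());
    });
    System.out.println("clean close: " + sketch.stop());
    System.out.println("latch count: " + sketch.pendingCount());
  }
}
```

Whether the interrupt flag should also be restored with `Thread.currentThread().interrupt()` depends on what the caller expects; the sketch leaves that out.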

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes apache#416 from shanthoosh/zk_utils_close