HADOOP-18329 - Support for IBM Semeru JVM v>11.0.15.0 Vendor Name Changes #4537

Merged
merged 15 commits into from
Dec 10, 2022

Conversation

JackBuggins
Contributor

Description of PR

The PlatformName class checks the java.vendor property of the running JVM, looking specifically for "IBM" within the name. Whilst this check worked for IBM's Java Technology Edition, it fails on Semeru since 11.0.15.0 due to the following change:

java.vendor system property
In this release, the java.vendor system property has been changed from "International Business Machines Corporation" to "IBM Corporation".

Modules such as the below are not provided in these runtimes.
com.ibm.security.auth.module.JAASLoginModule

This change extends the vendor check by using reflection to confirm that a class common to IBM JT runtimes exists, since IBM-vendored JVMs may not actually require special logic to use custom security modules. The same 3.3.3 versions were working correctly until the vendor name change was picked up during routine upgrades by internal CI.
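The decision described above can be sketched roughly as follows. This is an illustration only, not the code in the patch: the class name `IbmRuntimeCheck` and the method `isIbmJava` are hypothetical, and the probe in `main` uses the module name quoted in this description.

```java
// Illustrative sketch; IbmRuntimeCheck and isIbmJava are hypothetical names,
// not part of Hadoop's PlatformName class.
final class IbmRuntimeCheck {

  private IbmRuntimeCheck() {
  }

  /**
   * Treat the runtime as "IBM Java" only when the vendor string mentions IBM
   * and an IBM-specific security class is actually available.
   */
  static boolean isIbmJava(String vendor, boolean ibmSecurityClassPresent) {
    return vendor != null && vendor.contains("IBM") && ibmSecurityClassPresent;
  }

  public static void main(String[] args) {
    String vendor = System.getProperty("java.vendor");
    boolean present;
    try {
      // Load without initializing; we only care whether the class exists.
      Class.forName("com.ibm.security.auth.module.JAASLoginModule",
          false, IbmRuntimeCheck.class.getClassLoader());
      present = true;
    } catch (ClassNotFoundException e) {
      present = false;
    }
    System.out.println("java.vendor = " + vendor);
    System.out.println("IBM security module present = " + present);
    System.out.println("treat as IBM Java = " + isIbmJava(vendor, present));
  }
}
```

With this shape, a Semeru runtime reporting "IBM Corporation" but lacking the IBM security modules would not be treated as IBM Java.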

How was this patch tested?

CI + unit tests. Some seemingly unrelated failures were observed relating to java.lang.NoSuchMethodError: java.nio.ByteBuffer.limit(I)Ljava/nio/ByteBuffer;

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

Contributor

@steveloughran steveloughran left a comment

i think just staying with IBM_JAVA simplifies this change. we aren't going to remove the deprecated field in case it is used externally.

@JackBuggins
Contributor Author

Thank you for taking the time to review this change so far, I have implemented your suggestions.

Contributor

@steveloughran steveloughran left a comment

+1 from me; i trust you to have done the testing.

i made a comment about the spelling in a comment, please could you add that just to keep the US-spelling developers happy. thx.

once that is in, I will merge here and to branch-3.3, which will be releasing an update this year. testing there would be wonderful

@steveloughran
Contributor

aah, one of the runs now finds a loop in references. now I understand the duplicate code.

Can you fix that by restoring the code, removing the pom changes, and adding a comment to the source saying "duplicated to avoid cycles in the build"?

@JackBuggins JackBuggins marked this pull request as draft July 12, 2022 23:03
@JackBuggins
Contributor Author

@steveloughran, I will conduct some more testing from a spark perspective and report back once I'm fully confident, since that is where I initially observed the failures.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 48s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 37s Maven dependency ordering for branch
+1 💚 mvninstall 29m 54s trunk passed
+1 💚 compile 29m 16s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 25m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 40s trunk passed
+1 💚 mvnsite 5m 7s trunk passed
+1 💚 javadoc 4m 20s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 3m 44s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 47s trunk passed
+1 💚 shadedclient 24m 38s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 55s Maven dependency ordering for patch
+1 💚 mvninstall 2m 17s the patch passed
+1 💚 compile 28m 56s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 28m 56s the patch passed
+1 💚 compile 25m 44s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 25m 44s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 32s /results-checkstyle-hadoop-common-project.txt hadoop-common-project: The patch generated 1 new + 311 unchanged - 0 fixed = 312 total (was 311)
+1 💚 mvnsite 5m 2s the patch passed
+1 💚 javadoc 4m 16s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 3m 46s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 7m 26s the patch passed
+1 💚 shadedclient 24m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 0s hadoop-minikdc in the patch passed.
+1 💚 unit 3m 43s hadoop-auth in the patch passed.
+1 💚 unit 18m 58s hadoop-common in the patch passed.
+1 💚 unit 1m 41s hadoop-registry in the patch passed.
+1 💚 asflicense 1m 16s The patch does not generate ASF License warnings.
281m 46s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/4/artifact/out/Dockerfile
GITHUB PR #4537
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 3e970442dc6a 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 921f922
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/4/testReport/
Max. process+thread count 2845 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-minikdc hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common hadoop-common-project/hadoop-registry U: hadoop-common-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 52s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 25s Maven dependency ordering for branch
+1 💚 mvninstall 29m 43s trunk passed
+1 💚 compile 30m 34s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 25m 19s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 45s trunk passed
+1 💚 mvnsite 5m 4s trunk passed
+1 💚 javadoc 4m 22s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 3m 43s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 36s trunk passed
+1 💚 shadedclient 24m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 54s Maven dependency ordering for patch
+1 💚 mvninstall 2m 43s the patch passed
+1 💚 compile 29m 46s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 29m 46s the patch passed
+1 💚 compile 25m 7s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 25m 7s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 39s /results-checkstyle-hadoop-common-project.txt hadoop-common-project: The patch generated 1 new + 311 unchanged - 0 fixed = 312 total (was 311)
+1 💚 mvnsite 5m 0s the patch passed
+1 💚 javadoc 4m 15s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 3m 48s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 7m 13s the patch passed
+1 💚 shadedclient 24m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 2s hadoop-minikdc in the patch passed.
+1 💚 unit 3m 43s hadoop-auth in the patch passed.
+1 💚 unit 19m 4s hadoop-common in the patch passed.
+1 💚 unit 1m 39s hadoop-registry in the patch passed.
+1 💚 asflicense 1m 16s The patch does not generate ASF License warnings.
282m 55s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/5/artifact/out/Dockerfile
GITHUB PR #4537
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 99299b4a60c0 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c2ff761
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/5/testReport/
Max. process+thread count 2600 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-minikdc hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common hadoop-common-project/hadoop-registry U: hadoop-common-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4537/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@JackBuggins
Contributor Author

relates to eclipse-openj9/openj9#14950

@apache apache deleted a comment from hadoop-yetus Jul 14, 2022
@apache apache deleted a comment from hadoop-yetus Jul 14, 2022
@apache apache deleted a comment from hadoop-yetus Jul 14, 2022
@steveloughran
Contributor

one checkstyle
./hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/util/PlatformName.java:55: hasClass("com.ibm.security.auth.module.JAASLoginModule");: 'hasClass' has incorrect indentation level 4, expected level should be 6. [Indentation]

    try {
      Thread.currentThread().getContextClassLoader().loadClass(className);
      return true;
    } catch(ClassNotFoundException ignored) {
Contributor

can any other exception get raised here? if so, best to log and downgrade to false too

Contributor Author

Thanks @steveloughran - Would you suggest we are catching the generic Exception instead?

Contributor

yes
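A minimal sketch of what the agreed change could look like: any exception other than the expected ClassNotFoundException is logged and downgraded to false. The class name `ClassProbe` is hypothetical, and java.util.logging is used here only to keep the sketch self-contained; it is an assumption, not the logging framework the patch uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch; ClassProbe is a hypothetical name.
final class ClassProbe {

  private static final Logger LOG = Logger.getLogger(ClassProbe.class.getName());

  private ClassProbe() {
  }

  /** Returns true only if the class can be located; never throws an Exception. */
  static boolean hasClass(String className) {
    try {
      ClassLoader loader = Thread.currentThread().getContextClassLoader();
      if (loader == null) {
        // Fall back to this class's loader when no context loader is set.
        loader = ClassProbe.class.getClassLoader();
      }
      // Load without initializing; we only need to know the class exists.
      Class.forName(className, false, loader);
      return true;
    } catch (ClassNotFoundException e) {
      // Expected case on runtimes that do not ship the class.
      return false;
    } catch (Exception e) {
      // Anything else (e.g. a security restriction): log and report "absent".
      LOG.log(Level.FINE, "Could not probe for class " + className, e);
      return false;
    }
  }
}
```

Note that linkage problems surface as Errors rather than Exceptions, so they are outside what this sketch catches.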

kwart added a commit to hazelcast/hazelcast that referenced this pull request Oct 25, 2022
…22588)

Fixes #20754

Hadoop in the current version wrongly checks which login module should
be used on some IBM Java versions. We should skip problematic
configurations until Hadoop has the fix released.

See also apache/hadoop#4537
@microeastcowboy

microeastcowboy commented Dec 6, 2022

Has there been any movement on this pr?

@JackBuggins
Contributor Author

Has there been any movement on this pr?

I'll try and carve out a few hours this week 👍 some slight enhancement I think could be made in the class loader method I originally came up with (I want to test my concerns).

@JackBuggins
Contributor Author

This should be a bit more robust to extension as well as handle the concerns I had about the class loader from before. Is there any consensus/ruling around adding a test against the IBM JREs? I appreciate it would take a bit of time on CI and this is a once in a blue moon activity, but it could be a single suite of integration tests against auth that execute to verify the result against latest semeru is not IBM, and vice versa.

@steveloughran
Contributor

@JackBuggins merged to trunk, but it has merge conflicts into 3.3 with SSLFactory.java. can you do a quick review and PR there? I'm busy testing the abfs blockers

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/SSLFactory.java

      "ssl.server.exclude.cipher.list";

<<<<<<< HEAD
  public static final String SSLCERTIFICATE = IBM_JAVA?"ibmX509":"SunX509";
=======
  public static final String KEY_MANAGER_SSLCERTIFICATE =
      IBM_JAVA ? "ibmX509" :
          KeyManagerFactory.getDefaultAlgorithm();

  public static final String TRUST_MANAGER_SSLCERTIFICATE =
      IBM_JAVA ? "ibmX509" :
          TrustManagerFactory.getDefaultAlgorithm();
>>>>>>> a46b20d25f1 (HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537))
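The incoming side of the conflict replaces the hard-coded "SunX509" with the JVM's default algorithms. A rough sketch of that selection, assuming a hypothetical helper class (not SSLFactory itself) that takes the IBM flag as a parameter instead of reading the static IBM_JAVA field:

```java
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.TrustManagerFactory;

// Illustrative sketch; SslAlgorithmDefaults is a hypothetical name.
final class SslAlgorithmDefaults {

  private SslAlgorithmDefaults() {
  }

  /** Algorithm name to pass to KeyManagerFactory.getInstance(). */
  static String keyManagerAlgorithm(boolean ibmJava) {
    return ibmJava ? "ibmX509" : KeyManagerFactory.getDefaultAlgorithm();
  }

  /** Algorithm name to pass to TrustManagerFactory.getInstance(). */
  static String trustManagerAlgorithm(boolean ibmJava) {
    return ibmJava ? "ibmX509" : TrustManagerFactory.getDefaultAlgorithm();
  }

  public static void main(String[] args) {
    // The printed values depend on the JVM and its security configuration.
    System.out.println("key manager:   " + keyManagerAlgorithm(false));
    System.out.println("trust manager: " + trustManagerAlgorithm(false));
  }
}
```

Asking the JVM for its defaults avoids naming an algorithm that a non-IBM runtime might not provide.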

@JackBuggins
Contributor Author

@steveloughran I've popped up a PR against branch 3.3; should I do the same for 3.3.5?

@steveloughran
Contributor

get it into 3.3 and i will pull to 3.3.5, they are almost identical. if there are merge problems again, then we can worry about it

steveloughran pushed a commit that referenced this pull request Dec 12, 2022
…ges (#4537) (#5208)

The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
asfgit pushed a commit that referenced this pull request Dec 12, 2022
…ges (#4537) (#5208)

The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
slfan1989 pushed a commit to slfan1989/hadoop that referenced this pull request Dec 20, 2022
…ges (apache#4537)


The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
mdumandag pushed a commit to mdumandag/hazelcast that referenced this pull request Dec 23, 2022
…azelcast#22588)

Fixes hazelcast#20754

Hadoop in the current version wrongly checks which login module should
be used on some IBM Java versions. We should skip problematic
configurations until Hadoop has the fix released.

See also apache/hadoop#4537
@ivrisivris

Hi,

thank you for solving this problem. Is there any plan when 3.3.5 will be released?

Thanks for the reply

Best regards

Kamil

@JackBuggins
Contributor Author

@ivrisivris Sorry, I don't know the answer to that one, but there is a workaround you might consider in the meantime which makes use of an agent to manipulate the vendor name - you can see an example here
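The linked example is not reproduced here. As a hypothetical sketch of the general idea, a Java agent's premain method runs before the application's main method, so it can reset java.vendor before Hadoop reads it. The class name `VendorOverrideAgent` and the use of the agent argument as the replacement value are assumptions; the default is the previous vendor string quoted in the PR description, which does not contain "IBM".

```java
import java.lang.instrument.Instrumentation;

// Illustrative sketch; VendorOverrideAgent is a hypothetical name.
// For real use, declare the class public in its own file and package it in a
// jar whose manifest names it in the Premain-Class attribute.
final class VendorOverrideAgent {

  /** The vendor string Semeru reported before the 11.0.15.0 change. */
  static final String DEFAULT_VENDOR = "International Business Machines Corporation";

  private VendorOverrideAgent() {
  }

  /** Invoked by the JVM before main() when started with -javaagent:<jar>[=vendor]. */
  public static void premain(String agentArgs, Instrumentation inst) {
    String vendor = (agentArgs == null || agentArgs.isEmpty())
        ? DEFAULT_VENDOR
        : agentArgs;
    System.setProperty("java.vendor", vendor);
  }
}
```

This only changes what later reads of the property see; it is a stopgap until a release containing the fix can be used.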

@ivrisivris

@JackBuggins Thank you for responding so quickly.

@steveloughran
Contributor

The 3.3.5 RC0 is up for testing. Grab it from the Hadoop web site and make sure it works for you. Do not wait until the final release, as if that is broken it will still take a while for a patched release to ship.

@ivrisivris

Hi,
I tried hadoop-3.3.5 RC0 but still got errors. I thought it was a #4537 issue and it was fixed in RC0.

Errors
Caused by: javax.security.auth.login.LoginException: No LoginModule found for com.ibm.security.auth.module.LinuxLoginModule
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:731)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:672)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:670)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:783)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:670)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:581)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:812)

Environment
openjdk version "11.0.17" 2022-10-18
IBM Semeru Runtime Open Edition 11.0.17.0 (build 11.0.17+8)
Eclipse OpenJ9 VM 11.0.17.0 (build openj9-0.35.0, JRE 11 Linux amd64-64-Bit Compressed References 20221031_559 (JIT enabled, AOT enabled)
OpenJ9 - e04a7f6c1
OMR - 85a21674f
JCL - a94c231303 based on jdk-11.0.17+8)

Thank you for your help

Kamil

@JackBuggins
Contributor Author

I'll check this out against 11.0.17 today

@steveloughran
Contributor

jack, be good to know. we are going to do a new RC next week and this is the kind of stabilisation issue we can address

@JackBuggins
Contributor Author

I've just spun up a version of rc-0 3.3.5 against semeru 11.0.17 and executed a few of the sample jobs and checked out the dev/null out, e.g.

yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar teragen 100 /test 
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar terasort /test /testout

I'm not seeing anything suspicious in the stderr - @ivrisivris can you give me some more detail on the specific cluster config / actions / workloads that triggered this so I can reproduce and diagnose a bit better? Are you using this with Spark or anything else? Are these being added to the classpath in some other way?

I currently don't believe anything packaged in IBM Semeru 11.0.17 or Hadoop 3.3.5 RC0 contains something that would trigger the added check (I've even run a jar with just that class to check that, and this looks good). Whilst I wait on a response to the above, I'll check that there isn't anywhere else that is not using the IBM_JAVA check to determine the use of those modules and report back shortly.

@JackBuggins
Contributor Author

From what I can see, the below could still exhibit the same behaviour @steveloughran - I'll see if I can figure out how to get into these paths to validate it so I understand the scenario @ivrisivris is likely hitting.

@JackBuggins
Contributor Author

JackBuggins commented Jan 18, 2023

I hit a similar error trivially in spark 3.4.0/dev

When actually using the RC0 against spark 3.4.0 I can't reproduce this either.

From what I can see the only way you can get into the path of the stack trace above is by including one of the com.ibm.security classes at runtime, (two cases above which might need some work should land us elsewhere) and apart from testing against some more base OS, or understanding how this is being used I'm stuck here unfortunately.

This may present a good case to allow a config opt to the environment/config files that overrides the default logic being used here as well as debug logging.
@steveloughran if you're happy I can go ahead and prep that. Rationale is that Hadoop is used in so many ways it's possible these classes could get here in other ways, or otherwise exist in another codebase providing the Hadoop packages, although my preference is get the details to reproduce and add tests for this case.

@ivrisivris if you can give any more details like specific base OS, perhaps provide some basic app with the same dependencies that I can inspect I'm happy to investigate further and try to make sure it can be resolved for you. Thank you!

@steveloughran
Contributor

maybe start with logging that java.vendor sysprop at debug.

reopened the JIRA
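A minimal sketch of the suggested debug logging of the java.vendor system property. The class name `VendorDebugLog` is hypothetical, and java.util.logging's FINE level stands in for "debug" only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch; VendorDebugLog is a hypothetical name.
final class VendorDebugLog {

  private static final Logger LOG = Logger.getLogger(VendorDebugLog.class.getName());

  private VendorDebugLog() {
  }

  /** Builds a one-line summary of the properties relevant to vendor detection. */
  static String describeRuntime() {
    return "java.vendor=" + System.getProperty("java.vendor")
        + ", java.version=" + System.getProperty("java.version");
  }

  /** Emits the summary at debug (FINE) level. */
  static void logRuntime() {
    LOG.log(Level.FINE, describeRuntime());
  }

  public static void main(String[] args) {
    logRuntime();
    // Printed as well, since FINE is not shown under the default logging configuration.
    System.out.println(describeRuntime());
  }
}
```

Having this line in a reporter's logs would show directly which vendor string the detection logic saw.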

@yuyanlei-8130
Contributor

Hi JackBuggins, I merged your #5208 code into Hadoop 3.3.4, and starting the DataNode fails with the following error:
STARTUP_MSG: java = 11.0.15
************************************************************/
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
ERROR org.apache.hadoop.security.UserGroupInformation: Unable to find JAAS classes:com.ibm.security.auth.UsernamePrincipal
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: work/***@HADOOP.COM from keytab /etc/security/keytab/.hdfs.keytab javax.security.auth.login.LoginException: No LoginModule found for com.ibm.security.auth.module.Krb5LoginModule
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1986)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1361)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1122)
    at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:315)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2732)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2778)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2922)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2946)
Caused by: javax.security.auth.login.LoginException: No LoginModule found for com.ibm.security.auth.module.Krb5LoginModule
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:731)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:672)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:670)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:783)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:670)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:581)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    ... 7 more
java -version:
openjdk version "11.0.15" 2022-04-19
IBM Semeru Runtime Open Edition 11.0.15.0 (build 11.0.15+10)
Eclipse OpenJ9 VM 11.0.15.0 (build openj9-0.32.0, JRE 11 Linux amd64-64-Bit Compressed References 20220422_425 (JIT enabled, AOT enabled)
OpenJ9 - 9a84ec34e
OMR - ab24b6666
JCL - b7b5b42ea6 based on jdk-11.0.15+10)
Do I need to use a later version of the JDK, or is this an unsolved bug?

@JackBuggins
Contributor Author

@Tre2878 did you build the branch it was previously merged to from source?

It's merged back retrospectively, but once a release is published there are no further updates to that stable stream in terms of distributions. I.e., if you aren't switching versions and aren't building from source, you aren't getting any backported fixes.

Please use the 3.3.5 RC or build the branch yourself to pick up the changes 👍🏼

@JackBuggins
Contributor Author

If you are finding some more areas where IBM classes are being called relating to auth, if you can demonstrate a configuration with specific details as well as how you reproduce it I will take a look. I need the JRE details, OS details and full hadoop config and command executed to hit this. If this is specifically relating to Kerberos, please add details for that too but obviously I won't need any secrets.

I've spent some good hours trying to replicate it so far so having this detail will be awesome for me to understand where it's failing for some. May be best to pop this on the jira only. Thanks!

@JackBuggins
Contributor Author

@Tre2878 - any of the files I listed here look sus to you #4537 (comment)?

@steveloughran
Contributor

ok, i think we can hopefully say this is a cannot reproduce state, especially if @Tre2878 is using their own build.

  1. can you do the kdiag command to get the diags, after sanitising anything you don't want to share, attach to the jira
  2. then grab my cloudstore.jar and run its storediag command with an hdfs url to see how interaction with hdfs goes: https://github.com/steveloughran/cloudstore

@JackBuggins
Contributor Author

JackBuggins commented Feb 8, 2023

Yeah, I just need a way I can reliably hit the errors to understand a bit better what isn't covered so far to proceed further. This indeed looks similar to the previous issue reported, so it would seem to be a common use case. Thanks for sharing the process.

@yuyanlei-8130
Contributor

yuyanlei-8130 commented Feb 9, 2023

@Tre2878 did you build the branch it was previously merged to from source?

It's merged back retrospectively, but once a release is published there are no further updates to that stable stream in terms of distributions. I.e., if you aren't switching versions and aren't building from source, you aren't getting any backported fixes.

Please use the 3.3.5 RC or build the branch yourself to pick up the changes 👍🏼

I merged your #5208 code into the branch https://github.com/apache/hadoop/tree/branch-3.3.4, then compiled and deployed it, and got an error on startup. My understanding is that there is no patch for 3.3.4 yet, right?

@JackBuggins
Contributor Author

@Tre2878 - That's correct, it's just branch-3.3 and trunk, which is in turn in the 3.3.5 stream, so right now upgrading/testing out the 3.3.5 stream is the way to get these changes. Looks like the 3.3.4 distro was cut around August 2022 (based on git releases), and the changes went into branch-3.3 during December 2022.

I would need to defer to the project owners on whether porting to branch 3.3.4, for others to pick up when building from source, would be an option here. You could try cherry-picking the commit against 3.3 to branch 3.3.4 and then build your own copy to test, or otherwise consider applying some type of agent workaround to modify the vendor name until you're in a position to do this.

@steveloughran
Contributor

My understanding is that there is no patch for 3.3.4 yet, right?

3.3.5 is the successor to 3.3.4 from the ASF, therefore there is a patch for 3.3.4 and it is called "upgrade to 3.3.5". you can take our 3.3.4 release, fork it and make whatever changes you want, but you will have a private release at that point.

lbulej added a commit to renaissance-benchmarks/renaissance that referenced this pull request Oct 20, 2023
This is to include a fix for IBM Semeru builds of OpenJ9 based JVMs
which was included in version 3.3.5.

Details at apache/hadoop#4537
6 participants