Update to Hadoop 3.3.5 #44

Merged: 4 commits, Jun 15, 2023


oneonestar
Member

@oneonestar oneonestar commented Aug 31, 2022

Supersedes #37.

Changes

src/main/java/io/trino/hadoop/SocksSocketFactory.java

src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java

  • Apply HADOOP-17288: Use shaded guava from org.apache.hadoop.thirdparty.

src/main/java/org/apache/hadoop/fs/FileSystem.java

  • HADOOP-17313: FileSystem.Cache became a final class, and a semaphore has been added to getInternal().
  • It looks safe to remove the final keyword.

src/main/java/org/apache/hadoop/fs/ForwardingFileSystemCache.java

src/main/java/org/apache/hadoop/util/LineReader.java

io/trino/hadoop/TestHadoopNative.java

  • See HADOOP-17125.

Removed files

src/main/java/org/apache/hadoop/security/authentication/util/KerberosUtil.java

  • The existing version is 3.2.0 + HADOOP-10848 + HADOOP-17432
  • Both patches are included in 3.3.1

src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java

  • HADOOP-16479 is included in 3.3.1

src/main/java/org/wildfly/openssl/OpenSSLProvider.java

  • Changes in response to HADOOP-16371

pom.xml

Update slf4j to 1.7.36 in order to align with hadoop
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-project/pom.xml#L81

Add jsr305 for javax.annotation for JDK > 9

Add a dependency to org.lz4:lz4-java

HADOOP-17292: The lz4-java is declared in provided scope. Applications that wish to use lz4 codec must declare dependency on lz4-java explicitly.

The version is 1.7.1 in order to align with hadoop-project.
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-project/pom.xml#L146

Add a provided dependency to org.xerial.snappy:snappy-java.
This is caused by HADOOP-17125.
Since snappy-java is already a dependency for Trino, scope=provided is enough for the test to pass.
The version is 1.1.8.2 in order to align with hadoop-project.
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-project/pom.xml#L145

Add relocations for the new dependencies that come from Hadoop.
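
As a rough sketch, the two dependency additions described above might look like the following in pom.xml. The coordinates, versions, and scopes are taken from the description; the exact placement and any surrounding elements in the real pom.xml are assumptions.

```xml
<!-- Sketch only: lz4-java is declared explicitly because Hadoop marks it as provided (HADOOP-17292) -->
<dependency>
    <groupId>org.lz4</groupId>
    <artifactId>lz4-java</artifactId>
    <version>1.7.1</version>
</dependency>

<!-- Sketch only: snappy-java is expected to be supplied by Trino, hence the provided scope -->
<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.1.8.2</version>
    <scope>provided</scope>
</dependency>
```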

TODO

  • Update the binaries inside src/main/resources/nativelib
  • Test with different environments

@cla-bot cla-bot bot added the cla-signed label Aug 31, 2022
@oneonestar oneonestar force-pushed the hadoop_3.3.4 branch 6 times, most recently from 676fd46 to cd71fb8, August 31, 2022 08:48
@oneonestar
Member Author

@zielmicha I'd appreciate it if you could help test this PR.
I just bumped the version from 3.3.1 to 3.3.4. I'm going to test the patch with Hadoop 2.7 and 3.3.
AFAIK Hadoop doesn't provide any explicit promise on compatibility across major versions.
Any additional tests would help a lot.

Also, it looks like all the binaries need to be built by a project maintainer (#27 (comment)). We'll need some help from @electrum.
I would also like to know the current status of CI test coverage on different versions of Hadoop.

@zielmicha
Member

I made some small fixes to make main Trino repo compile here:
https://github.com/zielmicha/trino-hadoop-apache/commits/hadoop_3.3.4

I tried to test it using a test Trino/Hive cluster and I'm getting weird errors: NoClassDefFoundError for UserGroupInformation$HadoopConfiguration, even though this class exists in the jar and other classes (e.g. UserGroupInformation) load fine. Any ideas what might be going wrong?

@oneonestar
Member Author

javax.annotation

zielmicha@4ff5ab6#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R597-R600
javax.annotation only contains annotations. I'm not sure whether it is necessary to relocate it.

okhttp

https://mvnrepository.com/artifact/com.squareup.okhttp/okhttp
com.squareup.okhttp has been renamed to com.squareup.okhttp3.
I made an update in bf4a04d.

kotlin-stdlib

[INFO] +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.3.4:runtime
[INFO] |  \- org.apache.hadoop:hadoop-hdfs-client:jar:3.3.4:runtime
[INFO] |     +- com.squareup.okhttp3:okhttp:jar:4.9.3:runtime
[INFO] |     |  \- com.squareup.okio:okio:jar:2.8.0:runtime
[INFO] |     +- org.jetbrains.kotlin:kotlin-stdlib:jar:1.4.10:runtime
[INFO] |     \- org.jetbrains.kotlin:kotlin-stdlib-common:jar:1.4.10:runtime

Somehow Kotlin has been included in hadoop-hdfs-client. It looks like an unintentional change to me: apache/hadoop#4229 (comment)
Other projects are also being affected: https://github.com/apache/hbase/pull/4687/files
Hopefully they will fix the dependency soon: https://issues.apache.org/jira/browse/HDFS-16714
I shaded the Kotlin-related libraries in bf4a04d.

KerberosUtil

KerberosUtil has been removed since the patches are already included in 3.3.1.
I forgot to remove the exclusion in pom.xml, which caused a runtime error.
This has been fixed in 0883143.
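
To make the failure mode concrete: a maven-shade-plugin filter like the sketch below drops the upstream KerberosUtil classes from hadoop-auth so that a patched copy in this repository can take their place. Once the patched copy is deleted, leaving the filter in place means no KerberosUtil ends up in the shaded jar at all, which would surface as a class-loading error at runtime. The artifact and exclude patterns mirror the ones quoted in the review thread on this PR; the enclosing layout is an assumption about how the filter is declared.

```xml
<!-- Sketch of the stale filter that had to be removed along with the patched KerberosUtil.java -->
<filter>
    <artifact>org.apache.hadoop:hadoop-auth</artifact>
    <excludes>
        <exclude>org/apache/hadoop/security/authentication/util/KerberosUtil.class</exclude>
        <exclude>org/apache/hadoop/security/authentication/util/KerberosUtil$*.class</exclude>
    </excludes>
</filter>
```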


@zielmicha I have successfully run queries on a Hadoop 3.3.x cluster.
I can't reproduce your error. Could you try the new changes and see if it works for you?

@zielmicha
Member

javax.annotation:
Did you build the main Trino repo with the new dependency, or just replace the jars? I think not shading works fine at runtime, but Maven complains about duplicate classes.

@oneonestar
Member Author

Ok. I added the relocation rule for javax.annotation.

I removed LdapGroupsMapping.java because #32 should already be fixed by
apache/hadoop@f257497

@zielmicha
Member

zielmicha commented Sep 2, 2022

After the last round of changes, I no longer see any errors at runtime. My guess is that this problem was related to KerberosUtil (UserGroupInformation$HadoopConfiguration uses it); the error message was just very unclear.

I'll do a few more manual tests (note that my configuration does not use HDFS, so I'm not really testing "against" Hadoop, just confirming that the libraries work okay internally). I'll also run the regression test suite.

The okhttp3 rule (<pattern>okhttp3</pattern>) I added is also needed to build the main repo.

@zielmicha
Member

I've made another small change to pom.xml: zielmicha@ca49d40

The regression tests in the main repo now pass.

@oneonestar
Member Author

$ jar tf target/hadoop-apache-3.3.4-1-SNAPSHOT.jar | grep okhttp3
okhttp3/
okhttp3/Request$Builder.class
okhttp3/Dispatcher.class
okhttp3/Headers$Companion.class
okhttp3/MultipartBody$Part.class
okhttp3/Protocol.class
...

You are right. <pattern>okhttp3</pattern> is the correct way.
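
The distinction here is between Maven coordinates and Java package names: the artifact's groupId is com.squareup.okhttp3, but, as the jar listing shows, its classes live under the okhttp3 package, and shade relocations match on package and class names. A relocation along these lines would therefore apply; the `example.shaded` prefix is a hypothetical placeholder rather than the prefix this project uses.

```xml
<relocation>
    <!-- Matches the Java package of the classes, not the Maven groupId -->
    <pattern>okhttp3</pattern>
    <!-- Hypothetical target prefix, for illustration only -->
    <shadedPattern>example.shaded.okhttp3</shadedPattern>
</relocation>
```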

@zielmicha
Member

I tested the current version on a small test cluster with the Hive connector (using NFS for storage, not HDFS). I'm not sure what other testing we should do; maybe it makes sense to request a review from the maintainers now?

@zielmicha
Member

@oneonestar are you okay with requesting review from maintainers for your PR now?

@oneonestar
Member Author

Sure. Let's go for it.

@zielmicha
Member

cc @electrum

@KarlManong

KarlManong commented Oct 18, 2022

@oneonestar @zielmicha With version 400, io.trino.rcfile.TestRcFileReader failed with "java.lang.NoClassDefFoundError: org/xerial/snappy/Snappy".
But I think it's ok.
trino-record-decoder and trino-parquet include <dependency> <groupId>org.xerial.snappy</groupId> <artifactId>snappy-java</artifactId> <scope>runtime</scope> </dependency>.

trino-rcfile should do the same.

@oneonestar
Member Author

oneonestar commented Oct 18, 2022

<artifact>org.apache.hadoop:hadoop-auth</artifact>
<excludes>
<exclude>org/apache/hadoop/security/authentication/util/KerberosUtil.class</exclude>
<exclude>org/apache/hadoop/security/authentication/util/KerberosUtil$*.class</exclude>
Member


I think the subsequent filter for org.apache.hadoop:hadoop-azure needs an update. I got a class-not-found error when running TestDeltaLakeAdlsConnectorSmokeTest locally.

Member Author


I have updated the filter for org.apache.hadoop:hadoop-azure.
However, I can't run TestDeltaLakeAdlsConnectorSmokeTest because I don't have access to azure-abfs.

https://github.com/trinodb/trino/blob/master/plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeAdlsConnectorSmokeTest.java#L64-L66

@mladjan-gadzic

@oneonestar hello! Is there any movement on this PR? We see the error "HadoopIllegalArgumentException: Invalid buffer, not of length X" when querying Hive erasure coding tables using Trino with Apache Hadoop 3.2.0. This error is fixed in Apache Hadoop 3.3.0 (EC: Decoding is failing when block group last incomplete cell fall in to AlignedStripe); however, Trino is still using an old version of Apache Hadoop. Is there any way we can push this forward?

@oneonestar
Member Author

@mladjan-gadzic We faced the same problem before. We applied this patch to our internal Trino cluster and it works fine with EC.

There are a few problems we have to solve before this PR can be merged:

  • Help from a committer who can compile the necessary nativelibs
  • Find a way to run all Trino integration tests with the updated trino-hadoop-apache to ensure this won't break things accidentally
  • HDFS-16453 upgraded okhttp from 2.7.5 to 4.9.3. This pulls in the whole kotlin-stdlib as a transitive dependency. We have to accept this or find a way to work around it. (Currently I shade the Kotlin libraries.)

@oneonestar oneonestar requested a review from ebyhr February 8, 2023 07:30
@mladjan-gadzic

mladjan-gadzic commented Feb 8, 2023

@oneonestar thank you for a quick answer!

  • Help from a committer who can compile the necessary nativelibs

Unfortunately, I am using macOS with an M1 chip, which introduces a whole new level of issues when tinkering with architecture-dependent code. Because of this I am unable to compile the native libs. What I can do is check whether someone from my organization can do that.

  • Find a way to run all Trino integration tests with the updated trino-hadoop-apache to ensure this won't break things accidentally

I will try to do this and get back to you.

EDIT: how are Trino integration tests usually run?

  • HDFS-16453 upgraded okhttp from 2.7.5 to 4.9.3. This pulls in the whole kotlin-stdlib as a transitive dependency. We have to accept this or find a way to work around it. (Currently I shade the Kotlin libraries.)

What are the pros and cons of accepting kotlin-stdlib as a transitive dependency? Are there any downsides to shading beyond the usual costs of shading itself?

@mladjan-gadzic

Hi @oneonestar! Just a reminder to check my comment. I am eager to help push this PR forward.

@ebyhr ebyhr requested a review from electrum February 22, 2023 07:22
@ebyhr ebyhr removed their request for review February 22, 2023 07:23
@dave-gantenbein

@oneonestar @electrum - any chance we could get some eyes on this? Thanks in advance!

@KarlManong

@c-rindi

c-rindi commented May 4, 2023

Hello @electrum, @oneonestar, @ebyhr - checking in - any progress?

@electrum electrum mentioned this pull request Jun 13, 2023
@electrum
Member

@oneonestar apologies for the very long delay on this. I updated your branch to use Hadoop 3.3.5 and copied the Hadoop native libraries from the official Hadoop releases (they have aarch64 now). I didn't bother to rebuild the other versions since there are only minimal changes which shouldn't affect testing on macOS. The tests passed in trinodb/trino#17869

@electrum electrum changed the title from "Update to Hadoop 3.3.4" to "Update to Hadoop 3.3.5" Jun 14, 2023
@electrum electrum merged commit 05242ca into trinodb:master Jun 15, 2023

8 participants