HIVE-21737: Upgrade Avro to version 1.10.1 #1635


Closed
wants to merge 2 commits

Conversation

@iemejia (Member) commented Oct 30, 2020

Co-Authored-By: Ismaël Mejía iemejia@gmail.com

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

@iemejia (Member, Author) commented Oct 30, 2020

R: @sunchao

@iemejia (Member, Author) commented Oct 30, 2020

Just out of curiosity, are the patches in the ticket not needed anymore?

@sunchao (Member) commented Oct 30, 2020

Thanks @iemejia. I also opened #1621 for the issue.

> Just out of curiosity, are the patches in the ticket not needed anymore?

Yes, Hive has switched to using GitHub PRs.

@iemejia (Member, Author) commented Nov 3, 2020

@sunchao It seems the Jenkins tests failed, but I cannot get into Jenkins to see what's going on. Any hints?

@sunchao (Member) commented Nov 3, 2020

@iemejia can you open this link? The errors are related to Avro, but it should be good now since #1621 has just been merged. Can you just upgrade the Avro version in this PR and try again? Thanks.

@iemejia (Member, Author) commented Nov 4, 2020

@sunchao Great to see some of the changes merged. I cannot open the link; I get a 404 error now (before, it was a socket error that aborted the connection eagerly).

I am not sure I understand. I upgraded Avro to the latest stable release (1.10.0), but I expect this will trigger the failures previously discussed in the ticket (on ill-defined schemas in avro_deserialize_map_null.q and parquet_map_null.q) that were fixed by AVRO-2817, which requires the 1.10.1 release.

Great to see #1621 merged. Is there already a PR to cherry-pick that one into branch-2.3? Do you want me to do that?
I suppose, however, that we might also require the changes to TypeInfoToSchema from this PR, to make it forward compatible as well, because the Field constructor with JsonNode as a parameter is not available after Avro 1.9.
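
A minimal sketch of the constructor change in question (hypothetical class and field names, not Hive's actual TypeInfoToSchema code):

import java.util.Arrays;
import org.apache.avro.JsonProperties;
import org.apache.avro.Schema;

public class FieldConstructorSketch {
  public static void main(String[] args) {
    // A nullable string column, the kind of union TypeInfoToSchema-style code builds.
    Schema nullableString = Schema.createUnion(Arrays.asList(
        Schema.create(Schema.Type.NULL), Schema.create(Schema.Type.STRING)));

    // Avro <= 1.8 took the default as a Jackson JsonNode; this no longer compiles on 1.9+:
    //   new Schema.Field("col", nullableString, null,
    //       org.codehaus.jackson.node.JsonNodeFactory.instance.nullNode());

    // Avro 1.9+ takes a plain Object, with a sentinel for an explicit null default:
    Schema.Field field = new Schema.Field(
        "col", nullableString, null, JsonProperties.NULL_VALUE);
    System.out.println(field.name());
  }
}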

@sunchao (Member) commented Nov 5, 2020

> I get a 404 error now (before, it was a socket error that aborted the connection eagerly).

It's weird, and I'm not sure how access is managed there. FWIW, here's the error message:

Caused by: java.lang.ClassNotFoundException: org.xerial.snappy.SnappyInputStream
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	... 19 more

so you may need to double-check the dependencies and perhaps add the missing Snappy one.
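
For reference, a dependency along these lines would pull Snappy back in (a sketch only: the version property is a placeholder, not Hive's actual one, and the right scope and module depend on where it is needed):

<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <!-- placeholder property; pin whatever version the project standardizes on -->
  <version>${snappy-java.version}</version>
  <!-- test scope since the failure above is in a test; adjust as needed -->
  <scope>test</scope>
</dependency>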

> I am not sure I understand. I upgraded Avro to the latest stable release (1.10.0), but I expect this will trigger the failures previously discussed in the ticket (on ill-defined schemas in avro_deserialize_map_null.q and parquet_map_null.q) that were fixed by AVRO-2817, which requires the 1.10.1 release.

It seems this only triggers when you have "default": null in the schema? I don't see us doing that in Hive, though.

> Great to see #1621 merged. Is there already a PR to cherry-pick that one into branch-2.3? Do you want me to do that?
> I suppose, however, that we might also require the changes to TypeInfoToSchema from this PR, to make it forward compatible as well, because the Field constructor with JsonNode as a parameter is not available after Avro 1.9.

Not yet; yeah, feel free to do that. Have you had a chance to test this against Spark with an upgraded Parquet version?

@iemejia (Member, Author) commented Nov 6, 2020

> so you may need to double-check the dependencies and perhaps add the missing Snappy one.

Can you please share the full logs of the CI run with me, just to see which modules might require the dependency? Snappy is an optional runtime dependency now, so it is up to the project (Hive) to know where it uses it.

> It seems this only triggers when you have "default": null in the schema? I don't see us doing that in Hive, though.

Yes, exactly, but this can happen in the context of a union too. Both parquet_map_null.q and avro_deserialize_map_null.q refer to map_null_val.avro, so this seems to be the case.
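
A minimal sketch of the schema pattern under discussion (hypothetical names, not the actual contents of map_null_val.avro): for a union, Avro only considers a null default well-defined when "null" is the first branch, and newer releases validate this more strictly.

import org.apache.avro.Schema;

public class NullDefaultSketch {
  public static void main(String[] args) {
    // Well-defined: "null" is the first union branch, so "default": null matches it.
    String ok = "{\"type\":\"record\",\"name\":\"R1\",\"fields\":["
        + "{\"name\":\"v\",\"type\":[\"null\",\"string\"],\"default\":null}]}";
    System.out.println(new Schema.Parser().parse(ok));

    // Ill-defined: the null default does not match the first branch ("string").
    String bad = "{\"type\":\"record\",\"name\":\"R2\",\"fields\":["
        + "{\"name\":\"v\",\"type\":[\"string\",\"null\"],\"default\":null}]}";
    try {
      System.out.println(new Schema.Parser().parse(bad));
    } catch (Exception e) {
      // Stricter Avro releases reject the mismatched default at parse time;
      // older ones were more lenient about it.
      System.out.println("rejected: " + e.getMessage());
    }
  }
}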

> Not yet; yeah, feel free to do that. Have you had a chance to test this against Spark with an upgraded Parquet version?

I have a branch with all the Avro, Hive, and Parquet updates on it, in case you want to take a look. It fully compiles; however, I have not been able to run the full set of Spark tests locally yet, because of unrelated errors on Kinesis (and my lack of familiarity with the Spark test suite). I would like to test it with the full CI, but to do that I would need to have the Hive SNAPSHOTs somewhere. It is a pity Hive does not publish them for older versions; or are they published somewhere? (That way I would update things there first.)

@sunchao (Member) commented Nov 7, 2020

Sure, the stack trace is:

java.lang.NoClassDefFoundError: org/xerial/snappy/SnappyInputStream
	at org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService$TestTaskSchedulerServiceWrapper.<init>(TestLlapTaskSchedulerService.java:2008)
	at org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testHostPreferenceMissesConsistentPartialAlive(TestLlapTaskSchedulerService.java:1077)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.xerial.snappy.SnappyInputStream
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	... 14 more

It seems all the failures are from the org.apache.hadoop.hive.llap.tezplugins module.

> Yes, exactly, but this can happen in the context of a union too. Both parquet_map_null.q and avro_deserialize_map_null.q refer to map_null_val.avro, so this seems to be the case.

Ah, I see. Great. I guess we haven't seen any test failures related to this yet?

> I have a branch with all the Avro, Hive, and Parquet updates on it, in case you want to take a look. It fully compiles; however, I have not been able to run the full set of Spark tests locally yet, because of unrelated errors on Kinesis (and my lack of familiarity with the Spark test suite).
> I would like to test it with the full CI, but to do that I would need to have the Hive SNAPSHOTs somewhere. It is a pity Hive does not publish them for older versions; or are they published somewhere? (That way I would update things there first.)

Awesome! Do you see any compile errors without the Avro API fix? I don't think the snapshot jars are published anywhere, so it could be hard to test this in the Spark CI... However, we also plan to test this combination out in our internal Spark CI.

@iemejia (Member, Author) commented Dec 3, 2020

PR updated to the latest version of Avro; let's see if this gets us more tests passing now. 🤞 @sunchao @wangyum

@iemejia (Member, Author) commented Dec 3, 2020

Do you have an estimate of when a possible vote might happen to get the previous changes (and hopefully this one) released as 2.3.8?
I am thinking about whether Spark could take it in.

@sunchao (Member) commented Dec 22, 2020

@iemejia please let me know if you think this is absolutely needed. I plan to start another vote on Hive 2.3.8 RC3 on the morning of the 22nd (PST).

@h-vetinari commented:

I'm not speaking for @iemejia, of course, but my impression was that this patch was necessary for Hive 2.3.8 in order to (among other things) unblock Spark from using a newer Parquet (and thus Avro) version.

See the linked issues in HIVE-21737, SPARK-27733, etc.
The interlocking dependencies around this have been a very thorny problem that people have tried working on for a long time (e.g. #674, #785), so it would be really cool if this didn't miss the release.

@sunchao (Member) commented Dec 22, 2020

Yes, but my impression is that HIVE-24324 and HIVE-24436 should be sufficient to solve the issue. But let me know if otherwise.

@h-vetinari commented:

I'm just guessing myself, but based on the changes in apache/spark#30517, I think a bumped Avro dependency in Hive 2.3.8 is what people were planning for.

@sunchao (Member) commented Dec 22, 2020

Yes, I'm aware of that PR. The Avro upgrade is in Spark, though, and the testing is done with previous Hive release candidates, which do not include the Avro upgrade.

@iemejia (Member, Author) commented Dec 22, 2020

@sunchao Great to know the cut is happening soon!

Whether this is absolutely needed is a question of tradeoffs. The Avro binary format has not changed since version 1.8.x, which introduced logical types, but more recent versions have removed dependencies from the public API (Jackson, Guava, Joda-Time, etc.), so I suppose catching up to the latest version carries the same risks for the binary part as catching up to 1.8.x did (which Hive already went through).

I suppose Hive users rarely use Avro directly from the transitive dependency and mostly rely on the Hive APIs (which I hope don't leak Avro), so this might diminish the risk, but of course the risk remains in that particular case. I am probably biased towards the upgrade because I come from the Avro side, and I expect the full Big Data ecosystem to stay up to date, so that Hive contributors do not introduce changes that are API incompatible with more recent versions of Avro and break downstream projects (like the current parallel work on Spark).

As usual in software, it is all about tradeoffs. This decision is up to you as the maintainers, and I might be missing some other side effects in my analysis because I don't know Hive deeply.

In any case, if you decide to jump to the latest Avro version, e.g. 1.10.1, and any issue comes up, I commit, from the Avro side, to doing any fix and getting a release out if required.

@sunchao (Member) commented Dec 22, 2020

@iemejia thanks, can you open a PR against branch-2.3? Let's see how the tests go first.

@iemejia (Member, Author) commented Dec 22, 2020

@sunchao Done #1806 🤞

@sunchao (Member) commented Dec 22, 2020

BTW @iemejia, I'm seeing this test consistently failing in all the test runs:

[Screenshot of the failing test, Dec 22, 2020, 10:36 AM]

Can you take a look?

@sunchao (Member) commented Dec 22, 2020

You can try to reproduce this via:

mvn clean install -DskipTests
cd itests
mvn clean install -DskipTests
mvn test -Dtest=TestMiniDruidKafkaCliDriver -Dqfile=druidkafkamini_avro.q -Dtest.output.overwrite=true

although I got a different error:

FAILED: Execution Error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: java.sql.SQLException: Cannot create         PoolableConnectionFactory (java.net.ConnectException : Error connecting to server localhost on port 8,084 with message Connection refused (Connection refused).)

It seems it's caused by this:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.inject.util.Types.collectionOf(Ljava/lang/reflect/Type;)Ljava/lang/reflect/ParameterizedType;
	at com.google.inject.multibindings.Multibinder.collectionOfProvidersOf(Multibinder.java:202)
	at com.google.inject.multibindings.Multibinder$RealMultibinder.<init>(Multibinder.java:283)
	at com.google.inject.multibindings.Multibinder$RealMultibinder.<init>(Multibinder.java:258)
	at com.google.inject.multibindings.Multibinder.newRealSetBinder(Multibinder.java:178)
	at com.google.inject.multibindings.Multibinder.newSetBinder(Multibinder.java:150)
	at org.apache.druid.guice.LifecycleModule.getEagerBinder(LifecycleModule.java:115)
	at org.apache.druid.guice.LifecycleModule.configure(LifecycleModule.java:121)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
	at com.google.inject.spi.Elements.getElements(Elements.java:110)
	at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:177)
	at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
	at com.google.inject.spi.Elements.getElements(Elements.java:110)
	at com.google.inject.util.Modules$OverrideModule.configure(Modules.java:177)
	at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
	at com.google.inject.spi.Elements.getElements(Elements.java:110)
	at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
	at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
	at com.google.inject.Guice.createInjector(Guice.java:96)
	at com.google.inject.Guice.createInjector(Guice.java:73)
	at com.google.inject.Guice.createInjector(Guice.java:62)
	at org.apache.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:431)
	at org.apache.druid.cli.GuiceRunnable.makeInjector(GuiceRunnable.java:69)
	at org.apache.druid.cli.ServerRunnable.run(ServerRunnable.java:58)
	at org.apache.druid.cli.Main.main(Main.java:113)

which does not seem to have happened in the Jenkins CI.

@wangyum (Member) commented Dec 24, 2020

The test failure may be related to:

<druid.guice.version>4.1.0</druid.guice.version>

https://github.com/apache/hadoop/blob/release-3.1.0-RC1/hadoop-project/pom.xml#L92
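
(A hedged sketch of the kind of alignment fix being discussed: pinning, via a pom property, the Guice version that Druid expects. The property name mirrors the one quoted above; the exact module where it belongs is an assumption.)

<properties>
  <!-- align with the Guice version Druid 4.x is built against (assumption) -->
  <druid.guice.version>4.1.0</druid.guice.version>
</properties>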

@iemejia (Member, Author) commented Dec 24, 2020

I included @wangyum's fix in this patch; let's see if it works now 🤞

@github-actions (bot) commented:

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Feb 23, 2021
@github-actions github-actions bot closed this Mar 3, 2021