Hive: Fix for missing table schema in map reduce job configurations #1557

Merged: 1 commit, Oct 8, 2020

HotSushi (Contributor) commented Oct 6, 2020

Hive queries that spawn MapReduce jobs currently fail on live YARN clusters with the following stack trace:

2020-10-02 23:37:01,507 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:258)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:705)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:438)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)
Caused by: java.lang.NullPointerException
	at java.util.Objects.requireNonNull(Objects.java:203)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2296)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:111)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:54)
	at org.apache.iceberg.SchemaParser.fromJson(SchemaParser.java:247)
	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:176)
	at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.<init>(MapredIcebergInputFormat.java:92)
	at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat.getRecordReader(MapredIcebergInputFormat.java:78)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:255)
	... 9 more

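This error occurs on the mappers and the reason for this failure is that the job configurations such as TABLE_SCHEMA, TABLE_LOCATION, TABLE_IDENTIFIER are not set correctly. Per the stack trace above, the failure is in IcebergInputFormat$IcebergRecordReader.initialize (IcebergInputFormat.java:176), at the call to SchemaParser.fromJson.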

With this PR, the job configs are set correctly and the MapReduce job gets past the failing stage.
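
For readers who want to see the failure mode in isolation, here is a minimal sketch that uses no Hive, Hadoop, or Iceberg classes. It assumes the NullPointerException in the stack trace above comes from a null schema string being used as a cache key. The class name, the helper methods, and the key string are hypothetical stand-ins, and a plain ConcurrentHashMap stands in for the shaded Caffeine cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-alone sketch: every name here is illustrative, not Iceberg's actual code.
class MissingSchemaSketch {
  // Assumed config key, used only for this illustration.
  static final String TABLE_SCHEMA = "iceberg.mr.table.schema";

  // Stand-in for a parser that caches parsed schemas by their JSON text.
  // ConcurrentHashMap rejects null keys, which is the behavior assumed for the real cache.
  static final Map<String, String> SCHEMA_CACHE = new ConcurrentHashMap<>();

  static String parseSchema(String json) {
    return SCHEMA_CACHE.computeIfAbsent(json, j -> "parsed:" + j);
  }

  // Mirrors the failing path: the config lookup result is passed on unchecked.
  static String readSchemaUnchecked(Map<String, String> jobConf) {
    return parseSchema(jobConf.get(TABLE_SCHEMA));
  }

  // Defensive variant: fail with a message that names the missing key.
  static String readSchemaChecked(Map<String, String> jobConf) {
    String json = jobConf.get(TABLE_SCHEMA);
    if (json == null) {
      throw new IllegalStateException("Job config is missing " + TABLE_SCHEMA);
    }
    return parseSchema(json);
  }

  public static void main(String[] args) {
    Map<String, String> emptyConf = new HashMap<>();
    try {
      readSchemaUnchecked(emptyConf);
    } catch (NullPointerException e) {
      System.out.println("unchecked lookup: NullPointerException");
    }
    try {
      readSchemaChecked(emptyConf);
    } catch (IllegalStateException e) {
      System.out.println("checked lookup: " + e.getMessage());
    }
  }
}
```

The checked variant is not part of this PR; it only shows how a missing job config could surface as a named error instead of a bare NullPointerException.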

I'm not sure why this error is not reproducible in HiveRunner unit tests.

Steps to reproduce:

hive (default)> set iceberg.mr.catalog=hive;
hive (default)> add jar hdfs:///user/db/jars/iceberg-hive-runtime-0.0.4.jar;
hive (default)> SELECT * FROM default.customers ORDER BY customer_id DESC;

cc: @shardulm94, @omalley

pvary (Contributor) commented Oct 7, 2020

> This error occurs on the mappers and the reason for this failure is that the job configurations such as TABLE_SCHEMA, TABLE_LOCATION, TABLE_IDENTIFIER are not set correctly.

Which version of Hive are you using? Or is this query dependent?

Thanks for spotting the issue!
Peter

HotSushi (Contributor, Author) commented Oct 7, 2020

@pvary we're using Hive 1.1, but I was not able to find any difference in the relevant Hive 2 code that calls configureJobConf or configureInputJobProperties.

Queries that can run on the driver and don't spawn MR jobs succeed; the problem only affects queries such as ORDER BY ... DESC, which need MR jobs.
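
The two hooks named above, configureJobConf and configureInputJobProperties, are the points where a storage handler can push table-level settings into the job. The sketch below illustrates that idea with plain Java stand-ins: Properties and Map replace Hive's TableDesc and JobConf, and the class name, method shapes, and key strings are assumptions. It is not the change made in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Stand-alone sketch with hypothetical names; Hive's real hooks take TableDesc and JobConf.
class JobConfHooksSketch {
  // Assumed key strings, used only for this illustration.
  static final String TABLE_IDENTIFIER = "iceberg.mr.table.identifier";
  static final String TABLE_LOCATION = "iceberg.mr.table.location";
  static final String TABLE_SCHEMA = "iceberg.mr.table.schema";
  static final String[] KEYS = {TABLE_IDENTIFIER, TABLE_LOCATION, TABLE_SCHEMA};

  // Shared helper: copy every known table setting that is actually present.
  static void copyTableSettings(Properties tableProps, Map<String, String> target) {
    for (String key : KEYS) {
      String value = tableProps.getProperty(key);
      if (value != null) {
        target.put(key, value);
      }
    }
  }

  // Stand-in for the hook that fills the per-table job properties map.
  static void configureInputJobProperties(Properties tableProps, Map<String, String> jobProperties) {
    copyTableSettings(tableProps, jobProperties);
  }

  // Stand-in for the hook that receives the job configuration itself.
  static void configureJobConf(Properties tableProps, Map<String, String> jobConf) {
    copyTableSettings(tableProps, jobConf);
  }

  public static void main(String[] args) {
    Properties tableProps = new Properties();
    tableProps.setProperty(TABLE_IDENTIFIER, "default.customers");
    tableProps.setProperty(TABLE_SCHEMA, "{}");

    Map<String, String> jobConf = new HashMap<>();
    configureJobConf(tableProps, jobConf);
    System.out.println(jobConf);
  }
}
```

Routing both hooks through one helper means the same keys reach the job regardless of which hook a given Hive version ends up calling, which is the property the discussion above is concerned with.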

pvary (Contributor) commented Oct 7, 2020

> Queries that can run on the driver and don't spawn MR jobs succeed; the problem only affects queries such as ORDER BY ... DESC, which need MR jobs.

Makes sense.
It would be good to have a test case to prevent regression.
Are we able to provide a test case which fails before the fix and works after?
HiveIcebergStorageHandlerBaseTest.testJoinTables might be a good starting point.

Thanks, Peter

rdblue (Contributor) commented Oct 7, 2020

Looks reasonable to me, but will this affect jobs that run multiple scans in a single MR stage?

@massdosage, do we have HiveRunner tests for joins that run two table scans in a stage?
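
The multiple-scan concern can be illustrated in isolation. If table settings live under one fixed key in a job-wide configuration, a second table scanned in the same stage overwrites the first. The sketch below shows that collision, plus a per-table key as one possible way around it. All names, keys, and paths are hypothetical, and neither variant is taken from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone sketch with hypothetical names, keys, and paths.
class SharedJobConfSketch {
  // Assumed key string, used only for this illustration.
  static final String TABLE_LOCATION = "iceberg.mr.table.location";

  // One fixed key for the whole job: the last table written wins.
  static void setGlobal(Map<String, String> jobConf, String location) {
    jobConf.put(TABLE_LOCATION, location);
  }

  // One key per table: settings for different tables do not collide.
  static void setScoped(Map<String, String> jobConf, String table, String location) {
    jobConf.put(TABLE_LOCATION + "." + table, location);
  }

  public static void main(String[] args) {
    Map<String, String> global = new LinkedHashMap<>();
    setGlobal(global, "hdfs:///warehouse/customers");
    setGlobal(global, "hdfs:///warehouse/orders");
    System.out.println("global key holds: " + global.get(TABLE_LOCATION));

    Map<String, String> scoped = new LinkedHashMap<>();
    setScoped(scoped, "customers", "hdfs:///warehouse/customers");
    setScoped(scoped, "orders", "hdfs:///warehouse/orders");
    System.out.println("scoped keys hold: " + scoped);
  }
}
```

With the single global key, only the second location survives, so a join that scans both tables in one stage would read both from the same place. A join test that scans two tables is the kind of check that would catch this.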

massdosage (Contributor) commented Oct 8, 2020

> Looks reasonable to me, but will this affect jobs that run multiple scans in a single MR stage?
>
> @massdosage, do we have HiveRunner tests for joins that run two table scans in a stage?

I think this does it, as @pvary mentioned above: https://github.com/ExpediaGroup/iceberg/blob/master/mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandlerBaseTest.java#L145. I know this caught an issue in the past with multiple tables not being configured properly for the scan, but it's possible it doesn't cover all the cases that can occur.

rdblue (Contributor) commented Oct 8, 2020

Okay, if we do have a test case that does a simple join, then I think this should be okay. It doesn't sound like we can reproduce the issue with the newer Hive versions, though. So I'll merge this without adding a test for it.

rdblue merged commit 13d94bc into apache:master Oct 8, 2020
HotSushi deleted the fix_job_confs_not_set branch October 8, 2020 21:18
marton-bod added a commit to marton-bod/iceberg that referenced this pull request Nov 2, 2020
rdblue pushed a commit that referenced this pull request Nov 2, 2020
This reverts commit 13d94bc.

Co-authored-by: Marton Bod <mbod@cloudera.com>