[SPARK-30492][SQL] Eliminate deprecation warnings in ORC datasource #27179
Conversation
@dongjoon-hyun Please take a look at this.
Unfortunately, this breaks our Hive 1.2 code. Can we have a fix for both Hive 1.2 and Hive 2.3?
[ERROR] [Error] /home/runner/work/spark/spark/sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/orc/SparkOrcNewRecordReader.java:46: cannot find symbol
  symbol: method getSchema()
  location: variable file of type org.apache.hadoop.hive.ql.io.orc.Reader
Since we cannot drop Hive 1.2 completely, at least in 3.0 (or maybe until 3.1), we still need to support it.
cc @srowen, @wangyum and @gatorsmile
@dongjoon-hyun Just in case, do you know why there are 2 ORC implementations:
Is it something specific to ORC?
Historically,
So, with
Test build #116568 has finished for PR 27179 at commit
If it's much trouble here... I'd just leave it. We're not going to be able to resolve 100% of warnings just for reasons like this.
I would propose to add the In any case, we are going to deprecate
@MaxGekk. Sorry, but I'm technically -1 to prevent a feature regression. I guess you are assuming that the new one supports all use cases of the old one. However, that's not true. One simple long-standing JIRA is https://issues.apache.org/jira/browse/SPARK-21997. Users are still using the old ones because the new ones (ORC and Parquet) don't provide the same feature. For me, this one is not worth your time. We had better move on from this part.
I didn't know that. @dongjoon-hyun Thank you for the explanation. I am closing this PR.
What changes were proposed in this pull request?
In the PR, I propose to avoid usage of getTypes() in the SparkOrcNewRecordReader constructor, and replace it by getSchema().
Why are the changes needed?
To eliminate compiler warnings, and highlight other warnings that could indicate real problems:
Does this PR introduce any user-facing change?
No
How was this patch tested?
By existing tests from the org.apache.spark.sql.hive.orc package like HiveOrcQuerySuite.
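The build failure quoted in the conversation comes from calling getSchema() on a Reader type that lacks it under Hive 1.2. As an illustration only, and not something proposed in this PR, one conceivable way to compile against both API versions is to look the method up reflectively and fall back to getTypes(). The sketch below is a minimal, self-contained example under stated assumptions: OldReader, NewReader, ReaderCompat, and the return types are hypothetical stand-ins, not the real Hive or ORC classes.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

public class ReaderCompat {
    // Hypothetical stand-in for a reader API that only offers the older accessor.
    public static class OldReader {
        public List<String> getTypes() {
            return Arrays.asList("struct", "int", "string");
        }
    }

    // Hypothetical stand-in for a reader API that also offers the newer accessor.
    public static class NewReader extends OldReader {
        public String getSchema() {
            return "struct<a:int,b:string>";
        }
    }

    /** Returns true if the reader's class has a public no-arg method with this name. */
    public static boolean hasMethod(Object reader, String name) {
        try {
            reader.getClass().getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Uses getSchema() when it exists at runtime, otherwise falls back to getTypes(). */
    public static String describe(Object reader) {
        String name = hasMethod(reader, "getSchema") ? "getSchema" : "getTypes";
        try {
            Method m = reader.getClass().getMethod(name);
            return name + " -> " + m.invoke(reader);
        } catch (ReflectiveOperationException e) {
            // Wrap so callers do not need to handle checked reflection exceptions.
            throw new IllegalStateException("Cannot call " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new OldReader()));
        System.out.println(describe(new NewReader()));
    }
}
```

Because the lookup happens at runtime, neither accessor is referenced at compile time, which sidesteps both the "cannot find symbol" error and the deprecation warning. The cost is weaker type checking, which may be one reason to leave the warning in place, as suggested in the discussion.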