
Upgrade Spark version #152

Closed
farquet opened this issue Jun 13, 2019 · 3 comments
Labels: compatibility (Relates to platform or system compatibility), enhancement (New feature or request)

farquet (Collaborator) commented on Jun 13, 2019

The current Spark benchmarks caused an issue with OpenJ9 (#131), they limit the suite's compatibility with the latest JDK versions on macOS (#127), and they crash on ia64 infrastructure (#150).

These problems increase the motivation to upgrade Spark and any other libraries present in the apache-spark subproject.

To avoid changing the existing benchmarks, though, the best approach would be to create a separate subproject and port the benchmarks one by one while ensuring they still work as expected. Once those benchmarks are confirmed to be as good as (or better than) the existing ones, we can deprecate or remove the old benchmarks.
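For illustration, a hypothetical sketch of what that layout could look like in an sbt build (the subproject name, path, and version below are illustrative only, not the actual Renaissance build definition):

lazy val apacheSparkUpgraded = (project in file("benchmarks/apache-spark-upgraded"))
  .settings(
    name := "apache-spark-upgraded",
    // The ported benchmarks pick up the newer Spark here, while the legacy
    // apache-spark subproject keeps its current dependencies untouched.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.0"
  )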

farquet added the compatibility and enhancement labels on Jun 13, 2019
ericcaspole commented

FYI, we also ran into the version-parsing problem when the Java version is something like "11" or "16", with no decimal places. This stack trace occurs:

java.lang.ExceptionInInitializerError
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$29.apply(SparkContext.scala:985)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$29.apply(SparkContext.scala:985)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:177)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:177)
at scala.Option.map(Option.scala:146)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:177)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:242)
at org.renaissance.apache.spark.Als.run(Als.scala:106)
at org.renaissance.jmh.JmhRenaissanceBenchmark.runOperation(JmhRenaissanceBenchmark.java:39)
at org.renaissance.apache.spark.generated.JmhAls_runOperation_jmhTest.runOperation_ss_jmhStub(JmhAls_runOperation_jmhTest.java:568)

This does not happen with an updated version such as "11.0.10".
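To illustrate the failure mode (a sketch only, not a verbatim copy of the Hadoop source): version-parsing code in a static initializer that assumes "java.version" has at least three characters works for "1.8" or "11.0.10", but throws on a bare "11", and an exception in a static initializer surfaces as ExceptionInInitializerError:

object VersionParseSketch {

  // Fragile idiom: "11".substring(0, 3) throws StringIndexOutOfBoundsException
  // because the string is only two characters long. Thrown inside a class's
  // static initializer, it surfaces as ExceptionInInitializerError.
  def isJava7OrAboveFragile(version: String): Boolean =
    version.substring(0, 3).compareTo("1.7") >= 0

  // Robust alternative: parse the leading numeric component, which handles
  // both legacy ("1.8.0_292" -> 8) and modern ("11", "11.0.10" -> 11) schemes.
  def featureVersion(version: String): Int = {
    val parts = version.split("\\.")
    val major = parts.head.takeWhile(_.isDigit).toInt
    if (major == 1 && parts.length > 1) parts(1).toInt else major
  }

  def main(args: Array[String]): Unit = {
    println(featureVersion("1.8.0_292")) // 8
    println(featureVersion("11.0.10"))   // 11
    println(featureVersion("11"))        // 11
    isJava7OrAboveFragile("11")          // throws StringIndexOutOfBoundsException
  }
}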

lbulej (Member) commented on Apr 30, 2021

We have merged #242 into the master branch, moving to Spark 3.0.1. Moreover, it seems we should have no problems moving to Spark 3.1.1 (see #247).
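For reference, a hypothetical sketch of what such a version bump looks like in an sbt build definition (the module list is illustrative; see #242 and #247 for the actual changes):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "3.1.1",
  "org.apache.spark" %% "spark-sql"   % "3.1.1",
  "org.apache.spark" %% "spark-mllib" % "3.1.1"
)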

farquet (Collaborator, Author) commented on Apr 30, 2021

Looks like this can be closed then.

lbulej closed this as completed on Apr 30, 2021