Commit

Merge pull request #207 from gisaia/feat/upgrade-deps
Upgrade dependencies to spark 3.1.2 and compliant deps
sfalquier authored Aug 3, 2021
2 parents 3f86130 + e517709 commit acf87f1
Showing 6 changed files with 73 additions and 26 deletions.
README.md (55 changes: 42 additions & 13 deletions)
@@ -30,18 +30,22 @@ A spark java application to submit a spark job on our spark cluster:

## Versions used in this project
It's very important to check the version of Spark being used. Here we are using the following:
- Spark 2.3.3 for Hadoop 2.7 with OpenJDK 8 (Java 1.8.0)
- Scala 2.11.8
- ScyllaDB 2.2.0
- Spark-cassandra-connector: 2.3.1-S_2.11
- Spark 3.1.2 for Hadoop 2.7 with OpenJDK 8 (Java 1.8.0)
- Scala 2.12.10

# Build and deploy application JAR

## Build locally

```bash
# Build jar
sbt clean assembly
# Run tests and build jar
docker run --rm \
-w /opt/work \
-v $PWD:/opt/work \
-v $HOME/.m2:/root/.m2 \
-v $HOME/.ivy2:/root/.ivy2 \
gisaia/sbt:1.5.5_jdk8 \
sbt clean test assembly
```

## Deploy JAR to Cloudsmith
@@ -59,16 +63,33 @@ export CLOUDSMITH_API_KEY="your-api-key"

As these values are personal, you may add them to your `.bash_profile` file. This way you won't need to define them again.


```bash
sbt clean publish
docker run --rm \
-w /opt/work \
-v $PWD:/opt/work \
-v $HOME/.m2:/root/.m2 \
-v $HOME/.ivy2:/root/.ivy2 \
-e CLOUDSMITH_USER=${CLOUDSMITH_USER} \
-e CLOUDSMITH_API_KEY=${CLOUDSMITH_API_KEY} \
gisaia/sbt:1.5.5_jdk8 \
sbt clean publish
```
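
Once published, any other sbt project can consume the artifact from the same Cloudsmith repository. A minimal consumer sketch (the coordinates below are assumed from this build's `organization`, `io.arlas`, and the project name; the version is a placeholder):

```scala
// Hypothetical consumer build.sbt: resolve arlas-proc from the gisaia Cloudsmith repository
resolvers += "gisaia" at "https://dl.cloudsmith.io/public/gisaia/public/maven/"

// Coordinates assumed from this build (organization "io.arlas", Scala 2.12 cross-build);
// replace the version with an actually published release
libraryDependencies += "io.arlas" %% "arlas-proc" % "0.6.1"
```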

## Release

If you have sufficient permissions on the GitHub repository, simply run:

`sbt clean release`
```bash
docker run -ti \
-w /opt/work \
-v $PWD:/opt/work \
-v $HOME/.m2:/root/.m2 \
-v $HOME/.ivy2:/root/.ivy2 \
-e CLOUDSMITH_USER=${CLOUDSMITH_USER} \
-e CLOUDSMITH_API_KEY=${CLOUDSMITH_API_KEY} \
gisaia/sbt:1.5.5_jdk8 \
sbt clean release
```

You will be asked for the versions to use for release & next version.
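
This prompt comes from sbt-release's `inquireVersions` step. For context, a typical `releaseProcess` built from `ReleaseTransformations` looks like the sketch below (a generic example, not necessarily this project's exact settings):

```scala
// Generic sbt-release pipeline (build.sbt); illustrative only, not this project's exact configuration
import ReleaseTransformations._

releaseProcess := Seq[ReleaseStep](
  checkSnapshotDependencies, // fail if any dependency is still a SNAPSHOT
  inquireVersions,           // prompts for the release version and the next development version
  runClean,
  runTest,
  setReleaseVersion,
  commitReleaseVersion,
  tagRelease,
  publishArtifacts,          // publishes to the repository configured for `publish`
  setNextVersion,
  commitNextVersion,
  pushChanges
)
```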

@@ -78,19 +99,27 @@ You will be asked for the versions to use for release & next version.

Start an interactive spark-shell session. For example:
```bash
sbt clean assembly
# Build fat jar
docker run --rm \
-w /opt/work \
-v ${PWD}:/opt/work \
-v $HOME/.m2:/root/.m2 \
-v $HOME/.ivy2:/root/.ivy2 \
gisaia/sbt:1.5.5_jdk8 \
/bin/bash -c 'sbt clean assembly; cp target/scala-2.12/arlas-proc-assembly*.jar target/scala-2.12/arlas-proc-assembly.jar'
# Start spark-shell
docker run -ti \
-w /opt/work \
-v ${PWD}:/opt/proc \
-v $HOME/.m2:/root/.m2 \
-v $HOME/.ivy2:/root/.ivy2 \
-p "4040:4040" \
gisaia/spark:2.3.3 \
gisaia/spark:3.1.2 \
spark-shell \
--packages org.elasticsearch:elasticsearch-spark-20_2.11:7.4.2,org.geotools:gt-referencing:20.1,org.geotools:gt-geometry:20.1,org.geotools:gt-epsg-hsql:20.1 \
--packages org.elasticsearch:elasticsearch-spark-30_2.12:7.13.4,org.geotools:gt-referencing:20.1,org.geotools:gt-geometry:20.1,org.geotools:gt-epsg-hsql:20.1 \
--exclude-packages javax.media:jai_core \
--repositories https://repo.osgeo.org/repository/release/,https://dl.cloudsmith.io/public/gisaia/public/maven/,https://repository.jboss.org/maven2/ \
--jars /opt/proc/target/scala-2.11/arlas-proc-assembly-0.6.1-SNAPSHOT.jar \
--jars /opt/proc/target/scala-2.12/arlas-proc-assembly.jar \
--conf spark.driver.allowMultipleContexts="true" \
--conf spark.rpc.netty.dispatcher.numThreads="2"
```
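
Once the shell is up, a quick sanity check of the runtime versions can be typed at the `scala>` prompt:

```scala
// At the spark-shell prompt: confirm the Spark and Scala versions of the running session
spark.version                       // expected: 3.1.2
scala.util.Properties.versionString // expected: version 2.12.x
```
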
build.sbt (28 changes: 22 additions & 6 deletions)
@@ -1,29 +1,39 @@
ThisBuild / version := (version in ThisBuild).value
ThisBuild / scalaVersion := "2.11.8"
ThisBuild / scalaVersion := "2.12.10"
ThisBuild / organization := "io.arlas"

javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint")
initialize := {
val _ = initialize.value
val javaVersion = sys.props("java.specification.version")
if (javaVersion != "1.8")
sys.error("Java 1.8 is required for this project. Found " + javaVersion + " instead")
}

resolvers += "osgeo" at "https://repo.osgeo.org/repository/release/"
resolvers += "gisaia" at "https://dl.cloudsmith.io/public/gisaia/public/maven/"
resolvers += "jboss" at "https://repository.jboss.org/maven2/"

val sparkVersion = "2.3.3"
val sparkVersion = "3.1.2"
val scalaTestVersion = "3.2.3"

val sparkSQL = "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
val sparkMLlib = "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided"
val spark = Seq(sparkSQL,sparkMLlib)

val scalaTest = "org.scalatest" %% "scalatest" % "2.2.5" % Test
val scalaTest = "org.scalatest" %% "scalatest" % scalaTestVersion % Test
val scalaTestFlatSpec = "org.scalatest" %% "scalatest-flatspec" % scalaTestVersion % Test
val wiremockStandalone = "com.github.tomakehurst" % "wiremock-standalone" % "2.25.1" % Test
val tests = Seq(scalaTest, wiremockStandalone)
val tests = Seq(scalaTest, scalaTestFlatSpec, wiremockStandalone)

val elasticSearch = "org.elasticsearch" %% "elasticsearch-spark-20" % "7.4.2" % "provided"
val elasticSearch = "org.elasticsearch" %% "elasticsearch-spark-30" % "7.13.4" % "provided"
val elastic = Seq(elasticSearch)

val gtReferencing = "org.geotools" % "gt-referencing" % "20.1" % "provided" exclude("javax.media", "jai_core")
val gtGeometry = "org.geotools" % "gt-geometry" % "20.1" % "provided" exclude("javax.media", "jai_core")
val geotools = Seq(gtReferencing, gtGeometry)

val arlasMl = "io.arlas" %% "arlas-ml" % "0.1.2"
val arlasMl = "io.arlas" %% "arlas-ml" % "0.2.0"
val arlas = Seq(arlasMl)

lazy val arlasProc = (project in file("."))
@@ -59,6 +69,12 @@ lazy val arlasProcAssembly = project
},
addArtifact(artifact in (Compile, assembly), assembly)
)
ThisBuild / assemblyMergeStrategy := {
case "module-info.class" => MergeStrategy.discard
case x =>
val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
oldStrategy(x)
}

//sbt-release
import ReleaseTransformations._
project/build.properties (2 changes: 1 addition & 1 deletion)
@@ -1 +1 @@
sbt.version=1.2.7
sbt.version=1.5.5
project/plugins.sbt (4 changes: 2 additions & 2 deletions)
@@ -1,3 +1,3 @@
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")
addSbtPlugin("com.github.gseitz" % "sbt-release" % "1.0.11")
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.0.0")
addSbtPlugin("com.github.sbt" % "sbt-release" % "1.1.0")
addSbtPlugin("com.typesafe.sbt" % "sbt-ghpages" % "0.6.3")
src/test/scala/io/arlas/data/transform/ArlasTest.scala (5 changes: 3 additions & 2 deletions)
@@ -32,11 +32,12 @@ import io.arlas.data.transform.timeseries._
import io.arlas.data.{DataFrameTester, TestSparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.scalatest.{FlatSpec, Matchers}
import org.scalatest.matchers.should._
import org.scalatest.flatspec.AnyFlatSpec

import scala.collection.immutable.ListMap

trait ArlasTest extends FlatSpec with Matchers with TestSparkSession with DataFrameTester {
trait ArlasTest extends AnyFlatSpec with Matchers with TestSparkSession with DataFrameTester {

val dataModel = DataModel(timeFormat = "dd/MM/yyyy HH:mm:ssXXX")
val speedColumn = "speed"
src/test/scala/io/arlas/data/utils/GeoToolTest.scala (5 changes: 3 additions & 2 deletions)
@@ -20,11 +20,12 @@
package io.arlas.data.utils

import org.locationtech.jts.geom.Coordinate
import org.scalatest.FlatSpec
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should._

import scala.util.Random

class GeoToolTest extends FlatSpec {
class GeoToolTest extends AnyFlatSpec {

"computeStandardDeviationEllipsis " should " compute the standard deviation ellipsis" in {

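Both test files follow the ScalaTest 3.x migration: `FlatSpec` became `AnyFlatSpec` in `org.scalatest.flatspec`, and `Matchers` moved to `org.scalatest.matchers.should` (hence the additional `scalatest-flatspec` dependency in build.sbt). For reference, a minimal spec in the migrated style (an illustrative sketch):

```scala
// Illustrative sketch of the ScalaTest 3.x style adopted above
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class ExampleSpec extends AnyFlatSpec with Matchers {
  "a migrated spec" should "extend AnyFlatSpec and mix in should.Matchers" in {
    Seq(1, 2, 3).sum shouldBe 6
  }
}
```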
