33 changes: 33 additions & 0 deletions BuildTestAll.sh
@@ -0,0 +1,33 @@
#!/bin/bash

start_location=$(pwd)
exit_value=0
failures=""

# First arg: directory
# Second arg: command to run
test_command () {
  for sys in "dse" "oss"
  do
    echo "### Testing $1/$sys $2 ###"
    cd "$start_location" || exit 1
    cd "$1/$sys" || { exit_value=$?; failures="$failures$1/$sys missing"$'\n'; continue; }
    $2 || { exit_value=$?; echo "### $1/$sys $2 Failed ###"; failures="$failures$1/$sys $2 Failed"$'\n'; }
  done
}

for language in "java" "scala"
do
  echo "### Testing $language builds ###"
  echo "### Gradle ###"
  test_command "$language/gradle" "gradle -q build"
  echo "### SBT ###"
  test_command "$language/sbt" "sbt --error assembly"
  echo "### Maven ###"
  test_command "$language/maven" "mvn -q package"
done

echo "$failures"
exit $exit_value



81 changes: 51 additions & 30 deletions README.md
@@ -1,57 +1,79 @@
# Example projects for using DSE Analytics

These are template projects that illustrate how to build Spark Application written in Java or Scala
with Maven, SBT or Gradle which can be run on either DataStax Enterprise (DSE) or Apache Spark. The
example project implements a simple write-to-/read-from-Cassandra application for each language and
build tool.

## Dependencies

Compiling Spark applications depends on Apache Spark and optionally on Spark Cassandra Connector
jars. Projects `dse` and `oss` show two different ways of supplying these dependencies. Both
projects are built and executed with similar commands.

### DSE

If you are planning to execute your Spark Application on a DSE cluster, you can use the `dse`
project template, which will automatically download (and use during compilation) all jars available
in the DSE cluster. Please mind the DSE version specified in the build file; it should match
the one in your cluster.

### OSS

If you are planning to execute your Spark Application against Open Source Apache Spark and Open
Source Apache Cassandra, use the `oss` project template where all dependencies have to be specified
manually in build files. Please mind the dependency versions; these should match the ones in your
execution environment.

For additional info about version compatibility please refer to the Spark Cassandra Connector
[Version Compatibility Table](https://github.com/datastax/spark-cassandra-connector#version-compatibility).

### Additional dependencies

The template projects use extra plugins so that additional dependencies can be bundled into your
application's jar. All you need to do is add the dependencies in the build configuration file.
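As an illustration, in the Gradle templates a library is bundled by declaring it in the `assembly` configuration rather than `provided`. This is a sketch, not a full build file; the coordinates below are the same samples shown commented out in the template build files, and the DSE version is the one used by the templates:

```groovy
// build.gradle (dse template) -- sketch of bundling extra dependencies.
dependencies {
    // 'provided' jars are available at compile time but are not packaged
    provided "com.datastax.dse:dse-spark-dependencies:5.0.4"

    // 'assembly' jars are packaged into the fat jar built by shadowJar
    assembly "org.apache.commons:commons-math3:3.6.1"
    assembly "org.apache.commons:commons-csv:1.0"
}
```

The `provided` and `assembly` configurations are defined by the templates themselves (see the `configurations` blocks in the build files), not by Gradle out of the box.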

## Building & running

### Sbt

Task | Command
-------------|------------
build | `sbt clean assembly`
run (Scala) | `dse spark-submit --class com.datastax.spark.example.WriteRead target/scala-2.10/writeRead-assembly-0.1.jar`
run (Java) | `dse spark-submit --class com.datastax.spark.example.WriteRead target/writeRead-assembly-0.1.jar`

### Gradle

Task | Command
--------------------|------------
build | `gradle clean shadowJar`
run (Scala, Java) | `dse spark-submit --class com.datastax.spark.example.WriteRead build/libs/writeRead-0.1-all.jar`

### Maven

Task | Command
--------------------|------------
build | `mvn clean package`
run (Scala, Java) | `dse spark-submit --class com.datastax.spark.example.WriteRead target/writeRead-0.1.jar`

Notes:

1. The above command examples are for DSE. To run with open source Spark, use `spark-submit` instead.
2. Also see the included example script [BuildTestAll.sh](BuildTestAll.sh), which runs all build and test combinations.
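The script's error handling follows a collect-and-continue pattern: a failing step records its exit code and a message instead of aborting the run, so one broken build does not hide the others. A minimal sketch of that pattern (the step labels and commands here are placeholders, not part of the repository):

```shell
#!/bin/bash
# Collect failures instead of stopping at the first error.
exit_value=0
failures=""

run_step () {
    # $1: label for the step, $2: command to run (placeholders)
    $2 || { exit_value=$?; failures="$failures$1 Failed"$'\n'; }
}

run_step "step-a" "true"    # succeeds: nothing is recorded
run_step "step-b" "false"   # fails: exit code and message are captured
```

After the loop, the real script prints the accumulated `$failures` and exits with the last non-zero status, so CI still sees the run as failed.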


## Running Integrated Tests

Integrated tests are set up under a `test` task in each build system. To run the tests,
invoke the build system's `test` task. These tests demonstrate how to run an embedded
Cassandra instance as well as local Spark from within your testing environment.

Currently, only Scala testing examples are provided.

These tests should also function inside IDEs that are configured to run the build
system's tests.
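For reference, launching the `test` task looks the same in each tool. The commands below are illustrative, shown for the Scala `oss` templates and run from the repository root; substitute the template directory you are working in:

```
# Run the integrated tests for the Scala templates (illustrative paths)
(cd scala/sbt/oss    && sbt test)
(cd scala/gradle/oss && gradle test)
(cd scala/maven/oss  && mvn test)
```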

## Support

@@ -67,4 +89,3 @@ http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


5 changes: 2 additions & 3 deletions java/gradle/dse/build.gradle
@@ -13,8 +13,7 @@ repositories {
}
}

def dseVersion = "5.0.4"

// The assembly configuration will cause jar to be included in assembled fat-jar
configurations {
@@ -32,7 +31,7 @@ configurations {
// Please make sure that the following dependencies have versions corresponding to the ones in your cluster.
// Note that spark-cassandra-connector should be provided with the '--packages' flag to the spark-submit command.
dependencies {
provided "com.datastax.dse:dse-spark-dependencies:$dseVersion"
// assembly "org.apache.commons:commons-math3:3.6.1"
// assembly "org.apache.commons:commons-csv:1.0"
}
3 changes: 2 additions & 1 deletion java/maven/dse/pom.xml
@@ -9,13 +9,14 @@

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<dse.version>5.0.4</dse.version>
</properties>

<dependencies>
<dependency>
<groupId>com.datastax.dse</groupId>
<artifactId>dse-spark-dependencies</artifactId>
<version>${dse.version}</version>
<scope>provided</scope>
</dependency>
<!-- Your dependencies, 'provided' are not included in jar -->
5 changes: 4 additions & 1 deletion java/sbt/dse/build.sbt
@@ -8,8 +8,11 @@ autoScalaLibrary := false

resolvers += "DataStax Repo" at "https://datastax.artifactoryonline.com/datastax/public-repos/"

val dseVersion = "5.0.4"

// Please make sure that following DSE version matches your DSE cluster version.
// SBT 0.13.13 or greater required because of a dependency resolution bug
libraryDependencies += "com.datastax.dse" % "dse-spark-dependencies" % dseVersion % "provided"

//Your dependencies
//libraryDependencies += "org.apache.commons" % "commons-math3" % "3.6.1"
24 changes: 17 additions & 7 deletions scala/gradle/dse/build.gradle
@@ -13,8 +13,6 @@ repositories {
}
}


// The assembly configuration will cause jar to be included in assembled fat-jar
configurations {
@@ -27,14 +25,26 @@ configurations {
configurations {
provided
compile.extendsFrom provided
testCompile.exclude group: 'org.slf4j', module: 'log4j-over-slf4j'
}

def dseVersion = "5.0.4"

def scalaVersion = "2.10"
def scalaTestVersion = "3.0.0"
def connectorVersion = "1.6.0"
def jUnitVersion = "4.12"

// Please make sure that following DSE version matches your DSE cluster version.
dependencies {
provided "com.datastax.dse:dse-spark-dependencies:$dseVersion" exclude group: 'org.slf4j', module: 'slf4j-log4j12'
// assembly "org.apache.commons:commons-math3:3.6.1"
// assembly "org.apache.commons:commons-csv:1.0"

//Test Dependencies
testCompile "com.datastax.spark:spark-cassandra-connector-embedded_$scalaVersion:$connectorVersion"
testCompile "org.scalatest:scalatest_$scalaVersion:$scalaTestVersion"
testCompile "junit:junit:$jUnitVersion"
}

shadowJar {
1 change: 1 addition & 0 deletions scala/gradle/dse/src/test
19 changes: 14 additions & 5 deletions scala/gradle/oss/build.gradle
@@ -12,6 +12,10 @@ repositories {

def sparkVersion = "1.6.2"
def connectorVersion = "1.6.0"
def scalaVersion = "2.10"
def scalaTestVersion = "3.0.0"
def cassandraVersion = "3.0.2"
def jUnitVersion = "4.12"

// The assembly configuration will cause jar to be included in assembled fat-jar
configurations {
@@ -29,13 +33,18 @@ configurations {
// Please make sure that the following dependencies have versions corresponding to the ones in your cluster.
// Note that spark-cassandra-connector should be provided with the '--packages' flag to the spark-submit command.
dependencies {
provided "org.apache.spark:spark-core_$scalaVersion:$sparkVersion"
provided "org.apache.spark:spark-sql_$scalaVersion:$sparkVersion"
provided "org.apache.spark:spark-hive_$scalaVersion:$sparkVersion"
provided "com.datastax.spark:spark-cassandra-connector_$scalaVersion:$connectorVersion"
// assembly "org.apache.commons:commons-math3:3.6.1"
// assembly "org.apache.commons:commons-csv:1.0"

//Test Dependencies
testCompile "com.datastax.spark:spark-cassandra-connector-embedded_$scalaVersion:$connectorVersion"
testCompile "org.scalatest:scalatest_$scalaVersion:$scalaTestVersion"
testCompile "org.apache.cassandra:cassandra-all:$cassandraVersion"
testCompile "junit:junit:$jUnitVersion"
}

shadowJar {
1 change: 1 addition & 0 deletions scala/gradle/oss/src/test
66 changes: 65 additions & 1 deletion scala/maven/dse/pom.xml
@@ -9,15 +9,30 @@

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<dse.version>5.0.4</dse.version>
<scala.version>2.10.6</scala.version>
<scala.main.version>2.10</scala.main.version>
<scalatest.version>3.0.0</scalatest.version>
<connector.version>1.6.0</connector.version>
<junit.version>4.12</junit.version>
</properties>

<dependencies>
<dependency>
<groupId>com.datastax.dse</groupId>
<artifactId>dse-spark-dependencies</artifactId>
<version>${dse.version}</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.mortbay.jetty</groupId>
<artifactId>*</artifactId>
</exclusion>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Your dependencies, 'provided' are not included in jar -->
<!--<dependency>-->
@@ -30,6 +45,26 @@
<!--<artifactId>commons-csv</artifactId>-->
<!--<version>1.0</version>-->
<!--</dependency>-->

<!-- Test Dependencies -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-embedded_${scala.main.version}</artifactId>
<version>${connector.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.main.version}</artifactId>
<version>${scalatest.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
</dependencies>

<repositories>
@@ -79,6 +114,35 @@
</execution>
</executions>
</plugin>
<!-- Instructions from http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin -->
<!-- disable surefire -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.7</version>
<configuration>
<skipTests>true</skipTests>
</configuration>
</plugin>
<!-- enable scalatest -->
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>WDF TestSuite.txt</filereports>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
1 change: 1 addition & 0 deletions scala/maven/dse/src/test