Commit d18e680

Merge remote-tracking branch 'upstream/master'
Resolved conflict: project/SparkBuild.scala
2 parents: b3b0ff1 + 9c24974

320 files changed: +4462 −1733 lines


.rat-excludes

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@ slaves
 spark-env.sh
 spark-env.sh.template
 log4j-defaults.properties
+bootstrap-tooltip.js
+jquery-1.11.1.min.js
 sorttable.js
 .*txt
 .*json

README.md

Lines changed: 15 additions & 9 deletions
@@ -1,6 +1,13 @@
 # Apache Spark
 
-Lightning-Fast Cluster Computing - <http://spark.apache.org/>
+Spark is a fast and general cluster computing system for Big Data. It provides
+high-level APIs in Scala, Java, and Python, and an optimized engine that
+supports general computation graphs for data analysis. It also supports a
+rich set of higher-level tools including Spark SQL for SQL and structured
+data processing, MLLib for machine learning, GraphX for graph processing,
+and Spark Streaming.
+
+<http://spark.apache.org/>
 
 
 ## Online Documentation
@@ -69,29 +76,28 @@ can be run using:
 Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
 storage systems. Because the protocols have changed in different versions of
 Hadoop, you must build Spark against the same version that your cluster runs.
-You can change the version by setting the `SPARK_HADOOP_VERSION` environment
-when building Spark.
+You can change the version by setting `-Dhadoop.version` when building Spark.
 
 For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
 versions without YARN, use:
 
 # Apache Hadoop 1.2.1
-$ SPARK_HADOOP_VERSION=1.2.1 sbt/sbt assembly
+$ sbt/sbt -Dhadoop.version=1.2.1 assembly
 
 # Cloudera CDH 4.2.0 with MapReduce v1
-$ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt/sbt assembly
+$ sbt/sbt -Dhadoop.version=2.0.0-mr1-cdh4.2.0 assembly
 
 For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
-with YARN, also set `SPARK_YARN=true`:
+with YARN, also set `-Pyarn`:
 
 # Apache Hadoop 2.0.5-alpha
-$ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt/sbt assembly
+$ sbt/sbt -Dhadoop.version=2.0.5-alpha -Pyarn assembly
 
 # Cloudera CDH 4.2.0 with MapReduce v2
-$ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_YARN=true sbt/sbt assembly
+$ sbt/sbt -Dhadoop.version=2.0.0-cdh4.2.0 -Pyarn assembly
 
 # Apache Hadoop 2.2.X and newer
-$ SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly
+$ sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
 
 When developing a Spark application, specify the Hadoop version by adding the
 "hadoop-client" artifact to your project's dependencies. For example, if you're

assembly/pom.xml

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@
 <packaging>pom</packaging>
 
 <properties>
+<sbt.project.name>assembly</sbt.project.name>
 <spark.jar.dir>scala-${scala.binary.version}</spark.jar.dir>
 <spark.jar.basename>spark-assembly-${project.version}-hadoop${hadoop.version}.jar</spark.jar.basename>
 <spark.jar>${project.build.directory}/${spark.jar.dir}/${spark.jar.basename}</spark.jar>

bagel/pom.xml

Lines changed: 3 additions & 0 deletions
@@ -27,6 +27,9 @@
 
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-bagel_2.10</artifactId>
+<properties>
+<sbt.project.name>bagel</sbt.project.name>
+</properties>
 <packaging>jar</packaging>
 <name>Spark Project Bagel</name>
 <url>http://spark.apache.org/</url>

bin/spark-class

Lines changed: 2 additions & 2 deletions
@@ -110,9 +110,9 @@ export JAVA_OPTS
 
 TOOLS_DIR="$FWDIR"/tools
 SPARK_TOOLS_JAR=""
-if [ -e "$TOOLS_DIR"/target/scala-$SCALA_VERSION/*assembly*[0-9Tg].jar ]; then
+if [ -e "$TOOLS_DIR"/target/scala-$SCALA_VERSION/spark-tools*[0-9Tg].jar ]; then
 # Use the JAR from the SBT build
-export SPARK_TOOLS_JAR=`ls "$TOOLS_DIR"/target/scala-$SCALA_VERSION/*assembly*[0-9Tg].jar`
+export SPARK_TOOLS_JAR=`ls "$TOOLS_DIR"/target/scala-$SCALA_VERSION/spark-tools*[0-9Tg].jar`
 fi
 if [ -e "$TOOLS_DIR"/target/spark-tools*[0-9Tg].jar ]; then
 # Use the JAR from the Maven build

core/pom.xml

Lines changed: 7 additions & 0 deletions
@@ -27,6 +27,9 @@
 
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-core_2.10</artifactId>
+<properties>
+<sbt.project.name>core</sbt.project.name>
+</properties>
 <packaging>jar</packaging>
 <name>Spark Project Core</name>
 <url>http://spark.apache.org/</url>
@@ -111,6 +114,10 @@
 <groupId>org.xerial.snappy</groupId>
 <artifactId>snappy-java</artifactId>
 </dependency>
+<dependency>
+<groupId>net.jpountz.lz4</groupId>
+<artifactId>lz4</artifactId>
+</dependency>
 <dependency>
 <groupId>com.twitter</groupId>
 <artifactId>chill_${scala.binary.version}</artifactId>
