Merged: changes from all 49 commits
199e59a
[SPARK-4881][Minor] Use SparkConf#getBoolean instead of get().toBoolean
sarutak Dec 24, 2014
29fabb1
SPARK-4297 [BUILD] Build warning fixes omnibus
srowen Dec 24, 2014
b4d0db8
[SPARK-4873][Streaming] Use `Future.zip` instead of `Future.flatMap`(…
zsxwing Dec 25, 2014
11dd993
[SPARK-4953][Doc] Fix the description of building Spark with YARN
sarutak Dec 25, 2014
08b18c7
Fix "Building Spark With Maven" link in README.md
dennyglee Dec 25, 2014
b6b6393
[EC2] Update default Spark version to 1.2.0
nchammas Dec 25, 2014
ac82785
[EC2] Update mesos/spark-ec2 branch to branch-1.3
nchammas Dec 25, 2014
f205fe4
[SPARK-4537][Streaming] Expand StreamingSource to add more metrics
jerryshao Dec 26, 2014
f9ed2b6
[SPARK-4608][Streaming] Reorganize StreamingContext implicit to impro…
zsxwing Dec 26, 2014
fda4331
SPARK-4971: Fix typo in BlockGenerator comment
CodingCat Dec 26, 2014
534f24b
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 27, 2014
de95c57
[SPARK-3787][BUILD] Assembly jar name is wrong when we build with sbt…
sarutak Dec 27, 2014
82bf4be
HOTFIX: Slight tweak on previous commit.
pwendell Dec 27, 2014
2483c1e
[SPARK-3955] Different versions between jackson-mapper-asl and jackso…
jongyoul Dec 27, 2014
786808a
[SPARK-4954][Core] add spark version infomation in log for standalone…
liyezhang556520 Dec 27, 2014
080ceb7
[SPARK-4952][Core]Handle ConcurrentModificationExceptions in SparkEnv…
witgo Dec 27, 2014
a3e51cc
[SPARK-4501][Core] - Create build/mvn to automatically download maven…
Dec 27, 2014
14fa87b
[SPARK-4966][YARN]The MemoryOverhead value is setted not correctly
XuTingjun Dec 29, 2014
6645e52
[SPARK-4982][DOC] `spark.ui.retainedJobs` description is wrong in Spa…
wangxiaojing Dec 29, 2014
4cef05e
Adde LICENSE Header to build/mvn, build/sbt and sbt/sbt
sarutak Dec 29, 2014
815de54
[SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.…
YanTangZhai Dec 29, 2014
8d72341
[Minor] Fix a typo of type parameter in JavaUtils.scala
sarutak Dec 29, 2014
02b55de
[SPARK-4409][MLlib] Additional Linear Algebra Utils
brkyvz Dec 29, 2014
9bc0df6
SPARK-4968: takeOrdered to skip reduce step in case mappers return no…
Dec 29, 2014
6cf6fdf
SPARK-4156 [MLLIB] EM algorithm for GMMs
tgaloppo Dec 29, 2014
343db39
Added setMinCount to Word2Vec.scala
ganonp Dec 29, 2014
040d6f2
[SPARK-4972][MLlib] Updated the scala doc for lasso and ridge regress…
Dec 30, 2014
9077e72
[SPARK-4920][UI] add version on master and worker page for standalone…
liyezhang556520 Dec 30, 2014
efa80a5
[SPARK-4882] Register PythonBroadcast with Kryo so that PySpark works…
JoshRosen Dec 30, 2014
480bd1d
[SPARK-4908][SQL] Prevent multiple concurrent hive native commands
marmbrus Dec 30, 2014
94d60b7
[SQL] enable view test
adrian-wang Dec 30, 2014
65357f1
[SPARK-4975][SQL] Fix HiveInspectorSuite test failure
scwf Dec 30, 2014
5595eaa
[SPARK-4959] [SQL] Attributes are case sensitive when using a select …
chenghao-intel Dec 30, 2014
63b84b7
[SPARK-4904] [SQL] Remove the unnecessary code change in Generic UDF
chenghao-intel Dec 30, 2014
daac221
[SPARK-5002][SQL] Using ascending by default when not specify order i…
scwf Dec 30, 2014
53f0a00
[Spark-4512] [SQL] Unresolved Attribute Exception in Sort By
chenghao-intel Dec 30, 2014
19a8802
[SPARK-4493][SQL] Tests for IsNull / IsNotNull in the ParquetFilterSuite
liancheng Dec 30, 2014
f7a41a0
[SPARK-4916][SQL][DOCS]Update SQL programming guide about cache section
luogankun Dec 30, 2014
2deac74
[SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE is e…
luogankun Dec 30, 2014
a75dd83
[SPARK-4928][SQL] Fix: Operator '>,<,>=,<=' with decimal between diff…
guowei2 Dec 30, 2014
61a99f6
[SPARK-4937][SQL] Normalizes conjunctions and disjunctions to elimina…
liancheng Dec 30, 2014
7425bec
[SPARK-4386] Improve performance when writing Parquet files
MickDavies Dec 30, 2014
8f29b7c
[SPARK-4935][SQL] When hive.cli.print.header configured, spark-sql ab…
scwf Dec 30, 2014
07fa191
[SPARK-4570][SQL]add BroadcastLeftSemiJoinHash
wangxiaojing Dec 30, 2014
b239ea1
SPARK-3955 part 2 [CORE] [HOTFIX] Different versions between jackson-…
srowen Dec 30, 2014
0f31992
[Spark-4995] Replace Vector.toBreeze.activeIterator with foreachActive
Dec 30, 2014
6a89782
[SPARK-4813][Streaming] Fix the issue that ContextWaiter didn't handl…
zsxwing Dec 30, 2014
035bac8
[SPARK-4998][MLlib]delete the "train" function
ljzzju Dec 30, 2014
352ed6b
[SPARK-1010] Clean up uses of System.setProperty in unit tests
JoshRosen Dec 31, 2014
7 changes: 5 additions & 2 deletions .gitignore
@@ -8,16 +8,19 @@
*.pyc
.idea/
.idea_modules/
-sbt/*.jar
+build/*.jar
.settings
.cache
+cache
.generated-mima*
-/build/
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
+build/apache-maven*
+build/zinc*
+build/scala*
conf/java-opts
conf/*.sh
conf/*.cmd
2 changes: 1 addition & 1 deletion README.md
@@ -26,7 +26,7 @@ To build Spark and its example programs, run:

(You do not need to do this if you downloaded a pre-built package.)
More detailed documentation is available from the project site, at
["Building Spark with Maven"](http://spark.apache.org/docs/latest/building-with-maven.html).
["Building Spark with Maven"](http://spark.apache.org/docs/latest/building-spark.html).

## Interactive Scala Shell

149 changes: 149 additions & 0 deletions build/mvn
@@ -0,0 +1,149 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Determine the current working directory
_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# Preserve the calling directory
_CALLING_DIR="$(pwd)"

# Installs any application tarball given a URL, the expected tarball name,
# and, optionally, a checkable binary path to determine if the binary has
# already been installed
## Arg1 - URL
## Arg2 - Tarball Name
## Arg3 - Checkable Binary
install_app() {
  local remote_tarball="$1/$2"
  local local_tarball="${_DIR}/$2"
  local binary="${_DIR}/$3"

  # set up `curl` and `wget` silent options if we're running on Jenkins
  local curl_opts=""
  local wget_opts=""
  if [ -n "$AMPLAB_JENKINS" ]; then
    curl_opts="-s"
    wget_opts="--quiet"
  else
    curl_opts="--progress-bar"
    wget_opts="--progress=bar:force"
  fi

  if [ -z "$3" -o ! -f "$binary" ]; then
    # check if we already have the tarball
    # check if we have curl installed
    # download application
    [ ! -f "${local_tarball}" ] && [ -n "`which curl 2>/dev/null`" ] && \
      echo "exec: curl ${curl_opts} ${remote_tarball}" && \
      curl ${curl_opts} "${remote_tarball}" > "${local_tarball}"
    # if the file still doesn't exist, let's try `wget` and cross our fingers
    [ ! -f "${local_tarball}" ] && [ -n "`which wget 2>/dev/null`" ] && \
      echo "exec: wget ${wget_opts} ${remote_tarball}" && \
      wget ${wget_opts} -O "${local_tarball}" "${remote_tarball}"
    # if both were unsuccessful, exit
    [ ! -f "${local_tarball}" ] && \
      echo -n "ERROR: Cannot download $2 with cURL or wget; " && \
      echo "please install manually and try again." && \
      exit 2
    cd "${_DIR}" && tar -xzf "$2"
    rm -rf "$local_tarball"
  fi
}

# Install maven under the build/ folder
install_mvn() {
  install_app \
    "http://apache.claz.org/maven/maven-3/3.2.3/binaries" \
    "apache-maven-3.2.3-bin.tar.gz" \
    "apache-maven-3.2.3/bin/mvn"
  MVN_BIN="${_DIR}/apache-maven-3.2.3/bin/mvn"
}

# Install zinc under the build/ folder
install_zinc() {
  local zinc_path="zinc-0.3.5.3/bin/zinc"
  # note: check against the install dir, not the calling dir
  [ ! -f "${_DIR}/${zinc_path}" ] && ZINC_INSTALL_FLAG=1
  install_app \
    "http://downloads.typesafe.com/zinc/0.3.5.3" \
    "zinc-0.3.5.3.tgz" \
    "${zinc_path}"
  ZINC_BIN="${_DIR}/${zinc_path}"
}

# Determine the Scala version from the root pom.xml file, set the Scala URL,
# and, with that, download the specific version of Scala necessary under
# the build/ folder
install_scala() {
  # determine the Scala version used in Spark
  local scala_version=`grep "scala.version" "${_DIR}/../pom.xml" | \
    head -1 | cut -f2 -d'>' | cut -f1 -d'<'`
  local scala_bin="${_DIR}/scala-${scala_version}/bin/scala"

  install_app \
    "http://downloads.typesafe.com/scala/${scala_version}" \
    "scala-${scala_version}.tgz" \
    "scala-${scala_version}/bin/scala"

  SCALA_COMPILER="$(cd "$(dirname ${scala_bin})/../lib" && pwd)/scala-compiler.jar"
  SCALA_LIBRARY="$(cd "$(dirname ${scala_bin})/../lib" && pwd)/scala-library.jar"
}
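# (illustration, assuming the root pom.xml declares e.g.
#  <scala.version>2.10.4</scala.version>: the grep/cut pipeline above yields
#  scala_version=2.10.4, so scala-2.10.4.tgz is fetched and SCALA_COMPILER /
#  SCALA_LIBRARY end up pointing into build/scala-2.10.4/lib)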

# Determines if a given application is already installed. If not, will attempt
# to install
## Arg1 - application name
## Arg2 - Alternate path to local install under build/ dir
check_and_install_app() {
  # create the local environment variable in uppercase
  local app_bin="`echo $1 | awk '{print toupper(\$0)}'`_BIN"
  # some black magic to set the generated app variable (i.e. MVN_BIN) into the
  # environment
  eval "${app_bin}=`which $1 2>/dev/null`"

  if [ -z "`which $1 2>/dev/null`" ]; then
    install_$1
  fi
}
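# (worked example, assuming mvn is absent from PATH: check_and_install_app "mvn"
#  builds the name app_bin="MVN_BIN", the eval leaves MVN_BIN empty because
#  `which mvn` finds nothing, and install_mvn is invoked to download Maven
#  under build/ and point MVN_BIN at it)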

# Set up a healthy default for the Zinc port if none was provided from
# the environment
ZINC_PORT=${ZINC_PORT:-"3030"}

# Check and install all applications necessary to build Spark
check_and_install_app "mvn"

# Install the proper version of Scala and Zinc for the build
install_zinc
install_scala

# Reset the current working directory
cd "${_CALLING_DIR}"

# Now that zinc is ensured to be installed, check its status and, if it's
# not running or was just installed, start it
if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`${ZINC_BIN} -status`" ]; then
  ${ZINC_BIN} -shutdown
  ${ZINC_BIN} -start -port ${ZINC_PORT} \
    -scala-compiler "${SCALA_COMPILER}" \
    -scala-library "${SCALA_LIBRARY}" &>/dev/null
fi
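# (a hypothetical manual check, assuming the default install location:
#  `build/zinc-0.3.5.3/bin/zinc -status` inspects the server started above,
#  which listens on ZINC_PORT, 3030 unless overridden in the environment)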

# Set any `mvn` options if not already present
export MAVEN_OPTS=${MAVEN_OPTS:-"-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"}

# Last, call the `mvn` command as usual
${MVN_BIN} "$@"
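Taken together, build/mvn resolves or installs mvn, zinc, and Scala under build/, starts zinc, and hands every argument to the resolved mvn. A minimal usage sketch (the goals and values shown are illustrative, not part of this commit):

  # full build through the wrapper; all arguments pass straight to mvn
  build/mvn -DskipTests clean package

  # override the zinc port and Maven JVM options for one invocation
  ZINC_PORT=3031 MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M" build/mvn compile

Both variables are read from the environment before the script's defaults apply.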
128 changes: 128 additions & 0 deletions build/sbt
@@ -0,0 +1,128 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# When creating new tests for Spark SQL Hive, the HADOOP_CLASSPATH must contain the hive jars so
# that we can run Hive to generate the golden answer. This is not required for normal development
# or testing.
for i in "$HIVE_HOME"/lib/*
do HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$i"
done
export HADOOP_CLASSPATH

realpath () {
(
  TARGET_FILE="$1"

  cd "$(dirname "$TARGET_FILE")"
  TARGET_FILE="$(basename "$TARGET_FILE")"

  COUNT=0
  while [ -L "$TARGET_FILE" -a $COUNT -lt 100 ]
  do
    TARGET_FILE="$(readlink "$TARGET_FILE")"
    cd "$(dirname "$TARGET_FILE")"
    TARGET_FILE="$(basename "$TARGET_FILE")"
    COUNT=$(($COUNT + 1))
  done

  echo "$(pwd -P)/$TARGET_FILE"
)
}

. "$(dirname "$(realpath "$0")")"/sbt-launch-lib.bash


declare -r noshare_opts="-Dsbt.global.base=project/.sbtboot -Dsbt.boot.directory=project/.boot -Dsbt.ivy.home=project/.ivy"
declare -r sbt_opts_file=".sbtopts"
declare -r etc_sbt_opts_file="/etc/sbt/sbtopts"

usage() {
  cat <<EOM
Usage: $script_name [options]

  -h | -help         print this message
  -v | -verbose      this runner is chattier
  -d | -debug        set sbt log level to debug
  -no-colors         disable ANSI color codes
  -sbt-create        start sbt even if current directory contains no sbt project
  -sbt-dir   <path>  path to global settings/plugins directory (default: ~/.sbt)
  -sbt-boot  <path>  path to shared boot directory (default: ~/.sbt/boot in 0.11 series)
  -ivy       <path>  path to local Ivy repository (default: ~/.ivy2)
  -mem    <integer>  set memory options (default: $sbt_mem, which is $(get_mem_opts $sbt_mem))
  -no-share          use all local caches; no sharing
  -no-global         uses global caches, but does not use global ~/.sbt directory.
  -jvm-debug <port>  Turn on JVM debugging, open at the given port.
  -batch             Disable interactive mode

  # sbt version (default: from project/build.properties if present, else latest release)
  -sbt-version <version>   use the specified version of sbt
  -sbt-jar <path>          use the specified jar as the sbt launcher
  -sbt-rc                  use an RC version of sbt
  -sbt-snapshot            use a snapshot version of sbt

  # java version (default: java from PATH, currently $(java -version 2>&1 | grep version))
  -java-home <path>        alternate JAVA_HOME

  # jvm options and output control
  JAVA_OPTS          environment variable, if unset uses "$java_opts"
  SBT_OPTS           environment variable, if unset uses "$default_sbt_opts"
  .sbtopts           if this file exists in the current directory, it is
                     prepended to the runner args
  /etc/sbt/sbtopts   if this file exists, it is prepended to the runner args
  -Dkey=val          pass -Dkey=val directly to the java runtime
  -J-X               pass option -X directly to the java runtime
                     (-J is stripped)
  -S-X               add -X to sbt's scalacOptions (-S is stripped)
  -PmavenProfiles    Enable a maven profile for the build.

In the case of duplicated or conflicting options, the order above
shows precedence: JAVA_OPTS lowest, command line options highest.
EOM
}

process_my_args () {
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -no-colors) addJava "-Dsbt.log.noformat=true" && shift ;;
      -no-share) addJava "$noshare_opts" && shift ;;
      -no-global) addJava "-Dsbt.global.base=$(pwd)/project/.sbtboot" && shift ;;
      -sbt-boot) require_arg path "$1" "$2" && addJava "-Dsbt.boot.directory=$2" && shift 2 ;;
      -sbt-dir) require_arg path "$1" "$2" && addJava "-Dsbt.global.base=$2" && shift 2 ;;
      -debug-inc) addJava "-Dxsbt.inc.debug=true" && shift ;;
      -batch) exec </dev/null && shift ;;

      -sbt-create) sbt_create=true && shift ;;

      *) addResidual "$1" && shift ;;
    esac
  done

  # Now, ensure sbt version is used.
  [[ "${sbt_version}XXX" != "XXX" ]] && addJava "-Dsbt.version=$sbt_version"
}

loadConfigFile() {
  cat "$1" | sed '/^\#/d'
}

# if sbtopts files exist, prepend their contents to $@ so it can be processed by this runner
[[ -f "$etc_sbt_opts_file" ]] && set -- $(loadConfigFile "$etc_sbt_opts_file") "$@"
[[ -f "$sbt_opts_file" ]] && set -- $(loadConfigFile "$sbt_opts_file") "$@"

run "$@"
6 changes: 3 additions & 3 deletions sbt/sbt-launch-lib.bash → build/sbt-launch-lib.bash
@@ -37,10 +37,10 @@ dlog () {
}

acquire_sbt_jar () {
-SBT_VERSION=`awk -F "=" '/sbt\\.version/ {print $2}' ./project/build.properties`
+SBT_VERSION=`awk -F "=" '/sbt\.version/ {print $2}' ./project/build.properties`
URL1=http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
URL2=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
-JAR=sbt/sbt-launch-${SBT_VERSION}.jar
+JAR=build/sbt-launch-${SBT_VERSION}.jar

sbt_jar=$JAR

@@ -150,7 +150,7 @@ process_args () {
-java-home) require_arg path "$1" "$2" && java_cmd="$2/bin/java" && export JAVA_HOME=$2 && shift 2 ;;

-D*) addJava "$1" && shift ;;
- -J*) addJava "${1:2}" && shift ;;
+  -J*) addJava "${1:2}" && shift ;;
-P*) enableProfile "$1" && shift ;;
*) addResidual "$1" && shift ;;
esac
Expand Down
4 changes: 2 additions & 2 deletions core/pom.xml
@@ -352,9 +352,9 @@
</execution>
</executions>
<configuration>
-<tasks>
+<target>
<unzip src="../python/lib/py4j-0.8.2.1-src.zip" dest="../python/build" />
-</tasks>
+</target>
</configuration>
</plugin>
<plugin>
5 changes: 3 additions & 2 deletions core/src/main/scala/org/apache/spark/MapOutputTracker.scala
@@ -76,6 +76,8 @@ private[spark] class MapOutputTrackerMasterActor(tracker: MapOutputTrackerMaster
*/
private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging {
private val timeout = AkkaUtils.askTimeout(conf)
+private val retryAttempts = AkkaUtils.numRetries(conf)
+private val retryIntervalMs = AkkaUtils.retryWaitMs(conf)

/** Set to the MapOutputTrackerActor living on the driver. */
var trackerActor: ActorRef = _
@@ -108,8 +110,7 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging
*/
protected def askTracker(message: Any): Any = {
try {
-val future = trackerActor.ask(message)(timeout)
-Await.result(future, timeout)
+AkkaUtils.askWithReply(message, trackerActor, retryAttempts, retryIntervalMs, timeout)
} catch {
case e: Exception =>
logError("Error communicating with MapOutputTracker", e)
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/SecurityManager.scala
@@ -151,8 +151,8 @@ private[spark] class SecurityManager(sparkConf: SparkConf) extends Logging with

private val authOn = sparkConf.getBoolean("spark.authenticate", false)
// keep spark.ui.acls.enable for backwards compatibility with 1.0
-private var aclsOn = sparkConf.getOption("spark.acls.enable").getOrElse(
-  sparkConf.get("spark.ui.acls.enable", "false")).toBoolean
+private var aclsOn =
+  sparkConf.getBoolean("spark.acls.enable", sparkConf.getBoolean("spark.ui.acls.enable", false))

// admin acls should be set before view or modify acls
private var adminAcls: Set[String] =
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/SparkEnv.scala
@@ -395,7 +395,7 @@ object SparkEnv extends Logging {
val sparkProperties = (conf.getAll ++ schedulerMode).sorted

// System properties that are not java classpaths
-val systemProperties = System.getProperties.iterator.toSeq
+val systemProperties = Utils.getSystemProperties.toSeq
val otherProperties = systemProperties.filter { case (k, _) =>
k != "java.class.path" && !k.startsWith("spark.")
}.sorted
core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala
@@ -80,7 +80,7 @@ private[spark] object JavaUtils {
prev match {
case Some(k) =>
underlying match {
-case mm: mutable.Map[a, _] =>
+case mm: mutable.Map[A, _] =>
mm remove k
prev = None
case _ =>
core/src/main/scala/org/apache/spark/deploy/master/Master.scala
@@ -123,6 +123,7 @@ private[spark] class Master(

override def preStart() {
logInfo("Starting Spark master at " + masterUrl)
logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
// Listen for remote client disconnection events, since they don't go through Akka's watch()
context.system.eventStream.subscribe(self, classOf[RemotingLifecycleEvent])
webUi.bind()