Update #3

Merged 163 commits on Aug 20, 2014.

Commits

4878911
[SPARK-2875] [PySpark] [SQL] handle null in schemaRDD()
davies Aug 6, 2014
a6cd311
[SPARK-2678][Core][SQL] A workaround for SPARK-2678
liancheng Aug 6, 2014
d614967
[SPARK-2627] [PySpark] have the build enforce PEP 8 automatically
nchammas Aug 6, 2014
4e98236
SPARK-2566. Update ShuffleWriteMetrics incrementally
sryza Aug 6, 2014
25cff10
[SPARK-2852][MLLIB] API consistency for `mllib.feature`
mengxr Aug 6, 2014
e537b33
[PySpark] Add blanklines to Python docstrings so example code renders…
rnowling Aug 6, 2014
c6889d2
[HOTFIX][Streaming] Handle port collisions in flume polling test
andrewor14 Aug 6, 2014
4e00833
SPARK-2882: Spark build now checks local maven cache for dependencies
GregOwen Aug 6, 2014
17caae4
[SPARK-2583] ConnectionManager error reporting
sarutak Aug 7, 2014
4201d27
SPARK-2879 [BUILD] Use HTTPS to access Maven Central and other repos
srowen Aug 7, 2014
a263a7e
HOTFIX: Support custom Java 7 location
pwendell Aug 7, 2014
ffd1f59
[SPARK-2887] fix bug of countApproxDistinct() when have more than one…
davies Aug 7, 2014
47ccd5e
[SPARK-2851] [mllib] DecisionTree Python consistency update
jkbradley Aug 7, 2014
75993a6
SPARK-2879 part 2 [BUILD] Use HTTPS to access Maven Central and other…
srowen Aug 7, 2014
8d1dec4
[mllib] DecisionTree Strategy parameter checks
jkbradley Aug 7, 2014
b9e9e53
[SPARK-2852][MLLIB] Separate model from IDF/StandardScaler algorithms
mengxr Aug 7, 2014
80ec5ba
SPARK-2905 Fixed path sbin => bin
dosoft Aug 7, 2014
32096c2
SPARK-2899 Doc generation is back to working in new SBT Build.
ScrapCodes Aug 7, 2014
6906b69
SPARK-2787: Make sort-based shuffle write files directly when there's…
mateiz Aug 8, 2014
4c51098
SPARK-2565. Update ShuffleReadMetrics as blocks are fetched
sryza Aug 8, 2014
9de6a42
[SPARK-2904] Remove non-used local variable in SparkSubmitArguments
sarutak Aug 8, 2014
9a54de1
[SPARK-2911]: provide rdd.parent[T](j) to obtain jth parent RDD
erikerlandson Aug 8, 2014
9016af3
[SPARK-2888] [SQL] Fix addColumnMetadataToConf in HiveTableScan
yhuai Aug 8, 2014
0489cee
[SPARK-2908] [SQL] JsonRDD.nullTypeToStringType does not convert all …
yhuai Aug 8, 2014
c874723
[SPARK-2877] [SQL] MetastoreRelation should use SparkClassLoader when…
yhuai Aug 8, 2014
45d8f4d
[SPARK-2919] [SQL] Basic support for analyze command in HiveQl
yhuai Aug 8, 2014
b7c89a7
[SPARK-2700] [SQL] Hidden files (such as .impala_insert_staging) shou…
chutium Aug 8, 2014
74d6f62
[SPARK-1997][MLLIB] update breeze to 0.9
mengxr Aug 8, 2014
ec79063
[SPARK-2897][SPARK-2920]TorrentBroadcast does use the serializer clas…
witgo Aug 8, 2014
1c84dba
[Web UI]Make decision order of Worker's WebUI port consistent with Ma…
WangTaoTheTonic Aug 9, 2014
43af281
[SPARK-2911] apply parent[T](j) to clarify UnionRDD code
erikerlandson Aug 9, 2014
28dbae8
[SPARK-2635] Fix race condition at SchedulerBackend.isReady in standa…
li-zhihui Aug 9, 2014
b431e67
[SPARK-2861] Fix Doc comment of histogram method
Aug 9, 2014
e45daf2
[SPARK-1766] sorted functions to meet pedantic requirements
Aug 10, 2014
4f4a988
[SPARK-2894] spark-shell doesn't accept flags
sarutak Aug 10, 2014
5b6585d
Updated Spark SQL README to include the hive-thriftserver module
rxin Aug 10, 2014
482c5af
Turn UpdateBlockInfo into case class.
rxin Aug 10, 2014
3570119
Remove extra semicolon in Task.scala
witgo Aug 10, 2014
1d03a26
[SPARK-2950] Add gc time and shuffle write time to JobLogger
shivaram Aug 10, 2014
28dcbb5
[SPARK-2898] [PySpark] fix bugs in deamon.py
davies Aug 10, 2014
b715aa0
[SPARK-2937] Separate out samplyByKeyExact as its own API in PairRDDF…
dorx Aug 10, 2014
ba28a8f
[SPARK-2936] Migrate Netty network module from Java to Scala
rxin Aug 11, 2014
db06a81
[PySpark] [SPARK-2954] [SPARK-2948] [SPARK-2910] [SPARK-2101] Python …
JoshRosen Aug 11, 2014
3733866
[SPARK-2952] Enable logging actor messages at DEBUG level
rxin Aug 11, 2014
7712e72
[SPARK-2931] In TaskSetManager, reset currentLocalityIndex after reco…
JoshRosen Aug 12, 2014
32638b5
[SPARK-2515][mllib] Chi Squared test
dorx Aug 12, 2014
6fab941
[SPARK-2934][MLlib] Adding LogisticRegressionWithLBFGS Interface
Aug 12, 2014
490ecfa
[SPARK-2844][SQL] Correctly set JVM HiveContext if it is passed into …
ahirreddy Aug 12, 2014
21a95ef
[SPARK-2590][SQL] Added option to handle incremental collection, disa…
liancheng Aug 12, 2014
e83fdcd
[sql]use SparkSQLEnv.stop() in ShutdownHook
scwf Aug 12, 2014
647aeba
[SQL] A tiny refactoring in HiveContext#analyze
yhuai Aug 12, 2014
c9c89c3
[SPARK-2965][SQL] Fix HashOuterJoin output nullabilities.
ueshin Aug 12, 2014
c686b7d
[SPARK-2968][SQL] Fix nullabilities of Explode.
ueshin Aug 12, 2014
bad21ed
[SPARK-2650][SQL] Build column buffers in smaller batches
marmbrus Aug 12, 2014
5d54d71
[SQL] [SPARK-2826] Reduce the memory copy while building the hashmap …
chenghao-intel Aug 12, 2014
9038d94
[SPARK-2923][MLLIB] Implement some basic BLAS routines
mengxr Aug 12, 2014
f0060b7
[MLlib] Correctly set vectorSize and alpha
Ishiihara Aug 12, 2014
882da57
fix flaky tests
davies Aug 12, 2014
c235b83
SPARK-2830 [MLlib]: re-organize mllib documentation
atalwalkar Aug 13, 2014
676f982
[SPARK-2953] Allow using short names for io compression codecs
rxin Aug 13, 2014
246cb3f
Use transferTo when copy merge files in ExternalSorter
colorant Aug 13, 2014
2bd8126
[SPARK-1777 (partial)] bugfix: make size of requested memory correctly
liyezhang556520 Aug 13, 2014
fe47359
[SPARK-2993] [MLLib] colStats (wrapper around MultivariateStatistical…
dorx Aug 13, 2014
869f06c
[SPARK-2963] [SQL] There no documentation about building to use HiveS…
sarutak Aug 13, 2014
c974a71
[SPARK-3013] [SQL] [PySpark] convert array into list
davies Aug 13, 2014
434bea1
[SPARK-2983] [PySpark] improve performance of sortByKey()
davies Aug 13, 2014
7ecb867
[MLLIB] use Iterator.fill instead of Array.fill
mengxr Aug 13, 2014
bdc7a1a
[SPARK-3004][SQL] Added null checking when retrieving row set
liancheng Aug 13, 2014
13f54e2
[SPARK-2817] [SQL] add "show create table" support
tianyi Aug 13, 2014
9256d4a
[SPARK-2994][SQL] Support for udfs that take complex types
marmbrus Aug 14, 2014
376a82e
[SPARK-2650][SQL] More precise initial buffer size estimation for in-…
liancheng Aug 14, 2014
9fde1ff
[SPARK-2935][SQL]Fix parquet predicate push down bug
marmbrus Aug 14, 2014
905dc4b
[SPARK-2970] [SQL] spark-sql script ends with IOException when EventL…
sarutak Aug 14, 2014
63d6777
[SPARK-2986] [SQL] fixed: setting properties does not effect
Aug 14, 2014
0c7b452
SPARK-3020: Print completed indices rather than tasks in web UI
pwendell Aug 14, 2014
9497b12
[SPARK-3006] Failed to execute spark-shell in Windows OS
tsudukim Aug 14, 2014
e424565
[Docs] Add missing <code> tags (minor)
andrewor14 Aug 14, 2014
69a57a1
[SPARK-2995][MLLIB] add ALS.setIntermediateRDDStorageLevel
mengxr Aug 14, 2014
d069c5d
[SPARK-3029] Disable local execution of Spark jobs by default
aarondav Aug 14, 2014
6b8de0e
SPARK-2893: Do not swallow Exceptions when running a custom kryo regi…
GrahamDennis Aug 14, 2014
078f3fb
[SPARK-3011][SQL] _temporary directory should be filtered out by sqlC…
josephsu Aug 14, 2014
add75d4
[SPARK-2927][SQL] Add a conf to configure if we always read Binary co…
yhuai Aug 14, 2014
fde692b
[SQL] Python JsonRDD UTF8 Encoding Fix
ahirreddy Aug 14, 2014
267fdff
[SPARK-2925] [sql]fix spark-sql and start-thriftserver shell bugs whe…
scwf Aug 14, 2014
eaeb0f7
Minor cleanup of metrics.Source
rxin Aug 14, 2014
9622106
[SPARK-2979][MLlib] Improve the convergence rate by minimizing the co…
Aug 14, 2014
a7f8a4f
Revert [SPARK-3011][SQL] _temporary directory should be filtered out…
marmbrus Aug 14, 2014
a75bc7a
SPARK-3009: Reverted readObject method in ApplicationInfo so that App…
jacek-lewandowski Aug 14, 2014
fa5a08e
Make dev/mima runnable on Mac OS X.
rxin Aug 14, 2014
655699f
[SPARK-3027] TaskContext: tighten visibility and provide Java friendl…
rxin Aug 15, 2014
3a8b68b
[SPARK-2468] Netty based block server / client module
rxin Aug 15, 2014
9422a9b
[SPARK-2736] PySpark converter and example script for reading Avro files
kanzhang Aug 15, 2014
500f84e
[SPARK-2912] [Spark QA] Include commit hash in Spark QA messages
nchammas Aug 15, 2014
e1b85f3
SPARK-2955 [BUILD] Test code fails to compile with "mvn compile" with…
srowen Aug 15, 2014
fba8ec3
Add caching information to rdd.toDebugString
Aug 15, 2014
7589c39
[SPARK-2924] remove default args to overloaded methods
avati Aug 15, 2014
fd9fcd2
Revert "[SPARK-2468] Netty based block server / client module"
pwendell Aug 15, 2014
0afe5cb
SPARK-3028. sparkEventToJson should support SparkListenerExecutorMetr…
sryza Aug 15, 2014
c703229
[SPARK-3022] [SPARK-3041] [mllib] Call findBins once per level + unor…
jkbradley Aug 15, 2014
cc36487
[SPARK-3046] use executor's class loader as the default serializer cl…
rxin Aug 16, 2014
5d25c0b
[SPARK-3078][MLLIB] Make LRWithLBFGS API consistent with others
mengxr Aug 16, 2014
2e069ca
[SPARK-3001][MLLIB] Improve Spearman's correlation
mengxr Aug 16, 2014
c9da466
[SPARK-3015] Block on cleaning tasks to prevent Akka timeouts
andrewor14 Aug 16, 2014
a83c772
[SPARK-3045] Make Serializer interface Java friendly
rxin Aug 16, 2014
20fcf3d
[SPARK-2977] Ensure ShuffleManager is created before ShuffleBlockManager
JoshRosen Aug 16, 2014
b4a0592
[SQL] Using safe floating-point numbers in doctest
liancheng Aug 16, 2014
4bdfaa1
[SPARK-3076] [Jenkins] catch & report test timeouts
nchammas Aug 16, 2014
76fa0ea
[SPARK-2677] BasicBlockFetchIterator#next can wait forever
sarutak Aug 16, 2014
7e70708
[SPARK-3048][MLLIB] add LabeledPoint.parse and remove loadStreamingLa…
mengxr Aug 16, 2014
ac6411c
[SPARK-3081][MLLIB] rename RandomRDDGenerators to RandomRDDs
mengxr Aug 16, 2014
379e758
[SPARK-3035] Wrong example with SparkContext.addFile
iAmGhost Aug 16, 2014
2fc8aca
[SPARK-1065] [PySpark] improve supporting for large broadcast
davies Aug 16, 2014
bc95fe0
In the stop method of ConnectionManager to cancel the ackTimeoutMonitor
witgo Aug 17, 2014
fbad722
[SPARK-3077][MLLIB] fix some chisq-test
mengxr Aug 17, 2014
73ab7f1
[SPARK-3042] [mllib] DecisionTree Filter top-down instead of bottom-up
jkbradley Aug 17, 2014
318e28b
SPARK-2881. Upgrade snappy-java to 1.1.1.3.
pwendell Aug 18, 2014
5ecb08e
Revert "[SPARK-2970] [SQL] spark-sql script ends with IOException whe…
marmbrus Aug 18, 2014
bfa09b0
[SQL] Improve debug logging and toStrings.
marmbrus Aug 18, 2014
9924328
[SPARK-1981] updated streaming-kinesis.md
cfregly Aug 18, 2014
95470a0
[HOTFIX][STREAMING] Allow the JVM/Netty to decide which port to bind …
harishreedharan Aug 18, 2014
c77f406
[SPARK-3087][MLLIB] fix col indexing bug in chi-square and add a chec…
mengxr Aug 18, 2014
5173f3c
SPARK-2884: Create binary builds in parallel with release script.
pwendell Aug 18, 2014
df652ea
SPARK-2900. aggregate inputBytes per stage
sryza Aug 18, 2014
3c8fa50
[SPARK-3097][MLlib] Word2Vec performance improvement
Ishiihara Aug 18, 2014
eef779b
[SPARK-2842][MLlib]Word2Vec documentation
Ishiihara Aug 18, 2014
9306b8c
[MLlib] Remove transform(dataset: RDD[String]) from Word2Vec public API
Ishiihara Aug 18, 2014
c0cbbde
SPARK-3093 : masterLock in Worker is no longer need
CrazyJvm Aug 18, 2014
f45efbb
[SPARK-2862] histogram method fails on some choices of bucketCount
Aug 18, 2014
7ae28d1
SPARK-3096: Include parquet hive serde by default in build
pwendell Aug 18, 2014
6a13dca
[SPARK-3084] [SQL] Collect broadcasted tables in parallel in joins
mateiz Aug 18, 2014
4bf3de7
[SPARK-3085] [SQL] Use compact data structures in SQL joins
mateiz Aug 18, 2014
6bca889
SPARK-3025 [SQL]: Allow JDBC clients to set a fair scheduler pool
pwendell Aug 18, 2014
9eb74c7
[SPARK-3091] [SQL] Add support for caching metadata on Parquet files
mateiz Aug 18, 2014
3abd0c1
[SPARK-2406][SQL] Initial support for using ParquetTableScan to read …
marmbrus Aug 18, 2014
66ade00
[SPARK-2169] Don't copy appName / basePath everywhere.
Aug 18, 2014
3a5962f
Removed .travis.yml file since we are not using Travis.
rxin Aug 18, 2014
d1d0ee4
[SPARK-3103] [PySpark] fix saveAsTextFile() with utf-8
davies Aug 18, 2014
6201b27
[SPARK-2718] [yarn] Handle quotes and other characters in user args.
Aug 18, 2014
115eeb3
[mllib] DecisionTree: treeAggregate + Python example bug fix
jkbradley Aug 18, 2014
c8b16ca
[SPARK-2850] [SPARK-2626] [mllib] MLlib stats examples + small fixes
jkbradley Aug 19, 2014
217b5e9
[SPARK-3108][MLLIB] add predictOnValues to StreamingLR and fix predictOn
mengxr Aug 19, 2014
1f1819b
[SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL.
JoshRosen Aug 19, 2014
8257733
[SPARK-3116] Remove the excessive lockings in TorrentBroadcast
rxin Aug 19, 2014
cd0720c
Fix typo in decision tree docs
emef Aug 19, 2014
7eb9cbc
[SPARK-3072] YARN - Exit when reach max number failed executors
tgravescs Aug 19, 2014
cbfc26b
[SPARK-3089] Fix meaningless error message in ConnectionManager
sarutak Aug 19, 2014
31f0b07
[SPARK-3128][MLLIB] Use streaming test suite for StreamingLR
freeman-lab Aug 19, 2014
94053a7
SPARK-2333 - spark_ec2 script should allow option for existing securi…
vidaha Aug 19, 2014
76eaeb4
Move a bracket in validateSettings of SparkConf
SaintBacchus Aug 19, 2014
d7e80c2
[SPARK-2790] [PySpark] fix zip with serializers which have different …
davies Aug 19, 2014
825d4fe
[SPARK-3136][MLLIB] Create Java-friendly methods in RandomRDDs
mengxr Aug 19, 2014
8b9dc99
[SPARK-2468] Netty based block server / client module
rxin Aug 20, 2014
1870dba
[MLLIB] minor update to word2vec
mengxr Aug 20, 2014
c7252b0
[SPARK-3112][MLLIB] Add documentation and example for StreamingLR
freeman-lab Aug 20, 2014
0e3ab94
[SQL] add note of use synchronizedMap in SQLConf
scwf Aug 20, 2014
068b6fe
[SPARK-3130][MLLIB] detect negative values in naive Bayes
mengxr Aug 20, 2014
fce5c0f
[HOTFIX][Streaming][MLlib] use temp folder for checkpoint
mengxr Aug 20, 2014
8adfbc2
[SPARK-3119] Re-implementation of TorrentBroadcast.
rxin Aug 20, 2014
0a984aa
[SPARK-3142][MLLIB] output shuffle data directly in Word2Vec
mengxr Aug 20, 2014
ebcb94f
[SPARK-2974] [SPARK-2975] Fix two bugs related to spark.local.dirs
JoshRosen Aug 20, 2014
8a74e4b
[DOCS] Fixed wrong links
giwa Aug 20, 2014
0a7ef63
[SPARK-3141] [PySpark] fix sortByKey() with take()
davies Aug 20, 2014
8c5a222
[SPARK-3054][STREAMING] Add unit tests for Spark Sink.
harishreedharan Aug 20, 2014
1 change: 1 addition & 0 deletions .rat-excludes

@@ -25,6 +25,7 @@ log4j-defaults.properties
 bootstrap-tooltip.js
 jquery-1.11.1.min.js
 sorttable.js
+.*avsc
 .*txt
 .*json
 .*data
9 changes: 9 additions & 0 deletions README.md

@@ -115,6 +115,15 @@ If your project is built with Maven, add this to your POM file's `<dependencies>`
 </dependency>


+## A Note About Thrift JDBC server and CLI for Spark SQL
+
+Spark SQL supports Thrift JDBC server and CLI.
+See sql-programming-guide.md for more information about those features.
+You can use those features by setting `-Phive-thriftserver` when building Spark as follows.
+
+    $ sbt/sbt -Phive-thriftserver assembly
+
+
 ## Configuration

 Please refer to the [Configuration guide](http://spark.apache.org/docs/latest/configuration.html)
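
For context (not part of this changeset): once Spark is built with `-Phive-thriftserver`, the server and CLI are started through the scripts touched below. A hedged sketch, assuming the default HiveServer2 port of 10000:

    $ ./sbin/start-thriftserver.sh                    # start the Thrift JDBC server
    $ ./bin/beeline -u jdbc:hive2://localhost:10000   # connect to it with BeeLine
    $ ./bin/spark-sql                                 # or run the SQL CLI directly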
29 changes: 7 additions & 22 deletions bin/beeline

@@ -17,29 +17,14 @@
 # limitations under the License.
 #

-# Figure out where Spark is installed
-FWDIR="$(cd `dirname $0`/..; pwd)"
+#
+# Shell script for starting BeeLine

-# Find the java binary
-if [ -n "${JAVA_HOME}" ]; then
-  RUNNER="${JAVA_HOME}/bin/java"
-else
-  if [ `command -v java` ]; then
-    RUNNER="java"
-  else
-    echo "JAVA_HOME is not set" >&2
-    exit 1
-  fi
-fi
+# Enter posix mode for bash
+set -o posix

-# Compute classpath using external script
-classpath_output=$($FWDIR/bin/compute-classpath.sh)
-if [[ "$?" != "0" ]]; then
-  echo "$classpath_output"
-  exit 1
-else
-  CLASSPATH=$classpath_output
-fi
+# Figure out where Spark is installed
+FWDIR="$(cd `dirname $0`/..; pwd)"

 CLASS="org.apache.hive.beeline.BeeLine"
-exec "$RUNNER" -cp "$CLASSPATH" $CLASS "$@"
+exec "$FWDIR/bin/spark-class" $CLASS "$@"
18 changes: 14 additions & 4 deletions bin/pyspark

@@ -23,12 +23,18 @@ FWDIR="$(cd `dirname $0`/..; pwd)"
 # Export this as SPARK_HOME
 export SPARK_HOME="$FWDIR"

+source $FWDIR/bin/utils.sh
+
 SCALA_VERSION=2.10

-if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
+function usage() {
   echo "Usage: ./bin/pyspark [options]" 1>&2
   $FWDIR/bin/spark-submit --help 2>&1 | grep -v Usage 1>&2
   exit 0
+}
+
+if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
+  usage
 fi

 # Exit if the user hasn't compiled Spark

@@ -66,10 +72,11 @@ fi
 # Build up arguments list manually to preserve quotes and backslashes.
 # We export Spark submit arguments as an environment variable because shell.py must run as a
 # PYTHONSTARTUP script, which does not take in arguments. This is required for IPython notebooks.
-
+SUBMIT_USAGE_FUNCTION=usage
+gatherSparkSubmitOpts "$@"
 PYSPARK_SUBMIT_ARGS=""
 whitespace="[[:space:]]"
-for i in "$@"; do
+for i in "${SUBMISSION_OPTS[@]}"; do
   if [[ $i =~ \" ]]; then i=$(echo $i | sed 's/\"/\\\"/g'); fi
   if [[ $i =~ $whitespace ]]; then i=\"$i\"; fi
   PYSPARK_SUBMIT_ARGS="$PYSPARK_SUBMIT_ARGS $i"

@@ -90,7 +97,10 @@ fi
 if [[ "$1" =~ \.py$ ]]; then
   echo -e "\nWARNING: Running python applications through ./bin/pyspark is deprecated as of Spark 1.0." 1>&2
   echo -e "Use ./bin/spark-submit <python file>\n" 1>&2
-  exec $FWDIR/bin/spark-submit "$@"
+  primary=$1
+  shift
+  gatherSparkSubmitOpts "$@"
+  exec $FWDIR/bin/spark-submit "${SUBMISSION_OPTS[@]}" $primary "${APPLICATION_OPTS[@]}"
 else
   # Only use ipython if no command line arguments were provided [SPARK-1134]
   if [[ "$IPYTHON" = "1" ]]; then
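
For illustration (hypothetical invocation, not part of the diff), the deprecated .py path above now splits arguments around the primary resource:

    # app.py becomes $primary; --master local[2] is gathered into SUBMISSION_OPTS;
    # the remaining arg1 stays in APPLICATION_OPTS
    $ ./bin/pyspark app.py --master local[2] arg1
    # effectively runs: spark-submit --master local[2] app.py arg1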
20 changes: 14 additions & 6 deletions bin/spark-shell

@@ -31,13 +31,21 @@ set -o posix
 ## Global script variables
 FWDIR="$(cd `dirname $0`/..; pwd)"

+function usage() {
+  echo "Usage: ./bin/spark-shell [options]"
+  $FWDIR/bin/spark-submit --help 2>&1 | grep -v Usage 1>&2
+  exit 0
+}
+
 if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
-  echo "Usage: ./bin/spark-shell [options]"
-  $FWDIR/bin/spark-submit --help 2>&1 | grep -v Usage 1>&2
-  exit 0
+  usage
 fi

-function main(){
+source $FWDIR/bin/utils.sh
+SUBMIT_USAGE_FUNCTION=usage
+gatherSparkSubmitOpts "$@"
+
+function main() {
   if $cygwin; then
     # Workaround for issue involving JLine and Cygwin
     # (see http://sourceforge.net/p/jline/bugs/40/).

@@ -46,11 +54,11 @@ function main(){
     # (see https://github.com/sbt/sbt/issues/562).
     stty -icanon min 1 -echo > /dev/null 2>&1
     export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
-    $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
+    $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main "${SUBMISSION_OPTS[@]}" spark-shell "${APPLICATION_OPTS[@]}"
     stty icanon echo > /dev/null 2>&1
   else
     export SPARK_SUBMIT_OPTS
-    $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
+    $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main "${SUBMISSION_OPTS[@]}" spark-shell "${APPLICATION_OPTS[@]}"
   fi
 }
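
For illustration (hypothetical invocation, not part of the diff): recognized flags are now gathered by gatherSparkSubmitOpts (defined in bin/utils.sh, below) and placed before the spark-shell primary resource, where spark-submit actually parses them:

    $ ./bin/spark-shell --master local[4]
    # effectively runs: spark-submit --class org.apache.spark.repl.Main --master local[4] spark-shell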
2 changes: 1 addition & 1 deletion bin/spark-shell.cmd

@@ -19,4 +19,4 @@ rem

 set SPARK_HOME=%~dp0..

-cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd spark-shell --class org.apache.spark.repl.Main %*
+cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
66 changes: 62 additions & 4 deletions bin/spark-sql

@@ -23,14 +23,72 @@
 # Enter posix mode for bash
 set -o posix

+CLASS="org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver"
+
 # Figure out where Spark is installed
 FWDIR="$(cd `dirname $0`/..; pwd)"

-if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
-  echo "Usage: ./sbin/spark-sql [options]"
+function usage {
+  echo "Usage: ./bin/spark-sql [options] [cli option]"
+  pattern="usage"
+  pattern+="\|Spark assembly has been built with Hive"
+  pattern+="\|NOTE: SPARK_PREPEND_CLASSES is set"
+  pattern+="\|Spark Command: "
+  pattern+="\|--help"
+  pattern+="\|======="
+
   $FWDIR/bin/spark-submit --help 2>&1 | grep -v Usage 1>&2
+  echo
+  echo "CLI options:"
+  $FWDIR/bin/spark-class $CLASS --help 2>&1 | grep -v "$pattern" 1>&2
+}
+
+function ensure_arg_number {
+  arg_number=$1
+  at_least=$2
+
+  if [[ $arg_number -lt $at_least ]]; then
+    usage
+    exit 1
+  fi
+}
+
+if [[ "$@" = --help ]] || [[ "$@" = -h ]]; then
+  usage
   exit 0
 fi

-CLASS="org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver"
-exec "$FWDIR"/bin/spark-submit --class $CLASS spark-internal $@
+CLI_ARGS=()
+SUBMISSION_ARGS=()
+
+while (($#)); do
+  case $1 in
+    -d | --define | --database | -f | -h | --hiveconf | --hivevar | -i | -p)
+      ensure_arg_number $# 2
+      CLI_ARGS+=("$1"); shift
+      CLI_ARGS+=("$1"); shift
+      ;;
+
+    -e)
+      ensure_arg_number $# 2
+      CLI_ARGS+=("$1"); shift
+      CLI_ARGS+=("$1"); shift
+      ;;
+
+    -s | --silent)
+      CLI_ARGS+=("$1"); shift
+      ;;
+
+    -v | --verbose)
+      # Both SparkSubmit and SparkSQLCLIDriver recognizes -v | --verbose
+      CLI_ARGS+=("$1")
+      SUBMISSION_ARGS+=("$1"); shift
+      ;;
+
+    *)
+      SUBMISSION_ARGS+=("$1"); shift
+      ;;
+  esac
+done
+
+exec "$FWDIR"/bin/spark-submit --class $CLASS "${SUBMISSION_ARGS[@]}" spark-internal "${CLI_ARGS[@]}"
59 changes: 59 additions & 0 deletions bin/utils.sh

@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Gather all all spark-submit options into SUBMISSION_OPTS
+function gatherSparkSubmitOpts() {
+
+  if [ -z "$SUBMIT_USAGE_FUNCTION" ]; then
+    echo "Function for printing usage of $0 is not set." 1>&2
+    echo "Please set usage function to shell variable 'SUBMIT_USAGE_FUNCTION' in $0" 1>&2
+    exit 1
+  fi
+
+  # NOTE: If you add or remove spark-sumbmit options,
+  # modify NOT ONLY this script but also SparkSubmitArgument.scala
+  SUBMISSION_OPTS=()
+  APPLICATION_OPTS=()
+  while (($#)); do
+    case "$1" in
+      --master | --deploy-mode | --class | --name | --jars | --py-files | --files | \
+      --conf | --properties-file | --driver-memory | --driver-java-options | \
+      --driver-library-path | --driver-class-path | --executor-memory | --driver-cores | \
+      --total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
+        if [[ $# -lt 2 ]]; then
+          "$SUBMIT_USAGE_FUNCTION"
+          exit 1;
+        fi
+        SUBMISSION_OPTS+=("$1"); shift
+        SUBMISSION_OPTS+=("$1"); shift
+        ;;
+
+      --verbose | -v | --supervise)
+        SUBMISSION_OPTS+=("$1"); shift
+        ;;
+
+      *)
+        APPLICATION_OPTS+=("$1"); shift
+        ;;
+    esac
+  done
+
+  export SUBMISSION_OPTS
+  export APPLICATION_OPTS
+}
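
For context (not part of the diff), a minimal sketch of how the client scripts use this helper, with hypothetical arguments:

    source "$FWDIR/bin/utils.sh"
    SUBMIT_USAGE_FUNCTION=usage   # callers must point this at their own usage() function
    gatherSparkSubmitOpts --master local[2] --name demo app.py --input data.txt
    # SUBMISSION_OPTS  -> (--master local[2] --name demo)
    # APPLICATION_OPTS -> (app.py --input data.txt)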
100 changes: 0 additions & 100 deletions core/src/main/java/org/apache/spark/network/netty/FileClient.java

This file was deleted.
