pull latest from apache spark #8

Merged 259 commits on Apr 5, 2016
dcaa016
[SPARK-13897][SQL] RelationalGroupedDataset and KeyValueGroupedDataset
rxin Mar 19, 2016
d630a20
[SPARK-10680][TESTS] Increase 'connectionTimeout' to make RequestTime…
zsxwing Mar 19, 2016
811a524
[SPARK-12182][ML] Distributed binning for trees in spark.ml
sethah Mar 20, 2016
454a00d
[SPARK-13993][PYSPARK] Add pyspark Rformula/RforumlaModel save/load
yinxusen Mar 20, 2016
f58319a
[SPARK-14019][SQL] Remove noop SortOrder in Sort
gatorsmile Mar 21, 2016
e474088
[SPARK-13764][SQL] Parse modes in JSON data source
HyukjinKwon Mar 21, 2016
20fd254
[SPARK-14011][CORE][SQL] Enable `LineLength` Java checkstyle rule
dongjoon-hyun Mar 21, 2016
761c2d1
[MINOR][DOCS] Add proper periods and spaces for CLI help messages and…
dongjoon-hyun Mar 21, 2016
c35c60f
[SPARK-14028][STREAMING][KINESIS][TESTS] Remove deprecated methods; f…
lw-lin Mar 21, 2016
2c5b18f
[SPARK-12789][SQL] Support Order By Ordinal in SQL
gatorsmile Mar 21, 2016
17a3f00
[SPARK-14000][SQL] case class with a tuple field can't work in Dataset
cloud-fan Mar 21, 2016
df61fbd
[SPARK-13986][CORE][MLLIB] Remove `DeveloperApi`-annotations for non-…
dongjoon-hyun Mar 21, 2016
a2a9078
[SPARK-14039][SQL][MINOR] make SubqueryHolder an inner class
cloud-fan Mar 21, 2016
060a28c
[SPARK-13826][SQL] Ad-hoc Dataset API ScalaDoc fixes
liancheng Mar 21, 2016
43ebf7a
[SPARK-13456][SQL] fix creating encoders for case classes defined in …
cloud-fan Mar 21, 2016
5d8de16
[SPARK-14004][SQL] NamedExpressions should have at most one qualifier
liancheng Mar 21, 2016
9b4e15b
[SPARK-14007] [SQL] Manage the memory used by hash map in shuffled ha…
Mar 21, 2016
f35df7d
[SPARK-13805] [SQL] Generate code that get a value in each column fro…
kiszk Mar 21, 2016
f3717fc
[SPARK-14004][FOLLOW-UP] Implementations of NonSQLExpression should n…
cloud-fan Mar 21, 2016
1af8de2
[SPARK-13019][DOCS] Replace example code in mllib-statistics.md using…
keypointt Mar 21, 2016
5e86e92
[SPARK-13916][SQL] Add a metric to WholeStageCodegen to measure durat…
nongli Mar 21, 2016
b3e5af6
[SPARK-13898][SQL] Merge DatasetHolder and DataFrameHolder
rxin Mar 22, 2016
b5f1ab7
[SPARK-13990] Automatically pick serializer when caching RDDs
JoshRosen Mar 22, 2016
3f49e07
[SPARK-13320][SQL] Support Star in CreateStruct/CreateArray and Error…
gatorsmile Mar 22, 2016
43ef1e5
Revert "[SPARK-13019][DOCS] Replace example code in mllib-statistics.…
mengxr Mar 22, 2016
7299961
[SPARK-14016][SQL] Support high-precision decimals in vectorized parq…
sameeragarwal Mar 22, 2016
8014a51
[SPARK-13883][SQL] Parquet Implementation of FileFormat.buildReader
marmbrus Mar 22, 2016
8193a26
[SPARK-14058][PYTHON] Incorrect docstring in Window.order
zero323 Mar 22, 2016
14464ca
[SPARK-14038][SQL] enable native view by default
cloud-fan Mar 22, 2016
f2e855f
[SPARK-13473][SQL] Simplifies PushPredicateThroughProject
liancheng Mar 22, 2016
4e09a0d
[SPARK-13953][SQL] Specifying the field name for corrupted record via…
HyukjinKwon Mar 22, 2016
0ce0163
[SPARK-13774][SQL] - Improve error message for non-existent paths and…
skambha Mar 22, 2016
c632bdc
[SPARK-14029][SQL] Improve BooleanSimplification optimization by impl…
dongjoon-hyun Mar 22, 2016
caea152
[SPARK-13985][SQL] Deterministic batches with ids
marmbrus Mar 22, 2016
297c202
[SPARK-14063][SQL] SQLContext.range should return Dataset[java.lang.L…
rxin Mar 22, 2016
7e3423b
[SPARK-13951][ML][PYTHON] Nested Pipeline persistence
jkbradley Mar 22, 2016
b2b1ad7
[SPARK-14060][SQL] Move StringToColumn implicit class into SQLImplicits
rxin Mar 22, 2016
d6dc12e
[SPARK-13449] Naive Bayes wrapper in SparkR
yinxusen Mar 22, 2016
d16710b
[HOTFIX][SQL] Add a timeout for 'cq.stop'
zsxwing Mar 22, 2016
4700adb
[SPARK-13806] [SQL] fix rounding mode of negative float/double
Mar 22, 2016
0d51b60
[SPARK-14072][CORE] Show JVM/OS version information when we run a ben…
kiszk Mar 23, 2016
75dc296
[SPARK-13401][SQL][TESTS] Fix SQL test warnings.
yongtang Mar 23, 2016
1a22cf1
[MINOR][SQL][DOCS] Update `sql/README.md` and remove some unused impo…
dongjoon-hyun Mar 23, 2016
926a93e
[SPARK-14088][SQL] Some Dataset API touch-up
rxin Mar 23, 2016
abacf5f
[HOTFIX][SQL] Don't stop ContinuousQuery in quietly
zsxwing Mar 23, 2016
4d955cd
[SPARK-14035][MLLIB] Make error message more verbose for mllib NaiveB…
jkbradley Mar 23, 2016
7d11750
[SPARK-14074][SPARKR] Specify commit sha1 ID when using install_githu…
Mar 23, 2016
cde086c
[SPARK-13817][SQL][MINOR] Renames Dataset.newDataFrame to Dataset.ofRows
liancheng Mar 23, 2016
6ce008b
[SPARK-13549][SQL] Refactor the Optimizer Rule CollapseProject
gatorsmile Mar 23, 2016
3de24ae
[SPARK-14075] Refactor MemoryStore to be testable independent of Bloc…
JoshRosen Mar 23, 2016
48ee16d
[SPARK-14055] writeLocksByTask need to be update when removeBlock
Earne Mar 23, 2016
30bdb5c
[SPARK-13068][PYSPARK][ML] Type conversion for Pyspark params
sethah Mar 23, 2016
02d9c35
[SPARK-14092] [SQL] move shouldStop() to end of while loop
Mar 23, 2016
0a64294
[SPARK-14015][SQL] Support TimestampType in vectorized parquet reader
sameeragarwal Mar 23, 2016
8c82688
[SPARK-13809][SQL] State store for streaming aggregations
tdas Mar 23, 2016
919bf32
[SPARK-13325][SQL] Create a 64-bit hashcode expression
hvanhovell Mar 23, 2016
6bc4be6
[SPARK-14078] Streaming Parquet Based FileSink
marmbrus Mar 23, 2016
5dfc019
[SPARK-14014][SQL] Replace existing catalog with SessionCatalog
Mar 23, 2016
69bc2c1
[SPARK-13952][ML] Add random seed to GBT
sethah Mar 23, 2016
de4e48b
[SPARK-14025][STREAMING][WEBUI] Fix streaming job descriptions on the…
lw-lin Mar 23, 2016
f42eaf4
[SPARK-14085][SQL] Star Expansion for Hash
gatorsmile Mar 24, 2016
cf823be
[SPARK-12183][ML][MLLIB] Remove mllib tree implementation, and wrap s…
jkbradley Mar 24, 2016
c44d140
Revert "[SPARK-14014][SQL] Replace existing catalog with SessionCatalog"
Mar 24, 2016
01849da
[SPARK-14110][CORE] PipedRDD to print the command ran on non zero exit
tejasapatil Mar 24, 2016
1803bf6
Fix typo in ALS.scala
jbochi Mar 24, 2016
048a759
[SPARK-14030][MLLIB] Add parameter check to MLLIB
zhengruifeng Mar 24, 2016
dd9ca7b
[SPARK-13019][DOCS] fix for scala-2.10 build: Replace example code in…
keypointt Mar 24, 2016
5519760
[SPARK-2208] Fix for local metrics tests can fail on fast machines
joan38 Mar 24, 2016
342079d
Revert "[SPARK-2208] Fix for local metrics tests can fail on fast mac…
srowen Mar 24, 2016
d283223
[SPARK-13017][DOCS] Replace example code in mllib-feature-extraction.…
keypointt Mar 24, 2016
2cf46d5
[SPARK-11871] Add save/load for MLPC
yinxusen Mar 24, 2016
fdd460f
[SPARK-13980] Incrementally serialize blocks while unrolling them in …
JoshRosen Mar 25, 2016
5850977
[SPARK-14107][PYSPARK][ML] Add seed as named argument to GBTs in pyspark
sethah Mar 25, 2016
0874ff3
[SPARK-13949][ML][PYTHON] PySpark ml DecisionTreeClassifier, Regresso…
GayathriMurali Mar 25, 2016
05f652d
[SPARK-13957][SQL] Support Group By Ordinal in SQL
gatorsmile Mar 25, 2016
13cbb2d
[SPARK-13010][ML][SPARKR] Implement a simple wrapper of AFTSurvivalRe…
yanboliang Mar 25, 2016
3619fec
[SPARK-14142][SQL] Replace internal use of unionAll with union
rxin Mar 25, 2016
1c70b76
[SPARK-14145][SQL] Remove the untyped version of Dataset.groupByKey
rxin Mar 25, 2016
20ddf5f
[SPARK-14014][SQL] Integrate session catalog (attempt #2)
Mar 25, 2016
70a6f0b
[SPARK-14149] Log exceptions in tryOrIOException
rxin Mar 25, 2016
e9b6e7d
[SPARK-13456][SQL][FOLLOW-UP] lazily generate the outer pointer for c…
cloud-fan Mar 25, 2016
55a6057
[SPARK-13887][PYTHON][TRIVIAL][BUILD] Make lint-python script fail fast
holdenk Mar 25, 2016
6603d9f
[SPARK-13919] [SQL] fix column pruning through filter
Mar 25, 2016
43b15e0
[SPARK-14061][SQL] implement CreateMap
cloud-fan Mar 25, 2016
b5f8c36
[SPARK-14144][SQL] Explicitly identify/catch UnsupportedOperationExce…
sameeragarwal Mar 25, 2016
11fa874
[SQL][HOTFIX] Fix flakiness in StateStoreRDDSuite
tdas Mar 25, 2016
ca00335
[SPARK-12443][SQL] encoderFor should support Decimal
viirya Mar 25, 2016
afd0deb
[SPARK-14137] [SPARK-14150] [SQL] Infer IsNotNull constraints from no…
sameeragarwal Mar 25, 2016
b554b3c
[SPARK-14131][SQL] Add a workaround for HADOOP-10622 to fix DataFrame…
zsxwing Mar 25, 2016
ff7cc45
[SPARK-14091][CORE] Improve performance of SparkContext.getCallSite()
rbalamohan Mar 25, 2016
54d13be
[SPARK-14159][ML] Fixed bug in StringIndexer + related issue in RFormula
jkbradley Mar 25, 2016
24587ce
[SPARK-14073][STREAMING][TEST-MAVEN] Move flume back to Spark
zsxwing Mar 26, 2016
13945dd
[SPARK-14109][SQL] Fix HDFSMetadataLog to fallback from FileContext t…
tdas Mar 26, 2016
d23ad7c
[SPARK-13874][DOC] Remove docs of streaming-akka, streaming-zeromq, s…
zsxwing Mar 26, 2016
1808465
[MINOR] Fix newly added java-lint errors
dongjoon-hyun Mar 26, 2016
62a85eb
[SPARK-14089][CORE][MLLIB] Remove methods that has been deprecated si…
lw-lin Mar 26, 2016
a91784f
[SPARK-13973][PYSPARK] ipython notebook` is going away
rekhajoshm Mar 26, 2016
bd94ea4
[SPARK-14175][SQL] whole stage codegen interface refactor
Mar 26, 2016
20c0bcd
[SPARK-14135] Add off-heap storage memory bookkeeping support to Memo…
JoshRosen Mar 26, 2016
8989d3a
[SPARK-14161][SQL] Native Parsing for DDL Command Drop Database
gatorsmile Mar 26, 2016
b547de8
[SPARK-14116][SQL] Implements buildReader() for ORC data source
liancheng Mar 26, 2016
bc925b7
[SPARK-14157][SQL] Parse Drop Function DDL command
viirya Mar 27, 2016
a01b6a9
[SPARK-14177][SQL] Native Parsing for DDL Command "Describe Database"…
gatorsmile Mar 27, 2016
cfcca73
[MINOR][SQL] Fix substr/substring testcases.
dongjoon-hyun Mar 27, 2016
0f02a5c
[MINOR][MLLIB] Remove TODO comment DecisionTreeModel.scala
dongjoon-hyun Mar 27, 2016
8ef4937
[SPARK-10691][ML] Make LogisticRegressionModel, LinearRegressionModel…
jkbradley Mar 28, 2016
aac13fb
[SPARK-14185][SQL][MINOR] Make indentation of debug log for generated…
sarutak Mar 28, 2016
7b84154
[SPARK-12494][MLLIB] Array out of bound Exception in KMeans Yarn Mode
srowen Mar 28, 2016
b66aa90
[SPARK-14102][CORE] Block `reset` command in SparkShell
dongjoon-hyun Mar 28, 2016
c838829
[SPARK-14187][MLLIB] Fix incorrect use of binarySearch in SparseMatrix
Mar 28, 2016
68c0c46
[SPARK-13742] [CORE] Add non-iterator interface to RandomSampler
viirya Mar 28, 2016
40984f6
[SPARK-12792] [SPARKR] Refactor RRDD to support R UDF.
Mar 28, 2016
e5a1b30
Revert "[SPARK-12792] [SPARKR] Refactor RRDD to support R UDF."
davies Mar 28, 2016
4a7636f
[SPARK-13844] [SQL] Generate better code for filters with a non-nulla…
kiszk Mar 28, 2016
1528ff4
[SPARK-14156][SQL] Use executedPlan in HiveComparisonTest for the mes…
viirya Mar 28, 2016
600c0b6
[SPARK-13713][SQL] Migrate parser from ANTLR3 to ANTLR4
hvanhovell Mar 28, 2016
d7b58f1
[SPARK-14052] [SQL] build a BytesToBytesMap directly in HashedRelation
Mar 28, 2016
7007f72
[SPARK-13713][SQL][TEST-MAVEN] Add Antlr4 maven plugin.
yhuai Mar 28, 2016
ff3bea3
[SPARK-13622][YARN] Issue creating level db for YARN shuffle service
ashangit Mar 28, 2016
39f743a
[SPARK-14202] [PYTHON] Use generator expression instead of list comp …
zero323 Mar 28, 2016
8c11d1a
[SPARK-11893] Model export/import for spark.ml: TrainValidationSplit
yinxusen Mar 28, 2016
328c711
[SPARK-14086][SQL] Add DDL commands to ANTLR4 parser
hvanhovell Mar 28, 2016
34c0638
[SPARK-14180][CORE] Fix a deadlock in CoarseGrainedExecutorBackend Sh…
zsxwing Mar 28, 2016
eebc8c1
[SPARK-13923][SPARK-14014][SQL] Session catalog follow-ups
Mar 28, 2016
b783649
[SPARK-14155][SQL] Hide UserDefinedType interface in Spark 2.0
rxin Mar 28, 2016
2f98ee6
[SPARK-14169][CORE] Add UninterruptibleThread
zsxwing Mar 28, 2016
27aab80
[SPARK-14013][SQL] Proper temp function support in catalog
Mar 28, 2016
a916d2a
[SPARK-14119][SPARK-14120][SPARK-14122][SQL] Throw exception on unsup…
Mar 28, 2016
ad9e3d5
[SPARK-13845][CORE] Using onBlockUpdated to replace onTaskEnd aviodin…
jeanlyn Mar 28, 2016
2bc7c96
[SPARK-13447][YARN][CORE] Clean the stale states for AM failure and r…
jerryshao Mar 29, 2016
289257c
[SPARK-14219][GRAPHX] Fix `pickRandomVertex` not to fall into infinit…
dongjoon-hyun Mar 29, 2016
38326ca
[SPARK-14205][SQL] remove trait Queryable
cloud-fan Mar 29, 2016
27d4ef0
[SPARK-14213][SQL] Migrate HiveQl parsing to ANTLR4 parser
hvanhovell Mar 29, 2016
4a55c33
[SPARK-13981][SQL] Defer evaluating variables within Filter operator.
nongli Mar 29, 2016
a180286
[SPARK-14210] [SQL] Add a metric for time spent in scans.
nongli Mar 29, 2016
d3638d7
[SPARK-12792] [SPARKR] Refactor RRDD to support R UDF.
Mar 29, 2016
f6066b0
[SPARK-11730][ML] Add feature importances for GBTs.
sethah Mar 29, 2016
63b200e
[SPARK-14071][PYSPARK][ML] Change MLWritable.write to be a property
wangmiao1981 Mar 29, 2016
83775bc
[SPARK-14158][SQL] implement buildReader for json data source
cloud-fan Mar 29, 2016
425bcf6
[SPARK-13963][ML] Adding binary toggle param to HashingTF
BryanCutler Mar 29, 2016
a632bb5
[SPARK-14208][SQL] Renames spark.sql.parquet.fileScan
liancheng Mar 29, 2016
d2a819a
[SPARK-14154][MLLIB] Simplify the implementation for Kolmogorov–Smirn…
hhbyyh Mar 29, 2016
15c0b00
[SPARK-14232][WEBUI] Fix event timeline display issue when an executo…
carsonwang Mar 29, 2016
d26c429
[SPARK-10570][CORE] Add version info to json api
jodersky Mar 29, 2016
d612228
[MINOR][SQL] Fix typos by replacing 'much' with 'match'.
dongjoon-hyun Mar 29, 2016
838cb45
[MINOR][SQL] Fix exception message to print string-array correctly.
dongjoon-hyun Mar 29, 2016
e58c4cb
[SPARK-14227][SQL] Add method for printing out generated code for deb…
ericl Mar 29, 2016
a7a93a1
[SPARK-14215] [SQL] [PYSPARK] Support chained Python UDFs
Mar 29, 2016
366cac6
[SPARK-14225][SQL] Cap the length of toCommentSafeString at 128 chars
sameeragarwal Mar 29, 2016
e1f6845
[SPARK-12181] Check Cached unaligned-access capability before using U…
tedyu Mar 30, 2016
b66b97c
[SPARK-14124][SQL] Implement Database-related DDL Commands
gatorsmile Mar 30, 2016
7320f9b
[SPARK-14254][CORE] Add logs to help investigate the network performance
zsxwing Mar 30, 2016
816f359
[SPARK-14114][SQL] implement buildReader for text data source
cloud-fan Mar 30, 2016
d46c71b
[SPARK-14268][SQL] rename toRowExpressions and fromRowExpression to s…
cloud-fan Mar 30, 2016
bdabfd4
[SPARK-13955][YARN] Also look for Spark jars in the build directory.
Mar 30, 2016
529d6ce
[SPARK-14181] TrainValidationSplit should have HasSeed
yinxusen Mar 30, 2016
5dc948e
[MINOR][ML] Fix the wrong param name of LDA topicDistributionCol
yanboliang Mar 30, 2016
f301df3
[SPARK-14152][ML][PYSPARK] MultilayerPerceptronClassifier supports sa…
yanboliang Mar 30, 2016
ca45861
[SPARK-11507][MLLIB] add compact in Matrices fromBreeze
hhbyyh Mar 30, 2016
dadf013
[SPARK-14259][SQL] Add a FileSourceStrategy option for limiting #file…
maropu Mar 30, 2016
258a243
[SPARK-14282][SQL] CodeFormatter should handle oneline comment with /…
dongjoon-hyun Mar 30, 2016
da54abf
[SPARK-14081][SQL] - Preserve DataFrame column types when filling nulls.
Mar 30, 2016
26445c2
[SPARK-14206][SQL] buildReader() implementation for CSV
liancheng Mar 31, 2016
a9b93e0
[SPARK-14211][SQL] Remove ANTLR3 based parser
hvanhovell Mar 31, 2016
208fff3
[SPARK-14164][MLLIB] Improve input layer validation of MultilayerPerc…
dongjoon-hyun Mar 31, 2016
3b3cc76
[SPARK-14062][YARN] Fix log4j and upload metrics.properties automatic…
jerryshao Mar 31, 2016
a0a1991
[SPARK-13782][ML] Model export/import for spark.ml: BisectingKMeans
hhbyyh Mar 31, 2016
8b207f3
[SPARK-11892][ML] Model export/import for spark.ml: OneVsRest
yinxusen Mar 31, 2016
8d62072
[SPARK-14263][SQL] Benchmark Vectorized HashMap for GroupBy Aggregates
sameeragarwal Mar 31, 2016
3586929
[SPARK-14278][SQL] Initialize columnar batch with proper memory mode
sameeragarwal Mar 31, 2016
ac1b8b3
[SPARK-13796] Redirect error message to logWarning
nishkamravi2 Mar 31, 2016
446c45b
[SPARK-14182][SQL] Parse DDL Command: Alter View
gatorsmile Mar 31, 2016
8a333d2
[SPARK-14243][CORE] update task metrics when removing blocks
jeanlyn Mar 31, 2016
4d93b65
[Docs] Update monitoring.md to accurately describe the history server
Mar 31, 2016
0abee53
[SPARK-14069][SQL] Improve SparkStatusTracker to also track executor …
cloud-fan Mar 31, 2016
10508f3
[SPARK-11327][MESOS] Dispatcher does not respect all args from the Su…
jayv Mar 31, 2016
3cfbeb7
[SPARK-13710][SHELL][WINDOWS] Fix jline dependency on Windows
michellemay Mar 31, 2016
e785402
[SPARK-14304][SQL][TESTS] Fix tests that don't create temp files in t…
zsxwing Mar 31, 2016
b11887c
[SPARK-14264][PYSPARK][ML] Add feature importance for GBTs in pyspark
sethah Mar 31, 2016
a7af6cd
[SPARK-14281][TESTS] Fix java8-tests and simplify their build
JoshRosen Mar 31, 2016
8de201b
[SPARK-14277][CORE] Upgrade Snappy Java to 1.1.2.4
Mar 31, 2016
f0afafd
[SPARK-14267] [SQL] [PYSPARK] execute multiple Python UDFs within sin…
Mar 31, 2016
96941b1
[SPARK-14242][CORE][NETWORK] avoid copy in compositeBuffer for frame …
liyezhang556520 Apr 1, 2016
1b07063
[SPARK-14295][SPARK-14274][SQL] Implements buildReader() for LibSVM
liancheng Apr 1, 2016
26867eb
[SPARK-11262][ML] Unit test for gradient, loss layers, memory managem…
avulanov Apr 1, 2016
22249af
[SPARK-14303][ML][SPARKR] Define and use KMeansWrapper for SparkR::km…
yanboliang Apr 1, 2016
3715ecd
[SPARK-14295][MLLIB][HOTFIX] Fixes Scala 2.10 compilation failure
liancheng Apr 1, 2016
0b04f8f
[SPARK-14184][SQL] Support native execution of SHOW DATABASE command …
dilipbiswal Apr 1, 2016
a471c7f
[SPARK-14133][SQL] Throws exception for unsupported create/drop/alter…
sureshthalamati Apr 1, 2016
58e6bc8
[MINOR] [SQL] Update usage of `debug` by removing `typeCheck` and add…
dongjoon-hyun Apr 1, 2016
8ba2b7f
[SPARK-12343][YARN] Simplify Yarn client and client argument
jerryshao Apr 1, 2016
381358f
[SPARK-14305][ML][PYSPARK] PySpark ml.clustering BisectingKMeans supp…
yanboliang Apr 1, 2016
df68beb
[SPARK-13995][SQL] Extract correct IsNotNull constraints for Expression
viirya Apr 1, 2016
a884daa
[SPARK-14191][SQL] Remove invalid Expand operator constraints
viirya Apr 1, 2016
1e88615
[SPARK-14070][SQL] Use ORC data source for SQL queries on ORC tables
tejasapatil Apr 1, 2016
1b829ce
[SPARK-14160] Time Windowing functions for Datasets
brkyvz Apr 1, 2016
3e991db
[SPARK-13674] [SQL] Add wholestage codegen support to Sample
viirya Apr 1, 2016
bd7b91c
[SPARK-12864][YARN] initialize executorIdCounter after ApplicationMas…
zhonghaihua Apr 1, 2016
e41acb7
[SPARK-13992] Add support for off-heap caching
JoshRosen Apr 1, 2016
0b7d496
[SPARK-14316][SQL] StateStoreCoordinator should extend ThreadSafeRpcE…
zsxwing Apr 1, 2016
0fc4aaa
[SPARK-14255][SQL] Streaming Aggregation
marmbrus Apr 1, 2016
c16a396
[SPARK-13825][CORE] Upgrade to Scala 2.11.8
jaceklaskowski Apr 1, 2016
19f32f2
[SPARK-12857][STREAMING] Standardize "records" and "events" on "records"
lw-lin Apr 1, 2016
abc6c42
[SPARK-13241][WEB UI] Added long values for dates in ApplicationAttem…
ajbozarth Apr 1, 2016
36e8fb8
[SPARK-7425][ML] spark.ml Predictor should support other numeric type…
BenFradet Apr 2, 2016
4fc35e6
[SPARK-14308][ML][MLLIB] Remove unused mllib tree classes and move pr…
sethah Apr 2, 2016
27e71a2
[SPARK-14244][SQL] Don't use SizeBasedWindowFunction.n created on exe…
liancheng Apr 2, 2016
877dc71
[SPARK-14138] [SQL] [MASTER] Fix generated SpecificColumnarIterator c…
kiszk Apr 2, 2016
fa1af0a
[SPARK-14251][SQL] Add SQL command for printing out generated code fo…
dongjoon-hyun Apr 2, 2016
f414154
[SPARK-14285][SQL] Implement common type-safe aggregate functions
rxin Apr 2, 2016
d7982a3
[MINOR][SQL] Fix comments styl and correct several styles and nits in…
HyukjinKwon Apr 2, 2016
67d7535
[HOTFIX] Fix compilation break.
rxin Apr 2, 2016
06694f1
[MINOR] Typo fixes
jaceklaskowski Apr 2, 2016
a3e2935
[HOTFIX] Disable StateStoreSuite.maintenance
rxin Apr 2, 2016
f705037
[SPARK-14338][SQL] Improve `SimplifyConditionals` rule to handle `nul…
dongjoon-hyun Apr 3, 2016
4a6e78a
[MINOR][DOCS] Use multi-line JavaDoc comments in Scala code.
dongjoon-hyun Apr 3, 2016
03d130f
[SPARK-14342][CORE][DOCS][TESTS] Remove straggler references to Tachyon
lw-lin Apr 3, 2016
1cf7018
[SPARK-14056] Appends s3 specific configurations and spark.hadoop con…
Apr 3, 2016
c2f25b1
[SPARK-13996] [SQL] Add more not null attributes for Filter codegen
viirya Apr 3, 2016
7be4620
[HOTFIX] Fix Scala 2.10 compilation
rxin Apr 3, 2016
2262a93
[SPARK-14231] [SQL] JSON data source infers floating-point values as …
HyukjinKwon Apr 3, 2016
1f0c5dc
[SPARK-14350][SQL] EXPLAIN output should be in a single cell
dongjoon-hyun Apr 3, 2016
c238cd0
[SPARK-14341][SQL] Throw exception on unsupported create / drop macro…
Apr 3, 2016
9023015
[SPARK-14163][CORE] SumEvaluator and countApprox cannot reliably hand…
mtustin-handy Apr 4, 2016
3f749f7
[SPARK-14355][BUILD] Fix typos in Exception/Testcase/Comments and sta…
dongjoon-hyun Apr 4, 2016
76f3c73
[SPARK-14356] Update spark.sql.execution.debug to work on Datasets
mateiz Apr 4, 2016
0340b3d
[SPARK-14360][SQL] QueryExecution.debug.codegen() to dump codegen
rxin Apr 4, 2016
7454253
[SPARK-14137] [SQL] Cleanup hash join
Apr 4, 2016
89f3bef
[SPARK-13784][ML] Persistence for RandomForestClassifier, RandomFores…
jkbradley Apr 4, 2016
855ed44
[SPARK-14176][SQL] Add DataFrameWriter.trigger to set the stream batc…
zsxwing Apr 4, 2016
5743c64
[SPARK-12981] [SQL] extract Pyhton UDF in physical plan
Apr 4, 2016
27dad6f
[SPARK-14364][SPARK] HeartbeatReceiver object should be private
rxin Apr 4, 2016
7143904
[SPARK-14358] Change SparkListener from a trait to an abstract class
rxin Apr 4, 2016
cc70f17
[SPARK-14334] [SQL] add toLocalIterator for Dataset/DataFrame
Apr 4, 2016
400b2f8
[SPARK-14259] [SQL] Merging small files together based on the cost of…
Apr 4, 2016
24d7d2e
[SPARK-13579][BUILD] Stop building the main Spark assembly.
Apr 4, 2016
a172e11
[SPARK-14366] Remove sbt-idea plugin
lresende Apr 4, 2016
7201f03
[SPARK-12425][STREAMING] DStream union optimisation
gpoulin Apr 5, 2016
ba24d1e
[SPARK-14287] isStreaming method for Dataset
brkyvz Apr 5, 2016
8f50574
[SPARK-14386][ML] Changed spark.ml ensemble trees methods to return c…
jkbradley Apr 5, 2016
7db5624
[SPARK-14368][PYSPARK] Support python.spark.worker.memory with upper-…
yongtang Apr 5, 2016
0646230
[SPARK-14359] Create built-in functions for typed aggregates in Java
ericl Apr 5, 2016
2715bc6
[SPARK-14348][SQL] Support native execution of SHOW TBLPROPERTIES com…
dilipbiswal Apr 5, 2016
7807173
[SPARK-14349][SQL] Issue Error Messages for Unsupported Operators/DML…
gatorsmile Apr 5, 2016
d356901
[SPARK-14284][ML] KMeansSummary deprecating size; adding clusterSizes
shallys Apr 5, 2016
e4bd504
[SPARK-14397][WEBUI] <html> and <body> tags are nested in LogPage
sarutak Apr 5, 2016
f77f11c
[SPARK-14345][SQL] Decouple deserializer expression resolution from O…
cloud-fan Apr 5, 2016
463bac0
[SPARK-14257][SQL] Allow multiple continuous queries to be started fr…
zsxwing Apr 5, 2016
bc36df1
[SPARK-13063][YARN] Make the SPARK YARN STAGING DIR as configurable
Apr 5, 2016
72544d6
[SPARK-14123][SPARK-14384][SQL] Handle CreateFunction/DropFunction
yhuai Apr 5, 2016
1 change: 1 addition & 0 deletions LICENSE
@@ -238,6 +238,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(BSD 3 Clause) netlib core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.2.7 - https://github.com/jpmml/jpmml-model)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
+(BSD License) ANTLR 4.5.2-1 (org.antlr:antlr4:4.5.2-1 - http://wwww.antlr.org/)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
(BSD licence) ANTLR StringTemplate (org.antlr:stringtemplate:3.2.1 - http://www.stringtemplate.org)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
1 change: 0 additions & 1 deletion NOTICE
@@ -48,7 +48,6 @@ Eclipse Public License 1.0

The following components are provided under the Eclipse Public License 1.0. See project link for details.

-(Eclipse Public License - Version 1.0) mqtt-client (org.eclipse.paho:mqtt-client:0.4.0 - http://www.eclipse.org/paho/mqtt-client)
(Eclipse Public License v1.0) Eclipse JDT Core (org.eclipse.jdt:core:3.1.1 - http://www.eclipse.org/jdt/)

========================================================================
10 changes: 5 additions & 5 deletions R/README.md
@@ -40,7 +40,7 @@ To set other options like driver memory, executor memory etc. you can pass in th
If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```
# Set this to where Spark is installed
-Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
+Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
@@ -51,7 +51,7 @@ sc <- sparkR.init(master="local")

The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
-Once you have made your changes, please include unit tests for them and run existing unit tests using the `run-tests.sh` script as described below.
+Once you have made your changes, please include unit tests for them and run existing unit tests using the `R/run-tests.sh` script as described below.

#### Generating documentation

@@ -60,17 +60,17 @@ The SparkR documentation (Rd files and HTML files) are not a part of the source
### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
-To run one of them, use `./bin/sparkR <filename> <args>`. For example:
+To run one of them, use `./bin/spark-submit <filename> <args>`. For example:

-./bin/sparkR examples/src/main/r/dataframe.R
+./bin/spark-submit examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh

### Running on YARN
-The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
+The `./bin/spark-submit` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
4 changes: 3 additions & 1 deletion R/pkg/DESCRIPTION
@@ -11,7 +11,9 @@ Depends:
R (>= 3.0),
methods,
Suggests:
-testthat
+testthat,
+e1071,
+survival
Description: R frontend for Spark
License: Apache License (== 2.0)
Collate:
4 changes: 3 additions & 1 deletion R/pkg/NAMESPACE
@@ -15,7 +15,9 @@ exportMethods("glm",
"predict",
"summary",
"kmeans",
-"fitted")
+"fitted",
+"naiveBayes",
+"survreg")

# Job group lifecycle management methods
export("setJobGroup",
8 changes: 8 additions & 0 deletions R/pkg/R/generics.R
@@ -1175,3 +1175,11 @@ setGeneric("kmeans")
#' @rdname fitted
#' @export
setGeneric("fitted")
+
+#' @rdname naiveBayes
+#' @export
+setGeneric("naiveBayes", function(formula, data, ...) { standardGeneric("naiveBayes") })
+
+#' @rdname survreg
+#' @export
+setGeneric("survreg", function(formula, data, ...) { standardGeneric("survreg") })
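
The two `setGeneric` calls in this hunk only declare S4 generics with a `(formula, data, ...)` signature. The methods that dispatch on them are not shown in this part of the diff. As a rough illustration of how such a generic is usually paired with a method, here is a minimal sketch. The generic `fitDemo` and the class `DemoData` are hypothetical names made up for this example and are not part of SparkR.

```r
library(methods)

# Hypothetical generic with the same (formula, data, ...) shape as the
# naiveBayes and survreg generics above. "fitDemo" is not a SparkR name.
setGeneric("fitDemo", function(formula, data, ...) { standardGeneric("fitDemo") })

# Hypothetical S4 class standing in for the data argument.
setClass("DemoData", representation(rows = "numeric"))

# Method selected when `formula` is a formula and `data` is a DemoData.
setMethod("fitDemo", signature(formula = "formula", data = "DemoData"),
          function(formula, data, ...) {
            paste("fitting", deparse(formula), "on", length(data@rows), "rows")
          })

# Calling the generic dispatches to the method registered above.
result <- fitDemo(y ~ x, new("DemoData", rows = c(1, 2, 3)))
print(result)
```

Calling the generic with an argument combination that has no registered method raises an "unable to find an inherited method" error, which is why a generic declared in `generics.R` also needs a matching `setMethod` and, for user-facing functions, an entry in `exportMethods`.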