Pulling functionality from apache spark #3

Merged
merged 708 commits into from
Jun 22, 2015
708 commits
5cd6a63
[SQL] [TEST] [MINOR] Follow-up of PR #6493, use Guava API to ensure J…
liancheng Jun 3, 2015
c3f4c32
[SPARK-7387] [ML] [DOC] CrossValidator example code in Python
Jun 3, 2015
a86b3e9
[SPARK-7547] [ML] Scala Example code for ElasticNet
Jun 3, 2015
cafd505
[SPARK-7691] [SQL] Refactor CatalystTypeConverter to use type-specifi…
JoshRosen Jun 3, 2015
07c16cb
[SPARK-8053] [MLLIB] renamed scalingVector to scalingVec
jkbradley Jun 3, 2015
ccaa823
[MINOR] make the launcher project name consistent with others
WangTaoTheTonic Jun 3, 2015
43adbd5
[SPARK-8043] [MLLIB] [DOC] update NaiveBayes and SVM examples in doc
hhbyyh Jun 3, 2015
452eb82
[SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more …
MechCoder Jun 3, 2015
ce320cb
[SPARK-8060] Improve DataFrame Python test coverage and documentation.
rxin Jun 3, 2015
d38cf21
[SPARK-7562][SPARK-6444][SQL] Improve error reporting for expression …
cloud-fan Jun 3, 2015
28dbde3
[SPARK-7983] [MLLIB] Add require for one-based indices in loadLibSVMFile
hhbyyh Jun 3, 2015
f1646e1
[SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.
yhuai Jun 3, 2015
2c4d550
[SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
pwendell Jun 3, 2015
d053a31
[SPARK-7980] [SQL] Support SQLContext.range(end)
Jun 3, 2015
d2a86eb
[SPARK-7161] [HISTORY SERVER] Provide REST api to download event logs…
harishreedharan Jun 3, 2015
708c63b
[SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env va…
Jun 3, 2015
939e4f3
[SPARK-8074] Parquet should throw AnalysisException during setup for …
rxin Jun 3, 2015
2c5a06c
Update documentation for [SPARK-7980] [SQL] Support SQLContext.range(…
rxin Jun 3, 2015
20a26b5
[SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests
jkbradley Jun 3, 2015
c6a6dd0
[MINOR] [UI] Improve confusing message on log page
Jun 3, 2015
bfbf12b
[SPARK-8083] [MESOS] Use the correct base path in mesos driver page.
tnachen Jun 3, 2015
aa40c44
[SPARK-8059] [YARN] Wake up allocation thread when new requests arrive.
Jun 3, 2015
1d8669f
[SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw…
zsxwing Jun 3, 2015
f271347
[SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleService…
zsxwing Jun 3, 2015
a8f1f15
[HOTFIX] Fix Hadoop-1 build caused by #5792.
harishreedharan Jun 3, 2015
d3e026f
[SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN
shivaram Jun 3, 2015
26c9d7a
[SPARK-8051] [MLLIB] make StringIndexerModel silent if input column d…
mengxr Jun 3, 2015
d8662cd
[SPARK-6164] [ML] CrossValidatorModel should keep stats from fitting
leahmcguire Jun 3, 2015
bfbdab1
[HOTFIX] [TYPO] Fix typo in #6546
Jun 3, 2015
566cb59
[HOTFIX] History Server API docs error fix.
harishreedharan Jun 3, 2015
51898b5
[SPARK-8088] don't attempt to lower number of executors by 0
ryan-williams Jun 3, 2015
0576c3c
[SPARK-8084] [SPARKR] Make SparkR scripts fail on error
shivaram Jun 4, 2015
e35cd36
[BUILD] Increase Jenkins test timeout
Jun 4, 2015
9cf740f
[BUILD] Use right branch when checking against Hive
Jun 4, 2015
984ad60
[BUILD] Fix Maven build for Kinesis
Jun 4, 2015
9982d45
MAINTENANCE: Automated closing of pull requests.
pwendell Jun 4, 2015
10ba188
Fix maxTaskFailures comment
darabos Jun 4, 2015
c8709dc
[SPARK-7956] [SQL] Use Janino to compile SQL expressions into bytecode
Jun 4, 2015
df7da07
[SPARK-7969] [SQL] Added a DataFrame.drop function that accepts a Col…
dusenberrymw Jun 4, 2015
cd3176b
[SPARK-7743] [SQL] Parquet 1.7
Jun 4, 2015
3dc0052
[SPARK-8027] [SPARKR] Move man pages creation to install-dev.sh
shivaram Jun 4, 2015
0526fea
[SPARK-6909][SQL] Remove Hive Shim code
Jun 4, 2015
6593842
Fixed style issues for [SPARK-6909][SQL] Remove Hive Shim code.
rxin Jun 4, 2015
2bcdf8c
[SPARK-7440][SQL] Remove physical Distinct operator in favor of Aggre…
rxin Jun 4, 2015
63bc0c4
[SPARK-8098] [WEBUI] Show correct length of bytes on log page
carsonwang Jun 4, 2015
74dc2a9
[SPARK-8106] [SQL] Set derby.system.durability=test to speed up Hive …
JoshRosen Jun 5, 2015
8f16b94
[SPARK-8114][SQL] Remove some wildcard import on TestSQLContext._
rxin Jun 5, 2015
e505460
[SPARK-8116][PYSPARK] Allow sc.range() to take a single argument.
belisarius222 Jun 5, 2015
2777ed3
[DOC][Minor]Specify the common sources available for collecting
yjshen Jun 5, 2015
3a5c4da
[MINOR] remove unused interpolation var in log message
srowen Jun 5, 2015
da20c8c
[MINOR] [BUILD] Change link to jenkins builds on github.
Jun 5, 2015
b16b543
[MINOR] [BUILD] Use custom temp directory during build.
Jun 5, 2015
019dc9f
[STREAMING] Update streaming-kafka-integration.md
Jun 5, 2015
700312e
[SPARK-6324] [CORE] Centralize handling of script usage messages.
Jun 5, 2015
bc0d76a
[SQL] Simplifies binary node pattern matching
liancheng Jun 5, 2015
12f5eae
[SPARK-8085] [SPARKR] Support user-specified schema in read.df
shivaram Jun 5, 2015
4036d05
Revert "[MINOR] [BUILD] Use custom temp directory during build."
Jun 5, 2015
0992a0a
[SPARK-8099] set executor cores into system in yarn-cluster mode
XuTingjun Jun 5, 2015
3f80bc8
[SPARK-7699] [CORE] Lazy start the scheduler for dynamic allocation
jerryshao Jun 5, 2015
4f16d3f
[SPARK-8112] [STREAMING] Fix the negative event count issue
zsxwing Jun 5, 2015
4060526
[SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort
lucamartinetti Jun 5, 2015
356a4a9
[SPARK-7991] [PySpark] Adding support for passing lists to describe.
Jun 5, 2015
6ebe419
[SPARK-8114][SQL] Remove some wildcard import on TestSQLContext._ con…
rxin Jun 5, 2015
eb19d3f
[SPARK-6964] [SQL] Support Cancellation in the Thrift Server
Jun 6, 2015
a71be0a
[SPARK-8114][SQL] Remove some wildcard import on TestSQLContext._ rou…
rxin Jun 6, 2015
a8077e5
[SPARK-6973] remove skipped stage ID from completed set on the allJob…
XuTingjun Jun 6, 2015
16fc496
[SPARK-8079] [SQL] Makes InsertIntoHadoopFsRelation job/task abortion…
liancheng Jun 6, 2015
5aa804f
[SPARK-7639] [PYSPARK] [MLLIB] Python API for KernelDensity
MechCoder Jun 6, 2015
18c4fce
[SPARK-7169] [CORE] Allow metrics system to be configured through Spa…
Jun 7, 2015
ed2cc3e
[SPARK-8136] [YARN] Fix flakiness in YarnClusterSuite.
harishreedharan Jun 7, 2015
3285a51
[SPARK-7955] [CORE] Ensure executors with cached RDD blocks are not re…
harishreedharan Jun 7, 2015
901a552
[SPARK-8004][SQL] Enclose column names by JDBC Dialect
viirya Jun 7, 2015
081db94
[SPARK-8145] [WEBUI] Trigger a double click on the span to show full …
Jun 7, 2015
26d07f1
[SPARK-8141] [SQL] Precompute datatypes for partition columns and reu…
viirya Jun 7, 2015
0ac4708
[SPARK-8146] DataFrame Python API: Alias replace in df.na
rxin Jun 7, 2015
8c321d6
[SPARK-8118] [SQL] Mutes noisy Parquet log output reappeared after up…
liancheng Jun 7, 2015
ca8dafc
[SPARK-7042] [BUILD] use the standard akka artifacts with hadoop-2.x
Jun 7, 2015
835f138
[DOC] [TYPO] Fix typo in standalone deploy scripts description
yjshen Jun 7, 2015
d6d601a
[SPARK-8004][SQL] Quote identifier in JDBC data source.
rxin Jun 7, 2015
db81b9d
[SPARK-7952][SQL] use internal Decimal instead of java.math.BigDecimal
cloud-fan Jun 7, 2015
e84815d
[SPARK-7733] [CORE] [BUILD] Update build, code to use Java 7 for 1.5.0+
srowen Jun 7, 2015
b127ff8
[SPARK-2808] [STREAMING] [KAFKA] cleanup tests from
koeninger Jun 7, 2015
5e7b6b6
[SPARK-8117] [SQL] Push codegen implementation into each Expression
Jun 7, 2015
f74be74
[SPARK-8149][SQL] Break ExpressionEvaluationSuite down to multiple files
rxin Jun 8, 2015
72ba0fc
[SPARK-8154][SQL] Remove Term/Code type aliases in code generation.
rxin Jun 8, 2015
10fc2f6
[SPARK-4761] [DOC] [SQL] kryo default setting in SQL Thrift server
adrian-wang Jun 8, 2015
eacd4a9
[SPARK-7705] [YARN] Cleanup of .sparkStaging directory fails if appli…
Sephiroth-Lin Jun 8, 2015
03ef6be
[SPARK-7939] [SQL] Add conf to enable/disable partition column type i…
viirya Jun 8, 2015
a1d9e5c
[SPARK-8126] [BUILD] Use custom temp directory during build.
Jun 8, 2015
e3e9c70
[SPARK-8140] [MLLIB] Remove empty model check in StreamingLinearAlgor…
MechCoder Jun 8, 2015
149d1b2
[SMALL FIX] Return null if catch EOFException
Jun 8, 2015
49f19b9
[MINOR] change new Exception to IllegalArgumentException
adrian-wang Jun 8, 2015
ed5c2dc
[SPARK-8158] [SQL] several fix for HiveShim
adrian-wang Jun 8, 2015
bbdfc0a
[SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initializatio…
liancheng Jun 8, 2015
fe7669d
[SQL][minor] remove duplicated cases in `DecimalPrecision`
cloud-fan Jun 8, 2015
5185389
[SPARK-8148] Do not use FloatType in partition column inference.
rxin Jun 8, 2015
f3eec92
[SPARK-8162] [HOTFIX] Fix NPE in spark-shell
Jun 9, 2015
82870d5
[SPARK-8168] [MLLIB] Add Python friendly constructor to PipelineModel
mengxr Jun 9, 2015
a5c52c1
[SPARK-6820] [SPARKR] Convert NAs to null type in SparkR DataFrames
hqzizania Jun 9, 2015
7658eb2
[SPARK-7990][SQL] Add methods to facilitate equi-join on multiple joi…
viirya Jun 9, 2015
0902a11
[SPARK-8101] [CORE] Upgrade netty to avoid memory leak accord to nett…
srowen Jun 9, 2015
1b49999
[SPARK-7886] Add built-in expressions to FunctionRegistry.
rxin Jun 9, 2015
e6fb6ce
[STREAMING] [DOC] Remove duplicated description about WAL
sarutak Jun 9, 2015
6c1723a
[SPARK-8140] [MLLIB] Remove construct to get weights in StreamingLine…
MechCoder Jun 9, 2015
490d5a7
[SPARK-8274] [DOCUMENTATION-MLLIB] Fix wrong URLs in MLlib Frequent P…
FavioVazquez Jun 9, 2015
0d5892d
[MINOR] [UI] DAG visualization: trim whitespace from input
Jun 9, 2015
6e4fb0c
[SPARK-6511] [DOCUMENTATION] Explain how to use Hadoop provided builds
pwendell Jun 9, 2015
778f3ca
[SPARK-7792] [SQL] HiveContext registerTempTable not thread safe
navis Jun 10, 2015
57c60c5
[SPARK-7886] Use FunctionRegistry for built-in expressions in HiveCon…
rxin Jun 10, 2015
e90035e
[SPARK-7886] Added unit test for HAVING aggregate pushdown.
rxin Jun 10, 2015
c6ba7cc
[SPARK-8215] [SPARK-8212] [SQL] add leaf math expression for e and pi
adrian-wang Jun 10, 2015
2b550a5
[SPARK-7996] Deprecate the developer api SparkEnv.actorSystem
Jun 10, 2015
8f7308f
[SQL] [MINOR] Fixes a minor Java example error in SQL programming guide
liancheng Jun 10, 2015
3811290
[SPARK-5479] [YARN] Handle --py-files correctly in YARN.
Jun 10, 2015
30ebf1a
[SPARK-8282] [SPARKR] Make number of threads used in RBackend configu…
falaki Jun 10, 2015
19e30b4
[SPARK-7756] CORE RDDOperationScope fix for IBM Java
a-roberts Jun 10, 2015
e90c9d9
[SPARK-7527] [CORE] Fix createNullValue to return the correct null va…
zsxwing Jun 10, 2015
80043e9
[SPARK-7261] [CORE] Change default log level to WARN in the REPL
zsxwing Jun 10, 2015
cb871c4
[SPARK-8290] spark class command builder need read SPARK_JAVA_OPTS an…
WangTaoTheTonic Jun 10, 2015
5014d0e
[SPARK-8273] Driver hangs up when yarn shutdown in client mode
WangTaoTheTonic Jun 10, 2015
96a7c88
[SPARK-2774] Set preferred locations for reduce tasks
shivaram Jun 10, 2015
b928f54
[SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm
pparkkin Jun 10, 2015
37719e0
[SPARK-8189] [SQL] use Long for TimestampType in SQL
Jun 10, 2015
6a47114
[SPARK-8285] [SQL] CombineSum should be calculated as unlimited decim…
navis Jun 11, 2015
4e42842
[SPARK-8164] transformExpressions should support nested expression se…
cloud-fan Jun 11, 2015
9fe3adc
[SPARK-8248][SQL] string function: length
chenghao-intel Jun 11, 2015
2758ff0
[SPARK-8217] [SQL] math function log2
adrian-wang Jun 11, 2015
a777eb0
[HOTFIX] Adding more contributor name bindings
pwendell Jun 11, 2015
e84545f
[HOTFIX] Fixing errors in name mappings
pwendell Jun 11, 2015
6b68366
[SPARK-8289] Specify stack size for consistency with Java tests - res…
a-roberts Jun 11, 2015
424b007
[SPARK-6411] [SQL] [PySpark] support date/datetime with timezone in P…
Jun 11, 2015
1191c3e
[SPARK-8305] [SPARK-8190] [SQL] improve codegen
Jun 11, 2015
c8d551d
[SPARK-8310] [EC2] Updates the master branch EC2 versions
shivaram Jun 11, 2015
040f223
[SPARK-7915] [SQL] Support specifying the column list for target tabl…
chenghao-intel Jun 11, 2015
95690a1
[SPARK-7444] [TESTS] Eliminate noisy css warn/error logs for UISeleni…
zsxwing Jun 11, 2015
9cbdf31
[SPARK-6511] [docs] Fix example command in hadoop-provided docs.
Jun 11, 2015
7d669a5
[SPARK-8286] Rewrite UTF8String in Java and move it into unsafe package.
rxin Jun 11, 2015
7914c72
[SPARK-7824] [SQL] Collapse operator reordering and constant folding …
pzzs Jun 12, 2015
337c16d
[SQL] Miscellaneous SQL/DF expression changes.
rxin Jun 12, 2015
767cc94
[SPARK-7158] [SQL] Fix bug of cached data cannot be used in collect()…
chenghao-intel Jun 12, 2015
b9d177c
[SPARK-8317] [SQL] Do not push sort into shuffle in Exchange operator
JoshRosen Jun 12, 2015
2dd7f93
[SPARK-7862] [SQL] Fix the deadlock in script transformation for stderr
zhichao-li Jun 12, 2015
e428b3a
[SPARK-6566] [SQL] Related changes for newer parquet version
Jun 12, 2015
c19c785
[SQL] [MINOR] correct semanticEquals logic
cloud-fan Jun 12, 2015
71cc17b
[SPARK-8322] [EC2] Added spark 1.4.0 into the VALID_SPARK_VERSIONS and…
Jun 12, 2015
19834fa
[SPARK-7993] [SQL] Improved DataFrame.show() output
Jun 12, 2015
8860405
[SPARK-8330] DAG visualization: trim whitespace from input
Jun 12, 2015
e9471d3
[SPARK-7284] [STREAMING] Updated streaming documentation
tdas Jun 12, 2015
6e9c3ff
[SPARK-8314][MLlib] improvement in performance of MLUtils.appendBias
rogermenezes Jun 13, 2015
d46f8e5
[SPARK-7186] [SQL] Decouple internal Row from external Row
Jun 13, 2015
4aed66f
[SPARK-8329][SQL] Allow _ in DataSource options
marmbrus Jun 13, 2015
d986fb9
[SPARK-7897] Improbe type for jdbc/"unsigned bigint"
rtreffer Jun 13, 2015
ce1041c
[SPARK-8346] [SQL] Use InternalRow instread of catalyst.InternalRow
Jun 13, 2015
af31335
[SPARK-8319] [CORE] [SQL] Update logic related to key orderings in sh…
JoshRosen Jun 13, 2015
ddec452
[SPARK-8052] [SQL] Use java.math.BigDecimal for casting String to Dec…
viirya Jun 13, 2015
a138953
[SPARK-8347][SQL] Add unit tests for abs.
rxin Jun 14, 2015
2d71ba4
[SPARK-8349] [SQL] Use expression constructors (rather than apply) in…
rxin Jun 14, 2015
35d1267
[Spark-8343] [Streaming] [Docs] Improve Spark Streaming Guides.
dusenberrymw Jun 14, 2015
cb7ada1
[SPARK-8342][SQL] Fix Decimal setOrNull
viirya Jun 14, 2015
ea7fd2f
[SPARK-8354] [SQL] Fix off-by-factor-of-8 error when allocating scrat…
JoshRosen Jun 14, 2015
9073a42
[SPARK-8358] [SQL] Wait for child resolution when resolving generators
marmbrus Jun 14, 2015
53c16b9
[SPARK-8362] [SQL] Add unit tests for +, -, *, /, %
rxin Jun 14, 2015
f3f2a43
fix read/write mixup
hoffmann Jun 14, 2015
4eb48ed
[SPARK-8065] [SQL] Add support for Hive 0.14 metastores
Jun 14, 2015
4c5889e
[SPARK-8316] Upgrade to Maven 3.3.3
nchammas Jun 15, 2015
56d4e8a
[SPARK-8350] [R] Log R unit test output to "unit-tests.log"
Jun 15, 2015
6ae21a9
[SPARK-6583] [SQL] Support aggregate functions in ORDER BY
watermen Jun 15, 2015
1a62d61
SPARK-8336 Fix NullPointerException with functions.rand()
tedyu Jun 16, 2015
bc76a0f
[SPARK-7184] [SQL] enable codegen by default
Jun 16, 2015
ccf010f
[SPARK-8367] [STREAMING] Add a limit for 'spark.streaming.blockInterv…
SaintBacchus Jun 16, 2015
658814c
[SPARK-8129] [CORE] [Sec] Pass auth secrets to executors via env vari…
kanzhang Jun 16, 2015
29c5025
[SPARK-8387] [WEBUI] Only show 4096 bytes content for executor log in…
suyanNone Jun 16, 2015
dc455b8
[SPARK-DOCS] [SPARK-SQL] Update sql-programming-guide.md
moutai Jun 16, 2015
4bd10fd
[SQL] [DOC] improved a comment
radek1st Jun 16, 2015
cebf241
[SPARK-8126] [BUILD] Make sure temp dir exists when running tests.
Jun 16, 2015
ca99875
[SPARK-7916] [MLLIB] MLlib Python doc parity check for classification…
yanboliang Jun 16, 2015
0b8c8fd
[SPARK-8156] [SQL] create table to specific database by 'use dbname'
baishuo Jun 16, 2015
bedff7d
[SPARK-8220][SQL]Add positive identify function
zhichao-li Jun 17, 2015
e3de14d
Closes #6850.
rxin Jun 17, 2015
c13da20
[SPARK-8309] [CORE] Support for more than 12M items in OpenHashMap
SlavikBaranov Jun 17, 2015
104f30c
[SPARK-7199] [SQL] Add date and timestamp support to UnsafeRow
viirya Jun 17, 2015
6765ef9
[SPARK-6390] [SQL] [MLlib] Port MatrixUDT to PySpark
MechCoder Jun 17, 2015
50a0496
[SPARK-7017] [BUILD] [PROJECT INFRA] Refactor dev/run-tests into Python
Jun 17, 2015
0c1b2df
[SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of c…
MickDavies Jun 17, 2015
f005be0
[SPARK-8395] [DOCS] start-slave.sh docs incorrect
srowen Jun 17, 2015
a465944
[SPARK-6782] add sbt-revolver plugin
squito Jun 17, 2015
98ee351
[SPARK-8010] [SQL] Promote types to StringType as implicit conversion…
OopsOutOfMemory Jun 17, 2015
7ad8c5d
[SPARK-8161] Set externalBlockStoreInitialized to be true, after Exte…
Jun 17, 2015
2837e06
[SPARK-8372] History server shows incorrect information for applicati…
carsonwang Jun 17, 2015
0fc4b96
[SPARK-8373] [PYSPARK] Add emptyRDD to pyspark and fix the issue when…
zsxwing Jun 17, 2015
a411a40
[SPARK-7913] [CORE] Increase the maximum capacity of PartitionedPairB…
zsxwing Jun 17, 2015
7f05b1f
[SPARK-7067] [SQL] fix bug when use complex nested fields in ORDER BY
cloud-fan Jun 17, 2015
302556f
[SPARK-8306] [SQL] AddJar command needs to set the new class loader t…
yhuai Jun 17, 2015
a06d9c8
[SPARK-8404] [STREAMING] [TESTS] Use thread-safe collections to make …
zsxwing Jun 17, 2015
d1069cb
[SPARK-8397] [SQL] Allow custom configuration for TestHive
Jun 17, 2015
165f52f
[HOTFIX] [PROJECT-INFRA] Fix bug in dev/run-tests for MLlib-only PRs
JoshRosen Jun 18, 2015
4817ccd
[SPARK-8373] [PYSPARK] Remove PythonRDD.emptyRDD
zsxwing Jun 18, 2015
22732e1
[SPARK-7605] [MLLIB] [PYSPARK] Python API for ElementwiseProduct
MechCoder Jun 18, 2015
e2cdb05
[SPARK-8392] RDDOperationGraph: getting cached nodes is slow
XuTingjun Jun 18, 2015
3b61077
[SPARK-8095] Resolve dependencies of --packages in local ivy cache
brkyvz Jun 18, 2015
9db73ec
[SPARK-8381][SQL]reuse typeConvert when convert Seq[Row] to catalyst …
lianhuiwang Jun 18, 2015
78a430e
[SPARK-7961][SQL]Refactor SQLConf to display better error message
zsxwing Jun 18, 2015
fee3438
[SPARK-8218][SQL] Add binary log math function
viirya Jun 18, 2015
e86fbdb
[SPARK-8283][SQL] Resolve udf_struct test failure in HiveCompatibilit…
yjshen Jun 18, 2015
ddc5baf
[SPARK-8320] [STREAMING] Add example in streaming programming guide t…
Jun 18, 2015
3164112
[SPARK-8363][SQL] Move sqrt to math and extend UnaryMathExpression
viirya Jun 18, 2015
9b20027
[SPARK-8202] [PYSPARK] fix infinite loop during external sort in PySpark
Jun 18, 2015
44c931f
[SPARK-8353] [DOCS] Show anchor links when hovering over documentatio…
JoshRosen Jun 18, 2015
24e5379
[SPARK-8376] [DOCS] Add common lang3 to the Spark Flume Sink doc
zsxwing Jun 18, 2015
207a98c
[SPARK-8446] [SQL] Add helper functions for testing SparkPlan physica…
JoshRosen Jun 18, 2015
dc41313
[SPARK-8218][SQL] Binary log math function update.
rxin Jun 19, 2015
43f50de
[SPARK-8135] Don't load defaults when reconstituting Hadoop Configura…
sryza Jun 19, 2015
4ce3bab
[SPARK-8462] [DOCS] Documentation fixes for Spark SQL
lfrancke Jun 19, 2015
3eaed87
[SPARK-8080] [STREAMING] Receiver.store with Iterator does not give c…
Jun 19, 2015
a71cbbd
[SPARK-8458] [SQL] Don't strip scheme part of output path when writin…
liancheng Jun 19, 2015
754929b
[SPARK-8348][SQL] Add in operator to DataFrame Column
yu-iskw Jun 19, 2015
a2016b4
[SPARK-8444] [STREAMING] Adding Python streaming example for queueStream
BryanCutler Jun 19, 2015
fdf63f1
[SPARK-8339] [PYSPARK] integer division for python 3
kconor Jun 19, 2015
54557f3
[SPARK-8387] [FOLLOWUP ] [WEBUI] Update driver log URL to show only 4…
carsonwang Jun 19, 2015
93360dc
[SPARK-7913] [CORE] Make AppendOnlyMap use the same growth strategy o…
zsxwing Jun 19, 2015
ebd363a
[SPARK-7265] Improving documentation for Spark SQL Hive support
JihongMA Jun 19, 2015
47af7c1
[SPARK-8389] [STREAMING] [KAFKA] Example of getting offset ranges out o…
koeninger Jun 19, 2015
43c7ec6
[SPARK-8151] [MLLIB] pipeline components should correctly implement copy
mengxr Jun 19, 2015
2c59d5c
[SPARK-8207] [SQL] Add math function bin
viirya Jun 19, 2015
9baf093
[SPARK-8430] ExternalShuffleBlockResolver of shuffle service should s…
lianhuiwang Jun 19, 2015
fe08561
[SPARK-8476] [CORE] Setters inc/decDiskBytesSpilled in TaskMetrics sh…
ueshin Jun 19, 2015
0c32fc1
[SPARK-8234][SQL] misc function: md5
qiansl127 Jun 19, 2015
a985803
Add example that reads a local file, writes to a DFS path provided by…
rnowling Jun 19, 2015
866816e
[SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of SerializationD…
tdas Jun 19, 2015
68a2dca
[SPARK-8451] [SPARK-7287] SparkSubmitSuite should check exit code
Jun 19, 2015
4be53d0
[SPARK-5836] [DOCS] [STREAMING] Clarify what may cause long-running S…
srowen Jun 19, 2015
c5876e5
[SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context …
yhuai Jun 19, 2015
4a462c2
[HOTFIX] Fix scala style in DFSReadWriteTest that causes tests failed
viirya Jun 19, 2015
e41e2fd
[SPARK-8461] [SQL] fix codegen with REPL class loader
Jun 19, 2015
54976e5
[SPARK-4118] [MLLIB] [PYSPARK] Python bindings for StreamingKMeans
MechCoder Jun 19, 2015
1fa29c2
[SPARK-8452] [SPARKR] expose jobGroup API in SparkR
falaki Jun 19, 2015
9814b97
[SPARK-8093] [SQL] Remove empty structs inferred from JSON documents
Jun 19, 2015
a333a72
[SPARK-8420] [SQL] Fix comparision of timestamps/dates with strings
marmbrus Jun 19, 2015
b305e37
[SPARK-8390] [STREAMING] [KAFKA] fix docs related to HasOffsetRanges
koeninger Jun 20, 2015
093c348
[SPARK-8498] [SQL] Add regression test for SPARK-8470
Jun 20, 2015
bec40e5
[HOTFIX] [SPARK-8489] Correct JIRA number in previous commit
Jun 20, 2015
1b6fe9b
[SPARK-8127] [STREAMING] [KAFKA] KafkaRDD optimize count() take() isE…
koeninger Jun 20, 2015
0b89951
[SPARK-8468] [ML] Take the negative of some metrics in RegressionEval…
viirya Jun 20, 2015
7a3c424
[SPARK-8422] [BUILD] [PROJECT INFRA] Add a module abstraction to dev/…
JoshRosen Jun 20, 2015
004f573
[SPARK-8495] [SPARKR] Add a `.lintr` file to validate the SparkR file…
yu-iskw Jun 20, 2015
41ab285
[SPARK-8301] [SQL] Improve UTF8String substring/startsWith/endsWith/c…
tarekbecker Jun 21, 2015
a1e3649
[SPARK-8379] [SQL] avoid speculative tasks write to the same file
jeanlyn Jun 21, 2015
32e3cda
[SPARK-7604] [MLLIB] Python API for PCA and PCAModel
yanboliang Jun 21, 2015
83cdfd8
[SPARK-8508] [SQL] Ignores a test case to cleanup unnecessary testing…
liancheng Jun 21, 2015
a189442
[SPARK-7715] [MLLIB] [ML] [DOC] Updated MLlib programming guide for r…
jkbradley Jun 21, 2015
1 change: 1 addition & 0 deletions .gitignore
@@ -66,6 +66,7 @@ scalastyle-output.xml
R-unit-tests.log
R/unit-tests.out
python/lib/pyspark.zip
lint-r-report.log

# For Hive
metastore_db/
5 changes: 5 additions & 0 deletions .rat-excludes
@@ -28,6 +28,7 @@ spark-env.sh
spark-env.cmd
spark-env.sh.template
log4j-defaults.properties
log4j-defaults-repl.properties
bootstrap-tooltip.js
jquery-1.11.1.min.js
d3.min.js
@@ -80,5 +81,9 @@ local-1425081759269/*
local-1426533911241/*
local-1426633911242/*
local-1430917381534/*
local-1430917381535_1
local-1430917381535_2
DESCRIPTION
NAMESPACE
test_support/*
.lintr
65 changes: 64 additions & 1 deletion LICENSE
@@ -836,6 +836,68 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For vis.js (core/src/main/resources/org/apache/spark/ui/static/vis.min.js):
========================================================================
Copyright (C) 2010-2015 Almende B.V.

Vis.js is dual licensed under both

* The Apache 2.0 License
http://www.apache.org/licenses/LICENSE-2.0

and

* The MIT License
http://opensource.org/licenses/MIT

Vis.js may be distributed under either license.

========================================================================
For dagre-d3 (core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js):
========================================================================
Copyright (c) 2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
For graphlib-dot (core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js):
========================================================================
Copyright (c) 2012-2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
BSD-style licenses
@@ -845,7 +907,7 @@ The following components are provided under a BSD-style license. See project lin

(BSD 3 Clause) core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.3 - http://jblas.org/)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
@@ -888,3 +950,4 @@ The following components are provided under the MIT License. See project link fo
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-all:1.8.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
4 changes: 2 additions & 2 deletions R/README.md
@@ -52,7 +52,7 @@ The SparkR documentation (Rd files and HTML files) are not a part of the source
SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/sparkR <filename> <args>`. For example:

./bin/sparkR examples/src/main/r/pi.R local[2]
./bin/sparkR examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

@@ -63,5 +63,5 @@ You can also run the unit-tests for SparkR by running (you need to install the [
The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/pi.R 4
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
8 changes: 4 additions & 4 deletions R/create-docs.sh
@@ -23,14 +23,14 @@
# After running this script the html docs can be found in
# $SPARK_HOME/R/pkg/html

set -o pipefail
set -e

# Figure out where the script is
export FWDIR="$(cd "`dirname "$0"`"; pwd)"
pushd $FWDIR

# Generate Rd file
Rscript -e 'library(devtools); devtools::document(pkg="./pkg", roclets=c("rd"))'

# Install the package
# Install the package (this will also generate the Rd files)
./install-dev.sh

# Now create HTML files
11 changes: 10 additions & 1 deletion R/install-dev.sh
@@ -26,11 +26,20 @@
# NOTE(shivaram): Right now we use $SPARK_HOME/R/lib to be the installation directory
# to load the SparkR package on the worker nodes.

set -o pipefail
set -e

FWDIR="$(cd `dirname $0`; pwd)"
LIB_DIR="$FWDIR/lib"

mkdir -p $LIB_DIR

# Install R
pushd $FWDIR

# Generate Rd files if devtools is installed
Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'

# Install SparkR to $LIB_DIR
R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/

popd
2 changes: 1 addition & 1 deletion R/log4j.properties
@@ -19,7 +19,7 @@
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.append=true
log4j.appender.file.file=R-unit-tests.log
log4j.appender.file.file=R/target/unit-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p %c{1}: %m%n

2 changes: 2 additions & 0 deletions R/pkg/.lintr
@@ -0,0 +1,2 @@
linters: with_defaults(line_length_linter(100), camel_case_linter = NULL)
exclusions: list("inst/profile/general.R" = 1, "inst/profile/shell.R")
52 changes: 47 additions & 5 deletions R/pkg/NAMESPACE
@@ -1,25 +1,37 @@
# Imports from base R
importFrom(methods, setGeneric, setMethod, setOldClass)
useDynLib(SparkR, stringHashCode)

# Disable native libraries till we figure out how to package it
# See SPARKR-7839
#useDynLib(SparkR, stringHashCode)

# S3 methods exported
export("sparkR.init")
export("sparkR.stop")
export("print.jobj")

# Job group lifecycle management methods
export("setJobGroup",
"clearJobGroup",
"cancelJobGroup")

exportClasses("DataFrame")

exportMethods("cache",
exportMethods("arrange",
"cache",
"collect",
"columns",
"count",
"describe",
"distinct",
"dropna",
"dtypes",
"except",
"explain",
"fillna",
"filter",
"first",
"group_by",
"groupBy",
"head",
"insertInto",
@@ -28,12 +40,15 @@ exportMethods("cache",
"join",
"limit",
"orderBy",
"mutate",
"names",
"persist",
"printSchema",
"registerTempTable",
"rename",
"repartition",
"sampleDF",
"sample",
"sample_frac",
"saveAsParquetFile",
"saveAsTable",
"saveDF",
@@ -42,42 +57,68 @@ exportMethods("cache",
"selectExpr",
"show",
"showDF",
"sortDF",
"summarize",
"take",
"unionAll",
"unpersist",
"where",
"withColumn",
"withColumnRenamed")
"withColumnRenamed",
"write.df")

exportClasses("Column")

exportMethods("abs",
"acos",
"alias",
"approxCountDistinct",
"asc",
"asin",
"atan",
"atan2",
"avg",
"cast",
"cbrt",
"ceiling",
"contains",
"cos",
"cosh",
"countDistinct",
"desc",
"endsWith",
"exp",
"expm1",
"floor",
"getField",
"getItem",
"hypot",
"isNotNull",
"isNull",
"last",
"like",
"log",
"log10",
"log1p",
"lower",
"max",
"mean",
"min",
"n",
"n_distinct",
"rint",
"rlike",
"sign",
"sin",
"sinh",
"sqrt",
"startsWith",
"substr",
"sum",
"sumDistinct",
"tan",
"tanh",
"toDegrees",
"toRadians",
"upper")

exportClasses("GroupedData")
@@ -94,6 +135,7 @@ export("cacheTable",
"jsonFile",
"loadDF",
"parquetFile",
"read.df",
"sql",
"table",
"tableNames",