Add range partitioner on DataFrame #13923

Closed
wants to merge 3,726 commits into from
3726 commits
60ba436
[SPARK-16453][BUILD] release-build.sh is missing hive-thriftserver fo…
yhuai Jul 8, 2016
3b22291
[SPARK-16387][SQL] JDBC Writer should use dialect to quote field names.
dongjoon-hyun Jul 8, 2016
fd6e8f0
[SPARK-13569][STREAMING][KAFKA] pattern based topic subscription
koeninger Jul 9, 2016
6cef018
[SPARK-16376][WEBUI][SPARK WEB UI][APP-ID] HTTP ERROR 500 when using …
srowen Jul 9, 2016
d8b06f1
[SPARK-16432] Empty blocks fail to serialize due to assert in Chunked…
ericl Jul 9, 2016
b1db26a
[SPARK-11857][MESOS] Deprecate fine grained
Jul 9, 2016
7374e51
[SPARK-16401][SQL] Data Source API: Enable Extending RelationProvider…
gatorsmile Jul 9, 2016
f12a38b
[SPARK-15467][BUILD] update janino version to 3.0.0
kiszk Jul 11, 2016
52b5bb0
[SPARK-16476] Restructure MimaExcludes for easier union excludes
rxin Jul 11, 2016
82f0874
[SPARK-16318][SQL] Implement all remaining xpath functions
petermaxlee Jul 11, 2016
e226278
[SPARK-16355][SPARK-16354][SQL] Fix Bugs When LIMIT/TABLESAMPLE is No…
gatorsmile Jul 11, 2016
9cb1eb7
[SPARK-16381][SQL][SPARKR] Update SQL examples and programming guide …
keypointt Jul 11, 2016
7ac79da
[SPARK-16459][SQL] Prevent dropping current database
dongjoon-hyun Jul 11, 2016
ffcb6e0
[SPARK-16477] Bump master version to 2.1.0-SNAPSHOT
rxin Jul 11, 2016
840853e
[SPARK-16458][SQL] SessionCatalog should support `listColumns` for te…
dongjoon-hyun Jul 11, 2016
2ad031b
[SPARKR][DOC] SparkR ML user guides update for 2.0
yanboliang Jul 11, 2016
7f38b9d
[SPARK-16144][SPARKR] update R API doc for mllib
felixcheung Jul 11, 2016
b4fbe14
[SPARK-16349][SQL] Fall back to isolated class loader when classes no…
Jul 11, 2016
9e2c763
[SPARK-16114][SQL] structured streaming event time window example
jjthomas Jul 12, 2016
05d7151
[MINOR][STREAMING][DOCS] Minor changes on kinesis integration
keypointt Jul 12, 2016
91a443b
[SPARK-16433][SQL] Improve StreamingQuery.explain when no data arrives
zsxwing Jul 12, 2016
e50efd5
[SPARK-16430][SQL][STREAMING] Fixed bug in the maxFilesPerTrigger in …
tdas Jul 12, 2016
9cc74f9
[SPARK-16488] Fix codegen variable namespace collision in pmod and pa…
sameeragarwal Jul 12, 2016
b1e5281
[SPARK-12639][SQL] Mark Filters Fully Handled By Sources with *
RussellSpitzer Jul 12, 2016
c9a6762
[SPARK-16199][SQL] Add a method to list the referenced columns in dat…
petermaxlee Jul 12, 2016
fc11c50
[MINOR][ML] update comment where is inconsistent with code in ml.regr…
WeichenXu123 Jul 12, 2016
5b28e02
[SPARK-16189][SQL] Add ExternalRDD logical plan for input with RDD to…
ueshin Jul 12, 2016
6cb75db
[SPARK-16470][ML][OPTIMIZER] Check linear regression training whether…
WeichenXu123 Jul 12, 2016
5ad68ba
[SPARK-15752][SQL] Optimize metadata only query that has an aggregate…
lianhuiwang Jul 12, 2016
c377e49
[SPARK-16489][SQL] Guard against variable reuse mistakes in expressio…
rxin Jul 12, 2016
d513c99
[SPARK-16414][YARN] Fix bugs for "Can not get user config when callin…
sharkdtu Jul 12, 2016
68df47a
[SPARK-16405] Add metrics and source for external shuffle service
lovexi Jul 12, 2016
7f96886
[SPARK-16119][SQL] Support PURGE option to drop table / partition.
Jul 12, 2016
56bd399
[SPARK-16284][SQL] Implement reflect SQL function
petermaxlee Jul 13, 2016
1c58fa9
[SPARK-16514][SQL] Fix various regex codegen bugs
ericl Jul 13, 2016
772c213
[SPARK-16303][DOCS][EXAMPLES] Updated SQL programming guide and examples
Jul 13, 2016
c190d89
[SPARK-15889][STREAMING] Follow-up fix to erroneous condition in Stre…
srowen Jul 13, 2016
f156136
[SPARK-16375][WEB UI] Fixed misassigned var: numCompletedTasks was as…
ajbozarth Jul 13, 2016
f73891e
[MINOR] Fix Java style errors and remove unused imports
keypointt Jul 13, 2016
83879eb
[SPARK-16439] Fix number formatting in SQL UI
Jul 13, 2016
bf107f1
[SPARK-16438] Add Asynchronous Actions documentation
phalodi Jul 13, 2016
3d6f679
[MINOR][YARN] Fix code error in yarn-cluster unit test
sharkdtu Jul 13, 2016
51ade51
[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causin…
srowen Jul 13, 2016
ea06e4e
[SPARK-16469] enhanced simulate multiply
uzadude Jul 13, 2016
f376c37
[SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown pre…
jiangxb1987 Jul 13, 2016
d8220c1
[SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is les…
jerryshao Jul 13, 2016
01f09b1
[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotatio…
jkbradley Jul 13, 2016
0744d84
[SPARK-16531][SQL][TEST] Remove timezone setting from DataFrameTimeWi…
brkyvz Jul 13, 2016
51a6706
[SPARK-16114][SQL] updated structured streaming guide
jjthomas Jul 13, 2016
b4baf08
[SPARKR][MINOR] R examples and test updates
felixcheung Jul 13, 2016
fb2e8ee
[SPARKR][DOCS][MINOR] R programming guide to include csv data source …
felixcheung Jul 13, 2016
c5ec879
[SPARK-16482][SQL] Describe Table Command for Tables Requiring Runtim…
gatorsmile Jul 13, 2016
a5f51e2
[SPARK-16485][ML][DOC] Fix privacy of GLM members, rename sqlDataType…
jkbradley Jul 13, 2016
9c53057
[SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpark Shell
dongjoon-hyun Jul 14, 2016
39c836e
[SPARK-16503] SparkSession should provide Spark version
lw-lin Jul 14, 2016
db7317a
[SPARK-16448] RemoveAliasOnlyProject should not remove alias with met…
cloud-fan Jul 14, 2016
252d4f2
[SPARK-16500][ML][MLLIB][OPTIMIZER] add LBFGS convergence warning for…
WeichenXu123 Jul 14, 2016
e3f8a03
[SPARK-16403][EXAMPLES] Cleanup to remove unused imports, consistent …
BryanCutler Jul 14, 2016
c4bc2ed
[SPARK-14963][MINOR][YARN] Fix typo in YarnShuffleService recovery fi…
jerryshao Jul 14, 2016
b7b5e17
[SPARK-16505][YARN] Optionally propagate error during shuffle service…
Jul 14, 2016
1b5c9e5
[SPARK-16530][SQL][TRIVIAL] Wrong Parser Keyword in ALTER TABLE CHANG…
gatorsmile Jul 14, 2016
56183b8
[SPARK-16543][SQL] Rename the columns of `SHOW PARTITION/COLUMNS` com…
dongjoon-hyun Jul 14, 2016
093ebbc
[SPARK-16509][SPARKR] Rename window.partitionBy and window.orderBy to…
sun-rui Jul 14, 2016
12005c8
[SPARK-16538][SPARKR] fix R call with namespace operator on SparkSess…
felixcheung Jul 14, 2016
c576f9f
[SPARK-16529][SQL][TEST] `withTempDatabase` should set `default` data…
dongjoon-hyun Jul 14, 2016
31ca741
[SPARK-16528][SQL] Fix NPE problem in HiveClientImpl
jacek-lewandowski Jul 14, 2016
91575ca
[SPARK-16540][YARN][CORE] Avoid adding jars twice for Spark running o…
jerryshao Jul 14, 2016
01c4c1f
[SPARK-16553][DOCS] Fix SQL example file name in docs
shivaram Jul 14, 2016
972673a
[SPARK-16555] Work around Jekyll error-handling bug which led to sile…
JoshRosen Jul 14, 2016
2e4075e
[SPARK-16557][SQL] Remove stale doc in sql/README.md
rxin Jul 15, 2016
1832423
[SPARK-16546][SQL][PYSPARK] update python dataframe.drop
WeichenXu123 Jul 15, 2016
71ad945
[SPARK-16426][MLLIB] Fix bug that caused NaNs in IsotonicRegression
neggert Jul 15, 2016
5ffd5d3
[SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLl…
jkbradley Jul 15, 2016
611a8ca
[SPARK-16538][SPARKR] Add more tests for namespace call to SparkSessi…
felixcheung Jul 15, 2016
b2f24f9
[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if ther…
tejasapatil Jul 15, 2016
a1ffbad
[SPARK-16582][SQL] Explicitly define isNull = false for non-nullable …
sameeragarwal Jul 16, 2016
5ec0d69
[SPARK-3359][DOCS] More changes to resolve javadoc 8 errors that will…
srowen Jul 16, 2016
4167304
[SPARK-16112][SPARKR] Programming guide for gapply/gapplyCollect
Jul 16, 2016
c33e4b0
[SPARK-16507][SPARKR] Add a CRAN checker, fix Rd aliases
shivaram Jul 17, 2016
7b84758
[SPARK-16584][SQL] Move regexp unit tests to RegexpExpressionsSuite
rxin Jul 17, 2016
d27fe9b
[SPARK-16027][SPARKR] Fix R tests SparkSession init/stop
felixcheung Jul 18, 2016
480c870
[SPARK-16588][SQL] Deprecate monotonicallyIncreasingId in Scala/Java
rxin Jul 18, 2016
a529fc9
[MINOR][TYPO] fix fininsh typo
WeichenXu123 Jul 18, 2016
8ea3f4e
[SPARK-16055][SPARKR] warning added while using sparkPackages with sp…
krishnakalyan3 Jul 18, 2016
2877f1a
[SPARK-16351][SQL] Avoid per-record type dispatch in JSON when writing
HyukjinKwon Jul 18, 2016
96e9afa
[SPARK-16515][SQL] set default record reader and writer for script tr…
adrian-wang Jul 18, 2016
75f0efe
[SPARKR][DOCS] minor code sample update in R programming guide
felixcheung Jul 18, 2016
ea78edb
[SPARK-16590][SQL] Improve LogicalPlanToSQLSuite to check generated S…
dongjoon-hyun Jul 19, 2016
c4524f5
[HOTFIX] Fix Scala 2.10 compilation
rxin Jul 19, 2016
69c7730
[SPARK-16615][SQL] Expose sqlContext in SparkSession
rxin Jul 19, 2016
e5fbb18
[MINOR] Remove unused arg in als.py
zhengruifeng Jul 19, 2016
1426a08
[SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example update
liancheng Jul 19, 2016
6ee40d2
[DOC] improve python doc for rdd.histogram and dataframe.join
mortada Jul 19, 2016
556a943
[MINOR][BUILD] Fix Java Linter `LineLength` errors
dongjoon-hyun Jul 19, 2016
21a6dd2
[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant de…
keypointt Jul 19, 2016
6caa220
[MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuations and grammar
ahmed-mahran Jul 19, 2016
8310c07
[SPARK-16600][MLLIB] fix some latex formula syntax error
WeichenXu123 Jul 19, 2016
6c4b9f4
[SPARK-16395][STREAMING] Fail if too many CheckpointWriteHandlers are…
srowen Jul 19, 2016
5d92326
[SPARK-16478] graphX (added graph caching in strongly connected compo…
Jul 19, 2016
6708914
[SPARK-16494][ML] Upgrade breeze version to 0.12
yanboliang Jul 19, 2016
0bd76e8
[SPARK-16620][CORE] Add back the tokenization process in `RDD.pipe(co…
lw-lin Jul 19, 2016
162d04a
[SPARK-16602][SQL] `Nvl` function should support numeric-string cases
dongjoon-hyun Jul 19, 2016
2ae7b88
[SPARK-15705][SQL] Change the default value of spark.sql.hive.convert…
yhuai Jul 19, 2016
004e29c
[SPARK-14702] Make environment of SparkLauncher launched process more…
Jul 20, 2016
9674af6
[SPARK-16568][SQL][DOCUMENTATION] update sql programming guide refres…
WeichenXu123 Jul 20, 2016
fc23263
[SPARK-10683][SPARK-16510][SPARKR] Move SparkR include jar test to Sp…
shivaram Jul 20, 2016
75146be
[SPARK-16632][SQL] Respect Hive schema when merging parquet schema.
Jul 20, 2016
0dc79ff
[SPARK-16440][MLLIB] Destroy broadcasted variables even on driver
Jul 20, 2016
95abbe5
[SPARK-15923][YARN] Spark Application rest api returns 'no such app: …
weiqingy Jul 20, 2016
4b079dc
[SPARK-16613][CORE] RDD.pipe returns values for empty partitions
srowen Jul 20, 2016
b9bab4d
[SPARK-15951] Change Executors Page to use datatables to support sort…
kishorvpatil Jul 20, 2016
e3cd5b3
[SPARK-16634][SQL] Workaround JVM bug by moving some code out of ctor.
Jul 20, 2016
e651900
[SPARK-16344][SQL] Decoding Parquet array of struct with a single fie…
liancheng Jul 20, 2016
75a06aa
[SPARK-16272][CORE] Allow config values to reference conf, env, syste…
Jul 21, 2016
cfa5ae8
[SPARK-16644][SQL] Aggregate should not propagate constraints contain…
cloud-fan Jul 21, 2016
1bf13ba
[MINOR][DOCS][STREAMING] Minor docfix schema of csv rather than parqu…
holdenk Jul 21, 2016
864b764
[SPARK-16226][SQL] Weaken JDBC isolation level to avoid locking when …
srowen Jul 21, 2016
8674054
[SPARK-16632][SQL] Use Spark requested schema to guide vectorized Par…
liancheng Jul 21, 2016
6203668
[SPARK-16640][SQL] Add codegen for Elt function
viirya Jul 21, 2016
69626ad
[SPARK-16632][SQL] Revert PR #14272: Respect Hive schema when merging…
liancheng Jul 21, 2016
235cb25
[SPARK-16194] Mesos Driver env vars
Jul 21, 2016
9abd99b
[SPARK-16656][SQL] Try to make CreateTableAsSelectSuite more stable
yhuai Jul 21, 2016
46f80a3
[SPARK-16334] Maintain single dictionary per row-batch in vectorized …
sameeragarwal Jul 21, 2016
df2c6d5
[SPARK-16287][SQL] Implement str_to_map SQL function
techaddict Jul 22, 2016
94f14b5
[SPARK-16556][SPARK-16559][SQL] Fix Two Bugs in Bucket Specification
gatorsmile Jul 22, 2016
e1bd70f
[SPARK-16287][HOTFIX][BUILD][SQL] Fix annotation argument needs to be…
jaceklaskowski Jul 22, 2016
2c72a44
[SPARK-16487][STREAMING] Fix some batches might not get marked as ful…
ahmed-mahran Jul 22, 2016
b4e16bd
[GIT] add pydev & Rstudio project file to gitignore list
WeichenXu123 Jul 22, 2016
6c56fff
[SPARK-16650] Improve documentation of spark.task.maxFailures
Jul 22, 2016
47f5b88
[SPARK-16651][PYSPARK][DOC] Make `withColumnRenamed/drop` description…
dongjoon-hyun Jul 22, 2016
e10b874
[SPARK-16622][SQL] Fix NullPointerException when the returned value o…
viirya Jul 23, 2016
25db516
[SPARK-16561][MLLIB] fix multivarOnlineSummary min/max bug
WeichenXu123 Jul 23, 2016
ab6e4ae
[SPARK-16662][PYSPARK][SQL] fix HiveContext warning bug
WeichenXu123 Jul 23, 2016
86c2752
[SPARK-16690][TEST] rename SQLTestUtils.withTempTable to withTempView
cloud-fan Jul 23, 2016
53b2456
[SPARK-16380][EXAMPLES] Update SQL examples and programming guide for…
liancheng Jul 23, 2016
e3c7039
[MINOR] Close old PRs that should be closed but have not been
srowen Jul 24, 2016
d6795c7
[SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows...
lw-lin Jul 24, 2016
cc1d2dc
[SPARK-16463][SQL] Support `truncate` option in Overwrite mode for JD…
dongjoon-hyun Jul 24, 2016
37bed97
[PYSPARK] add picklable SparseMatrix in pyspark.ml.common
WeichenXu123 Jul 24, 2016
23e047f
[SPARK-16416][CORE] force eager creation of loggers to avoid shutdown…
Jul 24, 2016
1221ce0
[SPARK-16645][SQL] rename CatalogStorageFormat.serdeProperties to pro…
cloud-fan Jul 25, 2016
daace60
[SPARK-5581][CORE] When writing sorted map output file, avoid open / …
bchocho Jul 25, 2016
468a3c3
[SPARK-16699][SQL] Fix performance bug in hash aggregate on long stri…
ooq Jul 25, 2016
68b4020
[SPARK-16648][SQL] Make ignoreNullsExpr a child expression of First a…
liancheng Jul 25, 2016
7ffd99e
[SPARK-16674][SQL] Avoid per-record type dispatch in JDBC when reading
HyukjinKwon Jul 25, 2016
d27d362
[SPARK-16660][SQL] CreateViewCommand should not take CatalogTable
cloud-fan Jul 25, 2016
64529b1
[SPARK-16691][SQL] move BucketSpec to catalyst module and use it in C…
cloud-fan Jul 25, 2016
d6a5217
[SPARK-16668][TEST] Test parquet reader for row groups containing bot…
sameeragarwal Jul 25, 2016
79826f3
[SPARK-16698][SQL] Field names having dots should be allowed for data…
HyukjinKwon Jul 25, 2016
7ea6d28
[SPARK-16703][SQL] Remove extra whitespace in SQL generation for wind…
liancheng Jul 25, 2016
b73defd
[SPARKR][DOCS] fix broken url in doc
felixcheung Jul 25, 2016
ad3708e
[SPARK-16653][ML][OPTIMIZER] update ANN convergence tolerance param d…
WeichenXu123 Jul 25, 2016
dd784a8
[SPARK-16685] Remove audit-release scripts.
rxin Jul 25, 2016
978cd5f
[SPARK-15271][MESOS] Allow force pulling executor docker images
philipphoffmann Jul 25, 2016
3b6e1d0
[SPARK-16485][DOC][ML] Fixed several inline formatting in ml features…
lins05 Jul 25, 2016
fc17121
Revert "[SPARK-15271][MESOS] Allow force pulling executor docker images"
JoshRosen Jul 25, 2016
cda4603
[SQL][DOC] Fix a default name for parquet compression
maropu Jul 25, 2016
f5ea7fe
[SPARK-16166][CORE] Also take off-heap memory usage into consideratio…
jerryshao Jul 25, 2016
12f490b
[SPARK-16715][TESTS] Fix a potential ExprId conflict for Subexpressio…
zsxwing Jul 25, 2016
c979c8b
[SPARK-14131][STREAMING] SQL Improved fix for avoiding potential dead…
tdas Jul 25, 2016
db36e1e
[SPARK-15590][WEBUI] Paginate Job Table in Jobs tab
nblintao Jul 26, 2016
e164a04
[SPARK-16722][TESTS] Fix a StreamingContext leak in StreamingContextS…
zsxwing Jul 26, 2016
3fc4566
[SPARK-16678][SPARK-16677][SQL] Fix two View-related bugs
gatorsmile Jul 26, 2016
ba0aade
Fix description of spark.speculation.quantile
nwbvt Jul 26, 2016
8a8d26f
[SPARK-16672][SQL] SQLBuilder should not raise exceptions on EXISTS q…
dongjoon-hyun Jul 26, 2016
f99e34e
[SPARK-16724] Expose DefinedByConstructorParams
marmbrus Jul 26, 2016
815f3ee
[SPARK-16633][SPARK-16642][SPARK-16721][SQL] Fixes three issues relat…
yhuai Jul 26, 2016
7b06a89
[SPARK-16686][SQL] Remove PushProjectThroughSample since it is handle…
viirya Jul 26, 2016
6959061
[SPARK-16706][SQL] support java map in encoder
cloud-fan Jul 26, 2016
03c2743
[TEST][STREAMING] Fix flaky Kafka rate controlling test
tdas Jul 26, 2016
3b2b785
[SPARK-16675][SQL] Avoid per-record type dispatch in JDBC when writing
HyukjinKwon Jul 26, 2016
4c96955
[SPARK-16697][ML][MLLIB] improve LDA submitMiniBatch method to avoid …
WeichenXu123 Jul 26, 2016
a2abb58
[SPARK-16663][SQL] desc table should be consistent between data sourc…
cloud-fan Jul 26, 2016
0869b3a
[SPARK-15271][MESOS] Allow force pulling executor docker images
philipphoffmann Jul 26, 2016
0b71d9a
[SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue si…
dhruve Jul 26, 2016
738b4cc
[SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGenerator
ooq Jul 27, 2016
5b8e848
[SPARK-16621][SQL] Generate stable SQLs in SQLBuilder
dongjoon-hyun Jul 27, 2016
ef0ccbc
[SPARK-16729][SQL] Throw analysis exception for invalid date casts
petermaxlee Jul 27, 2016
3c3371b
[MINOR][ML] Fix some mistake in LinearRegression formula.
yanboliang Jul 27, 2016
045fc36
[MINOR][DOC][SQL] Fix two documents regarding size in bytes
viirya Jul 27, 2016
7e8279f
[SPARK-15254][DOC] Improve ML pipeline Cross Validation Scaladoc & PyDoc
krishnakalyan3 Jul 27, 2016
70f846a
[SPARK-5847][CORE] Allow for configuring MetricsSystem's use of app I…
markgrover Jul 27, 2016
bc4851a
[MINOR][DOC] missing keyword new
Jul 27, 2016
b14d7b5
[SPARK-16110][YARN][PYSPARK] Fix allowing python version to be specif…
KevinGrealish Jul 27, 2016
11d427c
[SPARK-16730][SQL] Implement function aliases for type casts
petermaxlee Jul 28, 2016
5c2ae79
[SPARK-15232][SQL] Add subquery SQL building tests to LogicalPlanToSQ…
dongjoon-hyun Jul 28, 2016
762366f
[SPARK-16552][SQL] Store the Inferred Schemas into External Catalog T…
gatorsmile Jul 28, 2016
9ade77c
[SPARK-16639][SQL] The query with having condition that contains grou…
viirya Jul 28, 2016
1178d61
[SPARK-16740][SQL] Fix Long overflow in LongToUnsafeRowMap
sylvinus Jul 28, 2016
3fd39b8
[SPARK-16764][SQL] Recommend disabling vectorized parquet reader on O…
sameeragarwal Jul 28, 2016
274f3b9
[SPARK-16772] Correct API doc references to PySpark classes + formatt…
nchammas Jul 28, 2016
d1d5069
[SPARK-16664][SQL] Fix persist call on Data frames with more than 200…
Jul 29, 2016
0557a45
[SPARK-16750][ML] Fix GaussianMixture training failed due to feature …
yanboliang Jul 29, 2016
04a2c07
[SPARK-16751] Upgrade derby to 10.12.1.1
a-roberts Jul 29, 2016
266b92f
[SPARK-16637] Unified containerizer
Jul 29, 2016
2c15323
[SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md
sundapeng Jul 29, 2016
2182e43
[SPARK-16772][PYTHON][DOCS] Restore "datatype string" to Python API d…
nchammas Jul 29, 2016
bbc2475
[SPARK-16748][SQL] SparkExceptions during planning should not wrapped…
tdas Jul 30, 2016
0dc4310
[SPARK-16694][CORE] Use for/foreach rather than map for Unit expressi…
srowen Jul 30, 2016
bce354c
[SPARK-16696][ML][MLLIB] destroy KMeans bcNewCenters when loop finish…
WeichenXu123 Jul 30, 2016
a6290e5
[SPARK-16800][EXAMPLES][ML] Fix Java examples that fail to run due to…
BryanCutler Jul 30, 2016
957a8ab
[SPARK-16818] Exchange reuse incorrectly reuses scans over different …
ericl Jul 31, 2016
7c27d07
[SPARK-16812] Open up SparkILoop.getAddedJars
rxin Jul 31, 2016
064d91f
[SPARK-16813][SQL] Remove private[sql] and private[spark] from cataly…
rxin Jul 31, 2016
301fb0d
[SPARK-16731][SQL] use StructType in CatalogTable and remove CatalogC…
cloud-fan Aug 1, 2016
579fbcf
[SPARK-16805][SQL] Log timezone when query result does not match
rxin Aug 1, 2016
64d8f37
[SPARK-16726][SQL] Improve `Union/Intersect/Except` error messages on…
dongjoon-hyun Aug 1, 2016
2a0de7d
[SPARK-16485][DOC][ML] Remove useless latex in a log messge.
lins05 Aug 1, 2016
1e9b59b
[SPARK-16778][SQL][TRIVIAL] Fix deprecation warning with SQLContext
holdenk Aug 1, 2016
f93ad4f
[SPARK-16776][STREAMING] Replace deprecated API in KafkaTestUtils for…
HyukjinKwon Aug 1, 2016
338a98d
[SPARK-16791][SQL] cast struct with timestamp field fails
Aug 1, 2016
ab1e761
[SPARK-16774][SQL] Fix use of deprecated timestamp constructor & impr…
holdenk Aug 1, 2016
03d46aa
[SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressL…
zsxwing Aug 1, 2016
2eedc00
[SPARK-16828][SQL] remove MaxOf and MinOf
cloud-fan Aug 2, 2016
5184df0
[SPARK-16793][SQL] Set the temporary warehouse path to sc'conf in Tes…
jiangxb1987 Aug 2, 2016
10e1c0e
[SPARK-16734][EXAMPLES][SQL] Revise examples of all language bindings
liancheng Aug 2, 2016
a1ff72e
[SPARK-16850][SQL] Improve type checking error message for greatest/l…
petermaxlee Aug 2, 2016
d9e0919
[SPARK-16851][ML] Incorrect threshould length in 'setThresholds()' ev…
zhengruifeng Aug 2, 2016
dd8514f
[SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use M…
yinxusen Aug 2, 2016
511dede
[SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap (master branch)
Aug 2, 2016
36827dd
[SPARK-16822][DOC] Support latex in scaladoc.
lins05 Aug 2, 2016
1dab63d
[SPARK-16837][SQL] TimeWindow incorrectly drops slideDuration in cons…
tmagrino Aug 2, 2016
146001a
[SPARK-16062] [SPARK-15989] [SQL] Fix two bugs of Python-only UDTs
viirya Aug 2, 2016
2330f3e
[SPARK-16836][SQL] Add support for CURRENT_DATE/CURRENT_TIMESTAMP lit…
hvanhovell Aug 2, 2016
cbdff49
[SPARK-16816] Modify java example which is also reflect in documentat…
phalodi Aug 2, 2016
a9beeaa
[SPARK-16855][SQL] move Greatest and Least from conditionalExpression…
cloud-fan Aug 2, 2016
e9fc0b6
[SPARK-16787] SparkContext.addFile() should not throw if called twice…
JoshRosen Aug 2, 2016
b73a570
[SPARK-16858][SQL][TEST] Removal of TestHiveSharedState
gatorsmile Aug 2, 2016
3861273
[SPARK-16796][WEB UI] Visible passwords on Spark environment page
Devian-ua Aug 2, 2016
ae22628
[SQL][MINOR] use stricter type parameter to make it clear that parque…
cloud-fan Aug 3, 2016
639df04
[SPARK-16831][PYTHON] Fixed bug in CrossValidator.avgMetrics
pkch Aug 3, 2016
b55f343
[SPARK-16714][SPARK-16735][SPARK-16646] array, map, greatest, least's…
cloud-fan Aug 3, 2016
e6f226c
[SPARK-16596] [SQL] Refactor DataSourceScanExec to do partition disco…
ericl Aug 3, 2016
685b08e
[SPARK-14204][SQL] register driverClass rather than user-specified class
mchalek Aug 3, 2016
4775eb4
[SPARK-16770][BUILD] Fix JLine dependency management and version (Sca…
stsc-pentasys Aug 4, 2016
c5eb1df
[SPARK-16814][SQL] Fix deprecated parquet constructor usage
holdenk Aug 4, 2016
583d91a
[SPARK-16873][CORE] Fix SpillReader NPE when spillFile has no data
sharkdtu Aug 4, 2016
780c722
[MINOR][SQL] Fix minor formatting issue of SortAggregateExec.toString
liancheng Aug 4, 2016
27e815c
[SPARK-16888][SQL] Implements eval method for expression AssertNotNull
clockfly Aug 4, 2016
43f4fd6
[SPARK-16867][SQL] createTable and alterTable in ExternalCatalog shou…
cloud-fan Aug 4, 2016
9d7a474
[SPARK-16853][SQL] fixes encoder error in DataSet typed select
clockfly Aug 4, 2016
9d4e621
[SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap
Aug 4, 2016
ac2a26d
[SPARK-16884] Move DataSourceScanExec out of ExistingRDD.scala file
ericl Aug 4, 2016
be8ea4b
[SPARK-16875][SQL] Add args checking for DataSet randomSplit and sample
zhengruifeng Aug 4, 2016
462784f
[SPARK-16880][ML][MLLIB] make ann training data persisted if needed
WeichenXu123 Aug 4, 2016
1d78157
[SPARK-16877][BUILD] Add rules for preventing to use Java annotations…
HyukjinKwon Aug 4, 2016
0e2e5d7
[SPARK-16863][ML] ProbabilisticClassifier.fit check threshoulds' length
zhengruifeng Aug 4, 2016
9c15d07
[SPARK-15074][SHUFFLE] Cache shuffle index file to speedup shuffle fetch
Aug 4, 2016
d91c675
[HOTFIX] Remove unnecessary imports from #12944 that broke build
JoshRosen Aug 4, 2016
53e766c
MAINTENANCE. Cleaning up stale PRs.
Aug 4, 2016
The diff you're trying to view is too large. We only load the first 3000 changed files.
12 changes: 12 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE
@@ -0,0 +1,12 @@
## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

104 changes: 56 additions & 48 deletions .gitignore
@@ -1,76 +1,84 @@
*~
*.#*
*#*#
*.swp
*.ipr
*.#*
*.iml
*.ipr
*.iws
*.pyc
*.pyo
*.swp
*~
.DS_Store
.cache
.classpath
.ensime
.ensime_cache/
.ensime_lucene
.generated-mima*
.idea/
.idea_modules/
build/*.jar
.project
.pydevproject
.scala_dependencies
.settings
.cache
cache
.generated-mima*
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
/lib/
R-unit-tests.log
R/unit-tests.out
build/*.jar
build/apache-maven*
build/zinc*
build/scala*
conf/java-opts
conf/*.sh
build/zinc*
cache
checkpoint
conf/*.cmd
conf/*.properties
conf/*.conf
conf/*.properties
conf/*.sh
conf/*.xml
conf/java-opts
conf/slaves
dependency-reduced-pom.xml
derby.log
dev/create-release/*final
dev/create-release/*txt
dist/
docs/_site
docs/api
target/
reports/
.project
.classpath
.scala_dependencies
lib_managed/
src_managed/
lint-r-report.log
log/
logs/
out/
project/boot/
project/plugins/project/build.properties
project/build/target/
project/plugins/target/
project/plugins/lib_managed/
project/plugins/project/build.properties
project/plugins/src_managed/
logs/
log/
project/plugins/target/
python/lib/pyspark.zip
reports/
scalastyle-on-compile.generated.xml
scalastyle-output.xml
scalastyle.txt
spark-*-bin-*.tgz
spark-tests.log
src_managed/
streaming-tests.log
dependency-reduced-pom.xml
.ensime
.ensime_lucene
checkpoint
derby.log
dist/
dev/create-release/*txt
dev/create-release/*final
spark-*-bin-*.tgz
target/
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
R/unit-tests.out
python/lib/pyspark.zip
lint-r-report.log
work/

# For Hive
metastore_db/
metastore/
warehouse/
TempStatsStore/
metastore/
metastore_db/
sql/hive-thriftserver/test_warehouses
warehouse/
spark-warehouse/

# For R session data
.RData
.RHistory
.Rhistory
*.Rproj
*.Rproj.*

85 changes: 0 additions & 85 deletions .rat-excludes

This file was deleted.

51 changes: 51 additions & 0 deletions .travis.yml
@@ -0,0 +1,51 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Spark provides this Travis CI configuration file to help contributors
# check Scala/Java style conformance and JDK7/8 compilation easily
# during their preparing pull requests.
# - Scalastyle is executed during `maven install` implicitly.
# - Java Checkstyle is executed by `lint-java`.
# See the related discussion here.
# https://github.com/apache/spark/pull/12980

# 1. Choose OS (Ubuntu 14.04.3 LTS Server Edition 64bit, ~2 CORE, 7.5GB RAM)
sudo: required
dist: trusty

# 2. Choose language and target JDKs for parallel builds.
language: java
jdk:
  - oraclejdk7
  - oraclejdk8

# 3. Setup cache directory for SBT and Maven.
cache:
  directories:
  - $HOME/.sbt
  - $HOME/.m2

# 4. Turn off notifications.
notifications:
  email: false

# 5. Run maven install before running lint-java.
install:
  - export MAVEN_SKIP_RC=1
  - build/mvn -T 4 -q -DskipTests -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive -Phive-thriftserver install

# 6. Run lint-java.
script:
  - dev/lint-java
37 changes: 21 additions & 16 deletions LICENSE
@@ -1,4 +1,3 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -237,9 +236,9 @@ The following components are provided under a BSD-style license. See project lin
The text of each license is also included at licenses/LICENSE-[project].txt.

(BSD 3 Clause) netlib core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.2.7 - https://github.com/jpmml/jpmml-model)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD License) ANTLR 4.5.2-1 (org.antlr:antlr4:4.5.2-1 - http://wwww.antlr.org/)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
(BSD licence) ANTLR StringTemplate (org.antlr:stringtemplate:3.2.1 - http://www.stringtemplate.org)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
@@ -250,22 +249,21 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.10.5 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.10:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.10:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.10:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware.kryo:kryo:2.21 - http://code.google.com/p/kryo/)
(New BSD License) MinLog (com.esotericsoftware.minlog:minlog:1.2 - http://code.google.com/p/minlog/)
(New BSD License) ReflectASM (com.esotericsoftware.reflectasm:reflectasm:1.07 - http://code.google.com/p/reflectasm/)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware:kryo:3.0.3 - https://github.com/EsotericSoftware/kryo)
(New BSD License) MinLog (com.esotericsoftware:minlog:1.3.0 - https://github.com/EsotericSoftware/minlog)
(New BSD license) Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.1 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -284,11 +282,18 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) SLF4J API Module (org.slf4j:slf4j-api:1.7.5 - http://www.slf4j.org)
(MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(MIT License) scopt (com.github.scopt:scopt_2.11:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
(MIT License) graphlib-dot (https://github.com/cpettitt/graphlib-dot)
(MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)
(MIT License) sorttable (https://github.com/stuartlangridge/sorttable)
(MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)
(MIT License) datatables (http://datatables.net/license)
(MIT License) mustache (https://github.com/mustache/mustache/blob/master/LICENSE)
(MIT License) cookies (http://code.google.com/p/cookies/wiki/License)
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)