
Commit e2e9c64

Merge pull request apache#117 from mesosphere/sparkr-docs
Added documentation for SparkR
2 parents 20a00f8 + 7757a70

4 files changed (+18, -7 lines)

docs/limitations.md

Lines changed: 2 additions & 5 deletions
@@ -5,18 +5,15 @@ feature_maturity: stable
 enterprise: 'no'
 ---
 
-* DC/OS Spark only supports submitting jars and Python scripts. It
-  does not support R.
-
 * Mesosphere does not provide support for Spark app development,
-  such as writing a Python app to process data from Kafka or writing
+  such as writing a Python app to process data from Kafka or writing
   Scala code to process data from HDFS.
 
 * Spark jobs run in Docker containers. The first time you run a
   Spark job on a node, it might take longer than you expect because of
   the `docker pull`.
 
 * DC/OS Spark only supports running the Spark shell from within a
-  DC/OS cluster. See the Spark Shell section for more information.
+  DC/OS cluster. See the Spark Shell section for more information.
   For interactive analytics, we recommend Zeppelin, which supports visualizations and dynamic
   dependency management.

docs/quick-start.md

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,10 @@ enterprise: 'no'
 
        $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/pi.py 30"
 
+1. Run an R Spark job:
+
+       $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/dataframe.R"
+
 1. View your job:
 
    Visit the Spark cluster dispatcher at
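The hosted dataframe.R example itself is not part of this diff. For orientation, a minimal standalone SparkR job along those lines might look like the sketch below; the file name, app name, and session setup are illustrative assumptions (SparkR 2.x session API), not the contents of the hosted script:

    # my-job.R -- hypothetical minimal SparkR job, submitted via `dcos spark run`
    library(SparkR)

    # Unlike the interactive shell, a submitted job must create its own session.
    sparkR.session(appName = "sparkr-sketch")

    # Turn R's built-in `faithful` data set into a distributed Spark DataFrame.
    df <- as.DataFrame(faithful)
    printSchema(df)   # show the inferred schema
    head(df)          # pull the first rows back to the driver

    sparkR.session.stop()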

docs/run-job.md

Lines changed: 2 additions & 1 deletion
@@ -12,9 +12,10 @@ more][13].
 
     $ dcos spark run --submit-args=`--class MySampleClass http://external.website/mysparkapp.jar 30`
 
-
     $ dcos spark run --submit-args="--py-files mydependency.py http://external.website/mysparkapp.py 30"
 
+    $ dcos spark run --submit-args="http://external.website/mysparkapp.R"
+
 `dcos spark run` is a thin wrapper around the standard Spark
 `spark-submit` script. You can submit arbitrary pass-through options
 to this script via the `--submit-args` options.
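Because `--submit-args` passes through to `spark-submit`, standard Spark flags can be combined with the examples above. A hypothetical combination, reusing the placeholder URL and class from this file (`--conf spark.executor.memory=4g` is a standard Spark property):

    $ dcos spark run --submit-args="--conf spark.executor.memory=4g --class MySampleClass http://external.website/mysparkapp.jar 30"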

docs/spark-shell.md

Lines changed: 10 additions & 1 deletion
@@ -7,7 +7,7 @@ enterprise: 'no'
 # Interactive Spark Shell
 
 You can run Spark commands interactively in the Spark shell. The Spark shell is available
-in either Scala or Python.
+in either Scala, Python, or R.
 
 1. SSH into a node in the DC/OS cluster. [Learn how to SSH into your cluster and get the agent node ID](https://dcos.io/docs/latest/administration/access-node/sshcluster/).
 
@@ -27,6 +27,10 @@ in either Scala or Python.
 
        $ ./bin/pyspark --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.4-2.0.1 --conf spark.mesos.executor.home=/opt/spark/dist
 
+   Or, run the R Spark shell.
+
+       $ ./bin/sparkR --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.7-2.1.0-hadoop-2.6 --conf spark.mesos.executor.home=/opt/spark/dist
+
 1. Run Spark commands interactively.
 
    In the Scala shell:
@@ -38,3 +42,8 @@ in either Scala or Python.
 
        $ textFile = sc.textFile("/opt/spark/dist/README.md")
        $ textFile.count()
+
+   In the R shell:
+
+       $ df <- as.DataFrame(faithful)
+       $ head(df)
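The new R shell example stops at `head(df)`. A natural continuation, not part of this commit but using the SparkR 2.x API that the 2.1.0 image above ships with, is to query the DataFrame through SQL:

    $ createOrReplaceTempView(df, "faithful")
    $ waiting <- sql("SELECT waiting FROM faithful WHERE eruptions > 4.0")
    $ head(waiting)

Here `createOrReplaceTempView` registers the DataFrame under a table name so that `sql()` can select from it.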
