
Commit e2e9c64

Merge pull request apache#117 from mesosphere/sparkr-docs
Added documentation for SparkR
2 parents 20a00f8 + 7757a70

4 files changed (+18, -7 lines)

docs/limitations.md

Lines changed: 2 additions & 5 deletions
@@ -5,18 +5,15 @@ feature_maturity: stable
 enterprise: 'no'
 ---
 
-* DC/OS Spark only supports submitting jars and Python scripts. It
-  does not support R.
-
 * Mesosphere does not provide support for Spark app development,
-  such as writing a Python app to process data from Kafka or writing
+  such as writing a Python app to process data from Kafka or writing
   Scala code to process data from HDFS.
 
 * Spark jobs run in Docker containers. The first time you run a
   Spark job on a node, it might take longer than you expect because of
   the `docker pull`.
 
 * DC/OS Spark only supports running the Spark shell from within a
-  DC/OS cluster. See the Spark Shell section for more information.
+  DC/OS cluster. See the Spark Shell section for more information.
   For interactive analytics, we recommend Zeppelin, which supports visualizations and dynamic
   dependency management.

docs/quick-start.md

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,10 @@ enterprise: 'no'
 
        $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/pi.py 30"
 
+1. Run an R Spark job:
+
+       $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/dataframe.R"
+
 1. View your job:
 
    Visit the Spark cluster dispatcher at
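The hosted dataframe.R example itself is not part of this diff. For orientation, a minimal standalone SparkR job along those lines might look like the sketch below; the file name, app name, and session setup are illustrative assumptions (SparkR 2.x session API), not the contents of the hosted script:

    # my-job.R -- hypothetical minimal SparkR job, submitted via `dcos spark run`
    library(SparkR)

    # Unlike the interactive shell, a submitted job must create its own session.
    sparkR.session(appName = "sparkr-sketch")

    # Turn R's built-in `faithful` data set into a distributed Spark DataFrame.
    df <- as.DataFrame(faithful)
    printSchema(df)   # show the inferred schema
    head(df)          # pull the first rows back to the driver

    sparkR.session.stop()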

docs/run-job.md

Lines changed: 2 additions & 1 deletion
@@ -12,9 +12,10 @@ more][13].
 
     $ dcos spark run --submit-args=`--class MySampleClass http://external.website/mysparkapp.jar 30`
 
-
     $ dcos spark run --submit-args="--py-files mydependency.py http://external.website/mysparkapp.py 30"
 
+    $ dcos spark run --submit-args="http://external.website/mysparkapp.R"
+
 `dcos spark run` is a thin wrapper around the standard Spark
 `spark-submit` script. You can submit arbitrary pass-through options
 to this script via the `--submit-args` options.
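Because `--submit-args` passes through to `spark-submit`, standard Spark flags can be combined with the examples above. A hypothetical combination, reusing the placeholder URL and class from this file (`--conf spark.executor.memory=4g` is a standard Spark property):

    $ dcos spark run --submit-args="--conf spark.executor.memory=4g --class MySampleClass http://external.website/mysparkapp.jar 30"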

docs/spark-shell.md

Lines changed: 10 additions & 1 deletion
@@ -7,7 +7,7 @@ enterprise: 'no'
 # Interactive Spark Shell
 
 You can run Spark commands interactively in the Spark shell. The Spark shell is available
-in either Scala or Python.
+in either Scala, Python, or R.
 
 1. SSH into a node in the DC/OS cluster. [Learn how to SSH into your cluster and get the agent node ID](https://dcos.io/docs/latest/administration/access-node/sshcluster/).
 
@@ -27,6 +27,10 @@ in either Scala or Python.
 
        $ ./bin/pyspark --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.4-2.0.1 --conf spark.mesos.executor.home=/opt/spark/dist
 
+   Or, run the R Spark shell.
+
+       $ ./bin/sparkR --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.7-2.1.0-hadoop-2.6 --conf spark.mesos.executor.home=/opt/spark/dist
+
 1. Run Spark commands interactively.
 
    In the Scala shell:
@@ -38,3 +42,8 @@ in either Scala or Python.
 
        $ textFile = sc.textFile("/opt/spark/dist/README.md")
        $ textFile.count()
+
+   In the R shell:
+
+       $ df <- as.DataFrame(faithful)
+       $ head(df)
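The new R shell example stops at `head(df)`. A natural continuation, not part of this commit but using the SparkR 2.x API that the 2.1.0 image above ships with, is to query the DataFrame through SQL:

    $ createOrReplaceTempView(df, "faithful")
    $ waiting <- sql("SELECT waiting FROM faithful WHERE eruptions > 4.0")
    $ head(waiting)

Here `createOrReplaceTempView` registers the DataFrame under a table name so that `sql()` can select from it.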
