docs/running-on-yarn.md
11 additions & 13 deletions
@@ -16,14 +16,12 @@ containers used by the application use the same configuration. If the configurat
 Java system properties or environment variables not managed by YARN, they should also be set in the
 Spark application's configuration (driver, executors, and the AM when running in client mode).
 
-There are two deploy modes that can be used to launch Spark applications on YARN. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
-(Default: `--deploy-mode client`)
+There are two deploy modes that can be used to launch Spark applications on YARN. In `yarn-cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `yarn-client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
 
-Unlike in Spark standalone and Mesos mode, in which the master's address is specified in the "master" parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the master parameter is yarn. For a specific yarn deployment, use --deploy-mode to specify yarn-cluster or yarn-client.
+Unlike in Spark standalone and Mesos mode, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn-client` or `yarn-cluster`.
 
-To launch a Spark application in yarn-cluster mode:
+To launch a Spark application in `yarn-cluster` mode:
 
 The above starts a YARN client program which starts the default Application Master. Then SparkPi will be run as a child thread of Application Master. The client will periodically poll the Application Master for status updates and display them in the console. The client will exit once your application has finished running. Refer to the "Debugging your Application" section below for how to see driver and executor logs.
 
-To launch a Spark application in yarn-client mode, do the same, but replace "yarn-cluster" with "yarn-client". To run spark-shell:
+To launch a Spark application in `yarn-client` mode, do the same, but replace `yarn-cluster` with `yarn-client`. To run spark-shell:
 
     $ ./bin/spark-shell --master yarn-client
 
 ## Adding Other JARs
 
-In yarn-cluster mode, the driver runs on a different machine than the client, so `SparkContext.addJar` won't work out of the box with files that are local to the client. To make files on the client available to `SparkContext.addJar`, include them with the `--jars` option in the launch command.
+In `yarn-cluster` mode, the driver runs on a different machine than the client, so `SparkContext.addJar` won't work out of the box with files that are local to the client. To make files on the client available to `SparkContext.addJar`, include them with the `--jars` option in the launch command.
 
     $ ./bin/spark-submit --class my.main.Class \
         --master yarn-cluster \
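As a concrete sketch of the `yarn-cluster` launch described above, the command below is assembled as a string so each piece is visible. The `SparkPi` class ships with the Spark examples, but the jar path varies by version and install, so treat it as a placeholder:

```shell
# Sketch of a yarn-cluster launch, built as a string for illustration.
# The examples jar path is a placeholder; adjust it for your distribution.
SUBMIT_CMD="./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  lib/spark-examples.jar \
  10"
echo "$SUBMIT_CMD"
```

Running the same application in `yarn-client` mode only requires replacing `yarn-cluster` with `yarn-client` in the `--master` value.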
@@ -129,8 +127,8 @@ If you need a reference to the proper location to put log files in the YARN so t
   <td><code>spark.yarn.am.waitTime</code></td>
   <td>100s</td>
   <td>
-    In yarn-cluster mode, time for the application master to wait for the
-    SparkContext to be initialized. In yarn-client mode, time for the application master to wait
+    In `yarn-cluster` mode, time for the application master to wait for the
+    SparkContext to be initialized. In `yarn-client` mode, time for the application master to wait
     for the driver to connect to it.
   </td>
 </tr>
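For reference, a property like `spark.yarn.am.waitTime` is typically set in `conf/spark-defaults.conf` (or via `--conf` on the `spark-submit` command line); the value below is purely illustrative, not a recommendation:

    spark.yarn.am.waitTime  200s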
@@ -255,8 +253,8 @@ If you need a reference to the proper location to put log files in the YARN so t
   <td>
     Add the environment variable specified by <code>EnvironmentVariableName</code> to the
     Application Master process launched on YARN. The user can specify multiple of
-    these and to set multiple environment variables. In yarn-cluster mode this controls
-    the environment of the SPARK driver and in yarn-client mode it only controls
+    these and to set multiple environment variables. In `yarn-cluster` mode this controls
+    the environment of the SPARK driver and in `yarn-client` mode it only controls
     the environment of the executor launcher.
   </td>
 </tr>
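As a sketch of how multiple `spark.yarn.appMasterEnv.*` variables are passed, each one gets its own `--conf` flag. The variable names, values, and `app.jar` below are illustrative, not part of the documented example:

```shell
# One --conf per AM environment variable; names, values, and app.jar
# are placeholders for illustration only.
SUBMIT_CMD="./bin/spark-submit \
  --master yarn-cluster \
  --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java \
  --conf spark.yarn.appMasterEnv.MY_APP_MODE=production \
  app.jar"
echo "$SUBMIT_CMD"
```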
@@ -272,7 +270,7 @@ If you need a reference to the proper location to put log files in the YARN so t
   <td>(none)</td>
   <td>
     A string of extra JVM options to pass to the YARN Application Master in client mode.
-    In cluster mode, use spark.driver.extraJavaOptions instead.
+    In cluster mode, use `spark.driver.extraJavaOptions` instead.
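A sketch of the client-mode vs cluster-mode distinction this description draws (the JVM flag and `app.jar` are illustrative placeholders):

```shell
# In client mode the AM is not the driver, so it has its own JVM-options
# property; in cluster mode the AM hosts the driver, so the driver property
# applies instead. The -XX flag and app.jar are placeholders.
CLIENT_CMD="./bin/spark-submit --master yarn-client \
  --conf spark.yarn.am.extraJavaOptions=-XX:+UseG1GC app.jar"
CLUSTER_CMD="./bin/spark-submit --master yarn-cluster \
  --conf spark.driver.extraJavaOptions=-XX:+UseG1GC app.jar"
echo "$CLIENT_CMD"
echo "$CLUSTER_CMD"
```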