3 files changed, +11 -7 lines changed

File 1 of 3 (the top-level README):

@@ -19,7 +19,7 @@ From the spark instance, you could reach the MongoDB instance using `mongodb` ho…

  You can find a small dataset example in `/home/ubuntu/times.json` which you can load using [initDocuments.scala](spark/files/initDocuments.scala):

  ```
- ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0 -i ./initDocuments.scala
+ ${SPARK_HOME}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_${SCALA_VERSION}:${MONGO_SPARK_VERSION} -i ./initDocuments.scala
  ```

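For reference, a hedged sketch of what a loader script along the lines of initDocuments.scala can look like; the real script ships in spark/files/ and is the authoritative version. It assumes times.json holds one JSON document per line (mongoimport-style), and that spark-shell was started with the `--conf` and `--packages` flags shown above so `sc` and the mongo-spark-connector 1.0.x classes are available:

```scala
// Hypothetical sketch, not the repo's actual initDocuments.scala.
import com.mongodb.spark.MongoSpark
import org.bson.Document

// Assumption: one JSON document per line, as mongoimport-compatible files usually are.
val docs = scala.io.Source.fromFile("/home/ubuntu/times.json")
  .getLines()
  .map(line => Document.parse(line))
  .toSeq

// Writes to the database/collection named by spark.mongodb.output.uri.
MongoSpark.save(sc.parallelize(docs))
```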
@@ -28,7 +28,7 @@ For examples, please see [reduceByKey.scala](spark/files/reduceByKey.scala) to q…

  Run the `spark shell` by executing:

  ```sh
- ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0
+ ${SPARK_HOME}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_${SCALA_VERSION}:${MONGO_SPARK_VERSION}
  ```

  You can also append `-i <file.scala>` to execute a scala file via the spark shell.
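The reduceByKey.scala shipped in spark/files/ is the authoritative example; purely as an illustration, a word-count-style aggregation over the times collection could look like the sketch below. The field name `name` is hypothetical (the schema of times.json is not shown here), and the API calls assume mongo-spark-connector 1.0.x run from spark-shell:

```scala
// Hedged sketch in the spirit of reduceByKey.scala, not a copy of it.
import com.mongodb.spark.MongoSpark

val counts = MongoSpark.load(sc)              // RDD of org.bson.Document read from spark.times
  .map(doc => (doc.getString("name"), 1))     // key on a document field; "name" is a placeholder
  .reduceByKey(_ + _)                         // count occurrences per key

counts.take(10).foreach(println)
```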
File 2 of 3 (the Dockerfile for the spark image):

@@ -13,10 +13,12 @@ ENV HOME /home/ubuntu
  ENV SPARK_VERSION 1.6.2
  ENV HADOOP_VERSION 2.6
  ENV MONGO_SPARK_VERSION 1.0.0
- ENV SCALA_VERSION 2.11
+ ENV SCALA_VERSION 2.10

  WORKDIR ${HOME}

+ ENV SPARK_HOME ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}
+
  COPY files/times.json /home/ubuntu/times.json
  COPY files/readme.txt /home/ubuntu/readme.txt
  COPY files/reduceByKey.scala /home/ubuntu/reduceByKey.scala
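Two things are worth noting in this hunk. The prebuilt Spark 1.6.2 binaries are compiled against Scala 2.10, so pinning SCALA_VERSION to 2.10 keeps the mongo-spark-connector_${SCALA_VERSION} artifact binary-compatible with the shell. And the new SPARK_HOME variable lets every later command refer to the unpacked distribution without repeating the version-specific path.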
@@ -31,5 +33,3 @@ tar xvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz

  RUN rm -fv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz

- # Run single node of spark
- RUN ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/sbin/start-master.sh
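Dropping the `RUN ... start-master.sh` line fixes a real problem: `RUN` executes while the image is being built, so a master started there lives only in a temporary build container and is gone by the time the image is actually run. Daemon startup is deferred to container runtime, which is where the new commands in files/readme.txt below put it.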
File 3 of 3 (files/readme.txt, per the COPY line above):

@@ -4,9 +4,13 @@
  mongoimport -h <mongodb ip> -d spark -c times ./times.json

  # Or you can just use initDocuments.scala to import using Spark itself
- ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0 -i ./initDocuments.scala
+ ${SPARK_HOME}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0 -i ./initDocuments.scala

  # Run spark-shell
- ${HOME}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb:27107/spark.output" --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0
+ ${SPARK_HOME}/bin/spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb:27017/spark.output" --packages org.mongodb.spark:mongo-spark-connector_${SCALA_VERSION}:${MONGO_SPARK_VERSION}

  # Or you can run scala file through the shell by specifying `-i <file.scala>`
+
+ # start 1 master/worker
+ ${SPARK_HOME}/sbin/start-master.sh
+ ${SPARK_HOME}/sbin/start-slave.sh spark://spark:7077
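Here `start-master.sh` launches a standalone master (listening on port 7077 by default) and `start-slave.sh spark://spark:7077` attaches one worker to it, assuming the container is reachable under the hostname `spark`. A shell can then be pointed at the cluster by adding `--master spark://spark:7077` to the spark-shell invocations above. A minimal sanity check from such a shell, assuming the same `--conf` and `--packages` flags, might be:

```scala
// Hedged sanity check from a spark-shell started with --master spark://spark:7077;
// assumes the mongo-spark-connector is on the classpath via --packages.
import com.mongodb.spark.MongoSpark

val times = MongoSpark.load(sc)   // reads spark.times per spark.mongodb.input.uri
println(s"spark.times holds ${times.count()} documents")
```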