When I try to run a PySpark job on a Cloud Dataproc cluster, it fails with the following error:
ImportError: No module named gcloud
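(filename.py is not pasted above; the following is a hypothetical minimal stand-in that hits the same error, assuming the driver script imports the gcloud client library, i.e. the pip package rather than the SDK's command-line tool:)

# Hypothetical stand-in for filename.py, which is not shown here.
from pyspark import SparkContext
from gcloud import storage  # this import is what fails: ImportError: No module named gcloud

sc = SparkContext()
print(sc.parallelize(range(10)).sum())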
The Google Cloud SDK is installed on all the nodes in the cluster (the master as well as the workers). Here are the version numbers:
Google Cloud SDK 111.0.0
bq 2.0.24
bq-nix 2.0.24
core 2016.05.20
core-nix 2016.05.05
gcloud
gsutil 4.19
gsutil-nix 4.19
However, when I SSH into the master node and run
$ spark-submit filename.py
directly, the same job runs perfectly fine.
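One thing worth noting: the gcloud command-line tool that ships with the Cloud SDK is separate from the gcloud Python package that pip installs, and "import gcloud" needs the latter. A small diagnostic job (a sketch, not something taken from the cluster itself) can check whether that package is importable in the Python environment the job actually uses, on both the driver and the executors:

from pyspark import SparkContext

def check_gcloud(_):
    # "import gcloud" needs the pip-installed gcloud package;
    # the gcloud CLI from the Cloud SDK does not provide it.
    try:
        import gcloud
        return "importable: " + gcloud.__file__
    except ImportError as e:
        return "missing: " + str(e)

sc = SparkContext()
print("driver: " + check_gcloud(None))
print("executors: " + str(sc.parallelize(range(2), 2).map(check_gcloud).collect()))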