This repository was archived by the owner on Feb 3, 2021. It is now read-only.

Feature: Support for multiple custom scripts and master only scripts #93

Merged
merged 25 commits into from
Oct 2, 2017

Conversation

@jafreck (Member) commented Sep 22, 2017

Fix #91
Fix #89

@timotheeguerin changed the title from "Feature/granular scripts" to "Feature: Support for multiple custom scripts and master only scripts" Sep 22, 2017
echo "This is a custom script running on just the master!"
fi

echo "This is a custom script running all workers!"
Member

all workers and the master

Contributor

typo: running on*

@@ -80,24 +80,25 @@ def docker_run_cmd(docker_repo: str = None) -> str:
def generate_cluster_start_task(
cluster_id: str,
zip_resource_file: batch_models.ResourceFile,
custom_script: str = None,
custom_scripts: list = None,
Member

List[str] I think is better
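
For reference, a minimal sketch of what that suggestion might look like (the Optional wrapper and the StartTask return annotation are my additions, not from the diff):

```python
from typing import List, Optional

import azure.batch.models as batch_models


def generate_cluster_start_task(
        cluster_id: str,
        zip_resource_file: batch_models.ResourceFile,
        custom_scripts: Optional[List[str]] = None) -> batch_models.StartTask:
    # build and return the pool's start task; body omitted in this sketch
    ...
```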

@jiata (Contributor) commented Sep 22, 2017

This is not really what I had in mind from a design perspective.

Instead of using the "IS_MASTER" environment variable and making users write logic in bash (e.g. "if master node"), I was thinking more along the lines of having the user specify which is a master-node-only script and which is an every-node script.

example:
azb spark cluster create --id my_cluster \
--master-script custom-script-1.sh \
--worker-script custom-script-2.sh

(maybe "worker-script" isn't the best name)

Intuitively I think this is a better experience, as it doesn't require users to know about the environment variable "IS_MASTER" and write logic around it in bash.

@paselem what do you think?

@jiata (Contributor) commented Sep 22, 2017

Also, should #60 be part of this feature? For many cases, a custom script on the master is not useful without the ability to open custom ports.

@timotheeguerin (Member) commented Sep 22, 2017

I think IS_MASTER gives the user more flexibility to run a custom script however they might want.

If there is something that needs to be set up on both the workers and the master, with extra setup on the master, that is easily done this way.

You could maybe have both master-script (runs only on the master) and custom-script, which runs on every node but has access to IS_MASTER for advanced configuration.
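
A rough sketch of that pattern as a single custom script (the exact truthy value of IS_MASTER is an assumption here; the excerpt above only shows the if/fi structure):

```bash
#!/bin/bash
# runs on every node in the cluster
echo "common setup for all workers and the master"

# extra, master-only setup guarded by the IS_MASTER environment variable
if [ "$IS_MASTER" = "1" ]; then  # assumed value; check the node scripts for the exact convention
    echo "master-only setup, e.g. a UI that should live on a single node"
fi
```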

@jiata (Contributor) commented Sep 22, 2017

Yeah, I like that idea.

But what are the scenarios you had in mind where you would need the flexibility of IS_MASTER?

Separating them makes sense to me just in terms of the scenarios that I can think of:

  • ADLS, WASB, S3, custom package installs (run on all nodes)
  • livy, jupyter, spark magic, r-studio-server (only on master node)

@paselem (Contributor) commented Sep 22, 2017

@jiata - I was thinking more along the lines of what you specified, except just always point to a directory. If there are files there, we would automatically run them. That is a pretty common practice for lots of other Linux-like tooling and OS settings. We could have two directories, one for master scripts and one for non-master scripts. This would save users from having to write the if condition in the script, which is really an 'us' problem.

Also, another issue that might come up is whether or not ordering matters. I'm not sure how someone would specify that and how we would guarantee it. Possibly register a collection of scripts in the config file or (as per Jacob's suggestion) just write an uber script that calls the other scripts in the order it wants.

@jiata (Contributor) commented Sep 23, 2017

[ignore this comment - this is handled in #60]

I think we should also add the ability to open custom ports (on cluster create) as part of this PR.

custom_scripts:
- script: livy_script.sh
  is_master: true
  ports: [8998]
- script: jupyter_script.sh
  is_master: true
  ports: [8888]

Something like this ^

@jiata (Contributor) left a comment

What is the cmd line experience? Is that defined at all? Or do people have to use the cluster.yaml to run complex custom scripts?

You can specify the location of custom scripts on your local machine in `.thunderbolt/cluster.yaml`. If you do not have a `.thunderbolt/` directory in your current working directory, run `azb spark init` or see [Getting Started](./00-getting-started). Note that the path can be absolute or relative to your current working directory.

The custom scripts can be configured to run on the Spark master only, the Spark workers only, or all nodes in the cluster (Please note that by default, the Spark master node is also a Spark worker). For example, the following custom-script configuration will run 3 custom scripts:
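
(The example itself is not captured in this excerpt; based on the script/location fields shown further down in the diff, it presumably looks something like the sketch below, with placeholder script names.)

```yaml
custom_scripts:
  - script: ./custom-scripts/install-packages.sh
    location: all-nodes
  - script: ./custom-scripts/setup-jupyter.sh
    location: master
  - script: ./custom-scripts/worker-tuning.sh
    location: worker
```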

Contributor

Maybe worth pointing out here that the scripts will be executed in the order that the user defines them in the cluster.yaml.

Member Author

I do that in the following paragraph, but it might be more clear to put that info here.

Contributor

Yeah, I noticed - what you have below is useful, I think. But it could be stated more explicitly, like "scripts will execute in the order that they are listed".

@jafreck (Member Author) commented Sep 27, 2017

The command line experience is broken as of the latest commit (this still needs to be fixed). I think it should probably be removed entirely and custom scripts should only be supported through cluster.yaml.

@Azure deleted a comment from msftclas Sep 27, 2017
# # optional custom scripts to run on the Spark master, Spark worker or all nodes in the cluster
# custom_script:
# - script: </path/to/script.sh or /path/to/script/directory/>
# location: <master/worker/all-nodes>
Contributor

rename 'location' to 'runOn'
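
i.e., the commented example above would presumably become:

```yaml
# custom_script:
#   - script: </path/to/script.sh or /path/to/script/directory/>
#     runOn: <master/worker/all-nodes>
```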

echo "This is a custom script running on just the master!"
fi

echo "This is a custom script running all workers and the master!"
Contributor

Maybe add that we currently always deploy a worker on the master node, so we do not support a 'worker only' install.

Contributor

Nm... just saw docs below.

location: all-nodes
```

The above configuration takes the absolute path `/custom-scripts/` and upload every file within it. These files will all be executed, although order of execution is not guaranteed. If your custom scripts have dependencies, specify the order by providing the full path to the file as seen in the first example.
Contributor

typo "upload" -> "uploads"

if custom_scripts is None:
return

os.mkdir(os.path.join(constants.ROOT_PATH, 'node_scripts', 'custom-scripts'))
Contributor

If the directory exists, do we need to clean it up before copying files over? Can files from a previous cluster create still linger?

Contributor

... Or is that what line 203 does? May be worth doing it here as a first step in case the process exits unexpectedly between lines 197 and 203, since it shouldn't harm anything and provides an extra guarantee.

@jafreck (Member Author) commented Sep 29, 2017

I think we should actually clean up both before we copy and after. That will remove lingering files from a previous create, copy over the expected files, upload them, and then delete them.

The only way someone could have lingering files, though, is if the program crashed between 197 and 203, since 203 does remove all the files.
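
A rough sketch of that clean-before-and-after flow (the helper name and dest_dir parameter are illustrative, not the actual code):

```python
import os
import shutil


def _stage_custom_scripts(custom_scripts, dest_dir):
    """Illustrative only: stage custom scripts into dest_dir, cleaning up before and after."""
    if custom_scripts is None:
        return
    # remove anything left over from a previous (possibly crashed) cluster create
    shutil.rmtree(dest_dir, ignore_errors=True)
    os.makedirs(dest_dir)
    try:
        for script in custom_scripts:
            shutil.copy(script, dest_dir)
        # ... zip and upload the staged scripts here ...
    finally:
        # clean up again so no local copies linger after the upload
        shutil.rmtree(dest_dir, ignore_errors=True)
```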

dtde/config.py Outdated
@@ -139,8 +139,8 @@ def _merge_dict(self, config):
if 'password' in config and config['password'] is not None:
self.password = config['password']

if 'custom_script' in config and config['custom_script'] is not None:
self.custom_script = config['custom_script']
if 'custom_scripts' in config and config['custom_scripts'] not in [[None], None]:
Contributor

'custom_scripts' used several times. Should be a constant.

Member Author

I don't understand what this means.
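
For context, the reviewer appears to be suggesting hoisting the repeated string literal into a module-level constant, roughly like this (the constant name is illustrative):

```python
# e.g. near the top of dtde/config.py
CUSTOM_SCRIPTS_KEY = 'custom_scripts'

# ... then inside _merge_dict:
if CUSTOM_SCRIPTS_KEY in config and config[CUSTOM_SCRIPTS_KEY] not in [[None], None]:
    self.custom_scripts = config[CUSTOM_SCRIPTS_KEY]
```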

@@ -1,7 +1,7 @@
#!/bin/bash

# This file is the entry point of the docker container.
# It will run the custom scripts if present and start spark.
# It will setup WASB and start Spark.
Contributor

worth commenting that it currently uses the same storage account as the one configured in the secrets.yaml (or conversely that the one in secrets.yaml is used for moving data around for this tool to work)

except FileNotFoundError as e:
print(e)
except IOError as e:
print(e)
Contributor

final except catch all?

Member Author

This error checking is just meant to ensure that the directory of scripts exists. However, we probably want to catch errors in _run_script() so that user script crashes are caught.
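
A sketch of that per-script error handling (function names follow the ones mentioned in this thread, so treat them as assumptions; the catch-all around each script is the point under discussion):

```python
import os


def _run_custom_scripts(custom_scripts_dir):
    try:
        scripts = sorted(os.listdir(custom_scripts_dir))
    except (FileNotFoundError, IOError) as e:
        # the directory of scripts is missing or unreadable
        print(e)
        return
    for script in scripts:
        try:
            _run_script(os.path.join(custom_scripts_dir, script))  # _run_script as referenced above
        except Exception as e:
            # catch-all so one failing user script does not stop the remaining ones
            print("custom script {0} failed: {1}".format(script, e))
```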

@jafreck jafreck merged commit 37c5d87 into Azure:master Oct 2, 2017
5 participants