Merged
Commits
120 commits
40c05b6
updated cluster tools for superEEG project
jeremymanning Nov 6, 2016
40f85e6
added some example lines for superEEG
jeremymanning Nov 7, 2016
8be1133
updated config
paxtonfitzpatrick Mar 21, 2019
9be464d
updated job submit script
paxtonfitzpatrick Mar 21, 2019
ee8a5c6
renamed, added analysis file
paxtonfitzpatrick Mar 21, 2019
783efb2
wrote cruncher script
paxtonfitzpatrick Mar 21, 2019
5bb9f39
updated output path
paxtonfitzpatrick Mar 21, 2019
3600e5f
removed extra script
paxtonfitzpatrick Mar 21, 2019
8f03d14
removed supereeg script
paxtonfitzpatrick Mar 21, 2019
39f3845
added .DS_Store to .gitignore
paxtonfitzpatrick Mar 21, 2019
0410934
renamed eventseg config
paxtonfitzpatrick Mar 21, 2019
34e76cf
changed qsub to mksub for netID accounts
paxtonfitzpatrick Mar 21, 2019
ceb171b
updated import statements
paxtonfitzpatrick Mar 21, 2019
594fb56
activate python 3.4 conda env instead of loading python 2.7
paxtonfitzpatrick Mar 24, 2019
d3adc0d
updated conda envs
paxtonfitzpatrick Mar 24, 2019
094bd3f
eventseg_collector script
paxtonfitzpatrick Mar 24, 2019
c189555
Merge pull request #3 from paxtonfitzpatrick/eventseg
paxtonfitzpatrick Mar 24, 2019
0f4682f
wrote collector script
paxtonfitzpatrick Mar 25, 2019
6064897
fixed some mistakes
paxtonfitzpatrick Mar 28, 2019
c094e57
Merge pull request #1 from ContextLab/eventseg
paxtonfitzpatrick Aug 11, 2019
dba3851
updated config
paxtonfitzpatrick Aug 13, 2019
d63abfc
updated submit file
paxtonfitzpatrick Aug 13, 2019
a5a2e74
updated main code
paxtonfitzpatrick Aug 13, 2019
62f2610
cleared collector
paxtonfitzpatrick Aug 13, 2019
c34559f
cleared collector
paxtonfitzpatrick Aug 13, 2019
0d63c20
renamed scripts, removed collector
paxtonfitzpatrick Aug 13, 2019
769ae6a
quick readme update
paxtonfitzpatrick Aug 13, 2019
c50e3c3
updated jobname
paxtonfitzpatrick Aug 13, 2019
94b2960
fixed job script in model_scripts_submit
paxtonfitzpatrick Aug 13, 2019
fa170bf
updated model scripts config options
paxtonfitzpatrick Aug 13, 2019
3c85e14
wrote cluster scripts for event segmentation
paxtonfitzpatrick Aug 13, 2019
4ab1313
updated config
paxtonfitzpatrick Aug 14, 2019
26a0cab
fixed bug in hostname for discover/ndoli
paxtonfitzpatrick Aug 14, 2019
7d6a7a1
fixed bug with bash scripts caused by spaces in filename
paxtonfitzpatrick Aug 14, 2019
1040e7b
added script to check job success
paxtonfitzpatrick Aug 14, 2019
73088b0
added an output error text file
paxtonfitzpatrick Aug 14, 2019
6a961c9
added line to load python module, updated conda command
paxtonfitzpatrick Aug 14, 2019
d8e6ad1
lumping script segmentation into 3 jobs
paxtonfitzpatrick Aug 14, 2019
73a57e6
updated eventseg scripts
paxtonfitzpatrick Aug 14, 2019
98af11f
wrote collector script
paxtonfitzpatrick Aug 14, 2019
54b240d
fixed some bugs in eventseg scripts
paxtonfitzpatrick Aug 14, 2019
eb4ec3d
wrote optimze k scripts
paxtonfitzpatrick Aug 15, 2019
2558903
Merge branch 'eventseg' into mind2019
paxtonfitzpatrick Aug 26, 2019
86a8a7d
Merge pull request #2 from paxtonfitzpatrick/mind2019
paxtonfitzpatrick Aug 26, 2019
8788e40
removed extra scripts
paxtonfitzpatrick Aug 26, 2019
3b878dd
updated config
paxtonfitzpatrick Aug 26, 2019
5ad34c7
updated submit script
paxtonfitzpatrick Aug 26, 2019
67e949f
wrote cruncher script
paxtonfitzpatrick Aug 27, 2019
39e29b5
wrote collector/plotting script
paxtonfitzpatrick Aug 27, 2019
3fa3cee
some bug fixes to scripts
paxtonfitzpatrick Aug 27, 2019
f8325c5
updated collector script
paxtonfitzpatrick Aug 27, 2019
4f9500c
added catch for error when fitting predictions to large number of events
paxtonfitzpatrick Aug 28, 2019
dea684e
added pycharm files to gitignore
paxtonfitzpatrick Oct 7, 2019
b5f161c
updated resampling scale, dealt with indexerror issue
paxtonfitzpatrick Oct 7, 2019
21e7a7d
fix indexerror for segmentation
paxtonfitzpatrick Oct 7, 2019
4bcd5e5
moved eventseg scripts to folder
paxtonfitzpatrick Oct 9, 2019
478fb42
added scripts for optimizing embedding seed
paxtonfitzpatrick Oct 9, 2019
fc81c55
added a fun ascii logo a la Discovery
paxtonfitzpatrick Dec 17, 2019
463ccd0
moved cluster scripts to their own directory
paxtonfitzpatrick Dec 17, 2019
362c9f5
generalized to all pycharm files
paxtonfitzpatrick Dec 17, 2019
0c61158
added template config file
paxtonfitzpatrick Dec 17, 2019
5548f66
ignore all files in configs dir except for template
paxtonfitzpatrick Dec 17, 2019
e22100c
added notebook dir to gitignore
paxtonfitzpatrick Dec 17, 2019
1bc4771
removed options to have scripts as templates
paxtonfitzpatrick Dec 17, 2019
b46b27b
added helpers file
paxtonfitzpatrick Dec 17, 2019
6d3fe9a
added import for helpers
paxtonfitzpatrick Jan 14, 2020
8e5c77e
added template config
paxtonfitzpatrick Jan 14, 2020
f0381a6
started config parser
paxtonfitzpatrick Jan 14, 2020
d88f821
moved bash script
paxtonfitzpatrick Jan 14, 2020
4a83f52
resolved merge conflicts
paxtonfitzpatrick Jan 14, 2020
5ee6072
resovled extra conflicts with new branch
paxtonfitzpatrick Jan 14, 2020
b104586
added requirements file
paxtonfitzpatrick Jan 14, 2020
1238e11
moved helpers, wrote template config.py
paxtonfitzpatrick Jan 14, 2020
d12403b
removed resultsdir from config
paxtonfitzpatrick Jan 14, 2020
54c60cd
reusable prompt function for command line input
paxtonfitzpatrick Jan 14, 2020
5550742
consolidated helper functions into single file
paxtonfitzpatrick Jan 14, 2020
16bd8d6
moved helpers file to top level, made hidden
paxtonfitzpatrick Jan 14, 2020
273ba0e
wrote md5_checksum
paxtonfitzpatrick Jan 14, 2020
08fa40b
wrote function to upload scripts to cluster
paxtonfitzpatrick Jan 14, 2020
b46cd01
wrote main for stand-alone upload_scripts
paxtonfitzpatrick Jan 15, 2020
7f7f7df
added fallback function to try to load config file if not provided
paxtonfitzpatrick Jan 15, 2020
1e40198
organization
paxtonfitzpatrick Jan 15, 2020
c3ac52f
added spec for conda env & name
paxtonfitzpatrick Jan 15, 2020
264f2f7
added job_config entry for name of conda env
paxtonfitzpatrick Jan 15, 2020
8f032f3
cut out code, placed in _helpers function
paxtonfitzpatrick Jan 15, 2020
3012a48
beginning of remote_submit function with docstring
paxtonfitzpatrick Jan 15, 2020
2744935
moved env_name option from main config to job_config
paxtonfitzpatrick Jan 15, 2020
bfbd2a2
moved all config opts about environment to job_config
paxtonfitzpatrick Jan 15, 2020
64b4983
min viable remote submission function
paxtonfitzpatrick Jan 15, 2020
17ec12b
helper func to format remote commands
paxtonfitzpatrick Jan 19, 2020
159b569
wrote cli for remote_submit
paxtonfitzpatrick Jan 19, 2020
aab4605
wrote template and info for submit.py
paxtonfitzpatrick Jan 19, 2020
a5f49cc
allow module and env name to be filled in from config in template script
paxtonfitzpatrick Jan 19, 2020
90ec699
narrowed exceptions, started reorganizing code
paxtonfitzpatrick Jan 19, 2020
de17f82
add options for configuring email notifications
paxtonfitzpatrick Jan 19, 2020
ccde557
WIP submit script revision
paxtonfitzpatrick Jan 19, 2020
0732813
updated config formatting, docs
paxtonfitzpatrick Jan 20, 2020
a82bb2d
re-added arg for job_command writing
paxtonfitzpatrick Jan 20, 2020
b9b8a41
helper for writing submit job script
paxtonfitzpatrick Jan 20, 2020
8ebe9b5
set default walltime
paxtonfitzpatrick Jan 20, 2020
e182397
updated to submit jobs from job script
paxtonfitzpatrick Jan 20, 2020
836a527
fixed string template formatting issue
paxtonfitzpatrick Jan 20, 2020
9bf126a
update fallback param for email address
paxtonfitzpatrick Jan 21, 2020
56bbed7
minor formatting
paxtonfitzpatrick Jan 21, 2020
7ac7cde
WIP reworking submission script:
paxtonfitzpatrick Jan 21, 2020
06a6c1d
minimum working updated submission script
paxtonfitzpatrick Jan 21, 2020
b01b6a4
removed old template script
paxtonfitzpatrick Jan 21, 2020
ea37524
removed docstring for old param
paxtonfitzpatrick Jan 21, 2020
5b42cda
WIP script for resubmitting failed jobs
paxtonfitzpatrick Jan 21, 2020
9feff3e
removed submit command config option, easier to set dynamically
paxtonfitzpatrick Jan 21, 2020
055b101
wrote get_qstat helper
paxtonfitzpatrick Jan 21, 2020
b9cbfb5
changed stdout text to differentiate job script from job name
paxtonfitzpatrick Jan 21, 2020
d39ad5b
changed default behavior of get_qstat
paxtonfitzpatrick Jan 21, 2020
4362ce0
added option to set confirmation behavior for resubmission
paxtonfitzpatrick Jan 21, 2020
ba0741b
updated parse_config
paxtonfitzpatrick Jan 21, 2020
06c12c2
finished initial script for remote resubmission of failed scripts
paxtonfitzpatrick Jan 21, 2020
8eaed1e
make config ini file
paxtonfitzpatrick Jul 16, 2020
6ae571c
update filename
paxtonfitzpatrick Jul 16, 2020
00c4157
checking in old changes (I don't know what I was doing with some of t…
paxtonfitzpatrick Oct 30, 2020
6c696a6
fixed merge conflicts from master
paxtonfitzpatrick Oct 30, 2020
10 changes: 5 additions & 5 deletions .gitignore
@@ -1,6 +1,6 @@

.idea/cluster-tools-dartmouth.iml

*.pyc

.idea/*
*.pyc
*.DS_Store
configs/
!*.gitkeep
notebooks/
34 changes: 31 additions & 3 deletions readme.txt → README.md
100644 → 100755
@@ -1,7 +1,32 @@
Cluster Tools for Dartmouth
```
____ ____ _ ____ _ _ _______ _
/ ___| _ \| | / ___| |_ _ ___| |_ ___ _ __ |__ __| ___ ___ | | ___
| | | | | | | | | | | | | / __| __/ _ \ '__| | | / _ \ / _ \| |/ __|
| |__ | |_| | |___ | |___| | |_| \__ \ || __/ | | | | (_) | (_) | |\__ \
\____|____/|_____| \____|_|\__,_|___/\__\___|_| | | \___/ \___/|_||___/

Author: Jeremy R. Manning (jeremy@dartmouth.edu)
Date: October 16, 2016
```

This toolbox contains a simple setup for deploying jobs on Dartmouth's high-performance computing clusters (Discovery, Ndoli, etc.)

To run the main analysis, use:

python supereeg_submit.py

If run on Discovery, it'll submit a batch of jobs to run in parallel. If run on a personal computer it'll run each job
in sequence.

NOTE: jobs have not been implemented yet





=======

Authors: Paxton C. Fitzpatrick and Jeremy R. Manning (jeremy@dartmouth.edu)
Created: October 16, 2016
Updated: December 16, 2019

This repository provides a set of tools for submitting jobs on Dartmouth's
Discovery and Ndoli computing clusters. With minimal modification, they may
@@ -44,6 +69,9 @@ you will need to specify the following:
+ modules: A list of modules that need to be loaded prior to executing your
job.

******** NOTE ********
Steps below are outdated

2.) create_and_submit_jobs.py. This script is what you'll run to actually create
your job scripts and submit (or run) them. You'll want to modify the code in the
indicated section to point to your job script and call any arguments you need to
211 changes: 211 additions & 0 deletions _helpers.py
@@ -0,0 +1,211 @@
import hashlib
import os
import sys
from os.path import isfile, realpath, join as opj, sep as pathsep
from string import Template
from configparser import ConfigParser


def attempt_load_config():
    """
    tries to load config file from expected path in instances where neither a
    filepath nor dict-like object is provided
    """
    splitpath = realpath(__file__).split(pathsep)
    try:
        try:
            # get path to project root directory
            splitroot = splitpath[: splitpath.index('cluster-tools-dartmouth') + 1]
            project_root = pathsep.join(splitroot)
            config_dir = opj(project_root, 'configs')
        except ValueError as e:
            # pass exceptions onto broad outer exception for function
            raise FileNotFoundError(f"cluster-tools-dartmouth not found in path\
            {realpath(__file__)}").with_traceback(e.__traceback__)

        configs = os.listdir(config_dir)
        # filter out hidden files and the template config
        configs = [f for f in configs if not (f.startswith('template')
                                              or f.startswith('.'))
                   ]
        if len(configs) == 1:
            config_path = opj(config_dir, configs[0])
            config = parse_config(config_path)
            return config
        else:
            # fail if multiple or no config files are found
            raise FileNotFoundError(f"Unable to determine which config file to \
            read from {len(configs)} choices in {config_dir}")

    except FileNotFoundError as e:
        raise FileNotFoundError("Failed to load config file from expected \
        location").with_traceback(e.__traceback__)
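
Note on the root lookup above: the same idea can be sketched with `pathlib`. This is only an illustration, not the code in this PR; `find_project_root` and the example path are hypothetical. One difference to be aware of: `splitpath.index(...)` picks the outermost directory with the matching name, while this sketch picks the nearest ancestor.

```python
from pathlib import Path


def find_project_root(start, root_name="cluster-tools-dartmouth"):
    """Return `start` or its nearest ancestor whose name is `root_name`.

    Hypothetical helper, shown only to illustrate the lookup.
    """
    start = Path(start)
    for candidate in (start, *start.parents):
        if candidate.name == root_name:
            return candidate
    raise FileNotFoundError(f"{root_name} not found in path {start}")


# Hypothetical location of a helper file inside the repository.
example = Path("/home/user/cluster-tools-dartmouth/cluster_scripts/_helpers.py")
root = find_project_root(example)
config_dir = root / "configs"
print(config_dir)
```

The lookup is purely lexical, so it runs without the directories existing on disk.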


def fmt_remote_commands(commands):
    """
    Formats a list-like iterable of shell commands to be run in the SshShell
    instance. Necessary because the underlying Python SSH client (Paramiko)
    won't preserve any state changes between commands, so we run them all at
    once.
    """
    assert hasattr(commands, "__iter__"), \
        "Commands passed to fmt_remote_commands must be an iterable (i.e., \
        list-like) object"

    executable = ['bash', '-c']
    # TODO: switch to ; sep?
    commands_str = [' && '.join(commands)]

    return executable + commands_str
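
For reference, a small sketch of what this helper produces. The function is repeated in compact form so the snippet runs on its own, and `myenv` is a made-up environment name. It also shows one edge case: a bare string passes the `__iter__` check, so it gets joined character by character.

```python
def fmt_remote_commands(commands):
    # Compact copy of the helper above so this snippet runs on its own.
    return ['bash', '-c', ' && '.join(commands)]


# 'myenv' is a hypothetical environment name.
argv = fmt_remote_commands([
    'module load python',
    'source activate myenv',
    'python cruncher.py',
])
print(argv)

# A bare string also has __iter__, so it is joined character by character.
mangled = fmt_remote_commands('ls')
print(mangled)
```

Because the commands are chained with `&&`, a failing command stops everything after it; switching to `;` (the TODO above) would run the remaining commands regardless.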


def get_qstat(remote_shell, options=None):
    """
    Return the output of running "qstat" on the cluster, optionally with a
    filter for the job's status
    :param remote_shell: (spurplus.SshShell instance)
    :param options: (str)
            options to run along with the "qstat" command. For further
            information, run "get_qstat(remote_shell, options='man')"
            locally or "man qstat" from the cluster.
    :return qstat_output: (str) output of running command on the cluster
    """
    if options is None:
        cmd = ['qstat']
    elif options == 'man':
        cmd = ['man qstat']
    elif not options.startswith('-'):
        cmd = ['qstat -' + options]
    else:
        cmd = ['qstat ' + options]

    cmds_fmt = fmt_remote_commands(cmd)
    return remote_shell.check_output(cmds_fmt)
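
A rough way to check the option handling without a cluster connection is a stub shell that records commands instead of running them. `FakeShell` and `qstat_command` are hypothetical names for this sketch; the branching is copied from `get_qstat`, and `netid` is a placeholder username.

```python
class FakeShell:
    """Hypothetical stand-in for spurplus.SshShell: records, never executes."""

    def __init__(self):
        self.calls = []

    def check_output(self, command):
        self.calls.append(command)
        return ""


def qstat_command(options=None):
    # Mirrors the branching in get_qstat above.
    if options is None:
        return 'qstat'
    if options == 'man':
        return 'man qstat'
    if not options.startswith('-'):
        return 'qstat -' + options
    return 'qstat ' + options


shell = FakeShell()
for opts in (None, 'man', 'u netid', '-q'):
    shell.check_output(['bash', '-c', qstat_command(opts)])

# The last element of each recorded call is the string handed to bash -c.
issued = [call[-1] for call in shell.calls]
print(issued)
```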





def md5_checksum(filepath):
    """
    computes the MD5 checksum of a local file to compare against remote

    NOTE: MD5 IS CONSIDERED CRYPTOGRAPHICALLY INSECURE
    (see https://en.wikipedia.org/wiki/MD5#Security)
    However, it's still very much suitable in cases (like ours) where one
    wouldn't expect **intentional** data corruption
    """
    hash_md5 = hashlib.md5()
    with open(filepath, 'rb') as f:
        # avoid having to read the whole file into memory at once
        for chunk in iter(lambda: f.read(4096), b''):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()
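
A quick sanity check of the chunked read, as a sketch: hashing a temporary file in 4096-byte pieces should give the same digest as hashing the bytes in one call. The `chunk_size` parameter is added here for illustration only and is not part of the helper above.

```python
import hashlib
import os
import tempfile


def md5_checksum(filepath, chunk_size=4096):
    # Same chunked approach as the helper above; chunk_size is illustrative.
    hash_md5 = hashlib.md5()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


payload = b'x' * 10_000  # spans several 4096-byte chunks
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    chunked = md5_checksum(tmp_path)
    one_shot = hashlib.md5(payload).hexdigest()
finally:
    os.remove(tmp_path)

print(chunked == one_shot)
```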


def parse_config(config_path):
    """
    parses various user-specific options from config file in configs dir
    """
    config_path = realpath(config_path)
    if not isfile(config_path):
        raise FileNotFoundError(f'Invalid path to config file: {config_path}')

    raw_config = ConfigParser(inline_comment_prefixes='#')
    with open(config_path, 'r') as f:
        raw_config.read_file(f)

    config = dict(raw_config['CONFIG'])
    config['confirm_overwrite_on_upload'] = raw_config.getboolean(
        'CONFIG', 'confirm_overwrite_on_upload'
    )
    config['confirm_resubmission'] = raw_config.getboolean(
        'CONFIG', 'confirm_resubmission'
    )
    return config
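
The two `getboolean` calls matter because `ConfigParser` stores every value as a string, so a raw `'False'` would be truthy. A minimal sketch with an in-memory config; the `username` key and its value are made up for the example, and only the two `confirm_*` keys come from the code above.

```python
from configparser import ConfigParser

sample = """
[CONFIG]
username = netid123            # hypothetical key and value
confirm_overwrite_on_upload = yes
confirm_resubmission = False   # inline comment is stripped
"""

raw = ConfigParser(inline_comment_prefixes='#')
raw.read_string(sample)

config = dict(raw['CONFIG'])
raw_value = config['confirm_resubmission']  # still the string 'False'

config['confirm_overwrite_on_upload'] = raw.getboolean(
    'CONFIG', 'confirm_overwrite_on_upload'
)
config['confirm_resubmission'] = raw.getboolean(
    'CONFIG', 'confirm_resubmission'
)
print(raw_value, config)
```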


def prompt_input(question, default=None):
    """
    given a question, prompts user for command line input
    returns True for 'yes'/'y' and False for 'no'/'n' responses
    """
    assert default in ('yes', 'no', None), \
        "Default response must be either 'yes', 'no', or None"

    valid_responses = {
        'yes': True,
        'y': True,
        'no': False,
        'n': False
    }

    if default is None:
        prompt = "[y/n]"
    elif default == 'yes':
        prompt = "[Y/n]"
    else:
        prompt = "[y/N]"

    while True:
        sys.stdout.write(f"{question}\n{prompt}")
        response = input().lower()
        # if user hits return without typing, return default response
        if (default is not None) and (not response):
            return valid_responses[default]
        elif response in valid_responses:
            return valid_responses[response]
        else:
            sys.stdout.write("Please respond with either 'yes' (or 'y') \
or 'no' (or 'n')\n")
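
One possible way to make the response handling testable without typing at a terminal is to separate the parsing from the `input()` loop. The sketch below is an assumption about how that could look, not part of this PR; `interpret_response` is a hypothetical name, and the `.strip()` call is an addition that the function above does not have.

```python
VALID_RESPONSES = {'yes': True, 'y': True, 'no': False, 'n': False}


def interpret_response(response, default=None):
    """Return True/False for a recognized answer, or None to re-prompt.

    Hypothetical helper; unlike prompt_input, it also strips whitespace.
    """
    response = response.strip().lower()
    if not response and default is not None:
        return VALID_RESPONSES[default]
    return VALID_RESPONSES.get(response)


print(interpret_response('Y'))                # recognized answer
print(interpret_response('', default='no'))   # empty input falls back to default
print(interpret_response('maybe'))            # unrecognized, caller should re-prompt
```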


def write_remote_submitter(remote_shell, job_config, env_activate_cmd, env_deactivate_cmd, submitter_walltime='12:00:00'):
    remote_dir = job_config['workingdir']
    # TODO: ability to handle custom-named submission script
    submitter_fpath = opj(remote_dir, 'submit_jobs.sh')

    try:
        assert remote_shell.is_dir(remote_dir)
    except AssertionError as e:
        raise ValueError(
            f"Can't create job submission script in dir: {remote_dir}. \
            Intended directory is an existing file."
        ).with_traceback(e.__traceback__)
    except FileNotFoundError as e:
        raise FileNotFoundError(
            f"Can't create job submission script in dir: {remote_dir}. \
            Intended directory does not exist."
        ).with_traceback(e.__traceback__)

    template_vals = {
        'jobname': job_config['jobname'],
        'walltime': submitter_walltime,
        'modules': job_config['modules'],
        'activate_cmd': env_activate_cmd,
        'deactivate_cmd': env_deactivate_cmd,
        'env_name': job_config['env_name'],
        'cmd_wrapper': job_config['cmd_wrapper'],
        'submitter_script': submitter_fpath
    }

    template = Template(
        """#!/bin/bash -l

#PBS -N ${jobname}-submitter
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=${walltime}
#PBS -m bea

module load $modules
$activate_cmd $env_name

$cmd_wrapper $submitter_script

$deactivate_cmd"""
    )

    content = template.substitute(template_vals)
    remote_shell.write_text(submitter_fpath, content)
    return submitter_fpath
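
A short sketch of the `string.Template` behavior this function relies on. All values (`eventseg`, `source activate`, `myenv`) are placeholders. The braces in `${jobname}-submitter` mark where the placeholder name ends, and `substitute()` raises `KeyError` if any placeholder is missing from the mapping.

```python
from string import Template

# Placeholder values; none of these come from a real config.
values = {
    'jobname': 'eventseg',
    'modules': 'python',
    'activate_cmd': 'source activate',
    'env_name': 'myenv',
}

template = Template(
    "#PBS -N ${jobname}-submitter\n"
    "module load $modules\n"
    "$activate_cmd $env_name"
)
rendered = template.substitute(values)
print(rendered)

# substitute() fails loudly on a missing key; safe_substitute() would
# leave the placeholder text in place instead.
try:
    Template("$cmd_wrapper $submitter_script").substitute({'cmd_wrapper': 'python'})
    missing = None
except KeyError as err:
    missing = err.args[0]
print(missing)
```

Values are converted with `str()`, so if `modules` were ever a Python list rather than a string, it would probably need to be joined with spaces before substitution.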
3 changes: 3 additions & 0 deletions cluster_scripts/collector.py
@@ -0,0 +1,3 @@
#!/usr/bin/python

from .config import job_config
28 changes: 28 additions & 0 deletions cluster_scripts/config.ini
@@ -0,0 +1,28 @@
########################################
# JOB SUBMISSION & RUNTIME OPTIONS #
########################################
# See the README for a guide to setting these values

[Paths]
project_root = /dartfs/rc/lab/D/DBIC/CDL/<YOUR_USERNAME>/<THIS_PROJECT_NAME>
data_dir = data
script_dir = scripts

[Job Environment]
modules = python
env_type = conda
env_name =
cmd_wrapper = python

[Job Runtime]
jobname =
queue = largeq
n_nodes = 1
ppn = 1
wall_time = 1:00:00

[Job Notifications]
event_keys =
email_address =

[Extras]
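
As a sketch of how these sections might be consumed: section names containing spaces work as ordinary keys, and the typed getters (`getint`) convert the string values. The sample below copies a few keys from the template above with a placeholder `project_root`; the two `#PBS -l` lines mirror the directive style used in `_helpers.py` and are not necessarily how the final submit script is built.

```python
from configparser import ConfigParser

sample_ini = """
[Paths]
project_root = /path/to/project
data_dir = data

[Job Runtime]
queue = largeq
n_nodes = 1
ppn = 1
wall_time = 1:00:00
"""

parser = ConfigParser()
parser.read_string(sample_ini)

runtime = parser['Job Runtime']
# getint() converts the stored strings; wall_time stays a string.
nodes_line = f"#PBS -l nodes={runtime.getint('n_nodes')}:ppn={runtime.getint('ppn')}"
walltime_line = f"#PBS -l walltime={runtime['wall_time']}"
print(nodes_line)
print(walltime_line)
```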
23 changes: 23 additions & 0 deletions cluster_scripts/config.py
@@ -0,0 +1,23 @@
########################################
# DO NOT MODIFY THIS FILE #
########################################
# This file exists to make various options from config.ini available to
# cruncher scripts at runtime. All options should be set there rather
# than in this file.

from configparser import ConfigParser
from pathlib import Path


config_path = Path(__file__).resolve().parent.joinpath('config.ini')

job_config = ConfigParser()
with config_path.open() as f:
    job_config.read_file(f)



# job_config['datadir'] = opj(job_config['startir'], 'data')
# job_config['workingdir'] = opj(job_config['startir'], 'scripts')
# job_config['scriptdir'] = opj(job_config['workingdir'], 'scripts')
# job_config['lockdir'] = opj(job_config['workingdir'], 'locks')
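
The commented-out lines above build derived paths. Since `ConfigParser` only stores strings inside sections, one option is to keep derived paths in separate `pathlib` variables. This is only a sketch under that assumption: the `[Paths]` keys come from `config.ini`, `project_root` is a placeholder, and placing `locks` under the script directory is a guess based on the commented-out code.

```python
from configparser import ConfigParser
from pathlib import Path

parser = ConfigParser()
parser.read_string(
    "[Paths]\n"
    "project_root = /path/to/project\n"
    "data_dir = data\n"
    "script_dir = scripts\n"
)

paths = parser['Paths']
project_root = Path(paths['project_root'])
data_dir = project_root / paths['data_dir']
script_dir = project_root / paths['script_dir']
lock_dir = script_dir / 'locks'  # location assumed from the commented-out lines
print(data_dir, script_dir, lock_dir)
```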
3 changes: 3 additions & 0 deletions cluster_scripts/cruncher.py
@@ -0,0 +1,3 @@
#!/usr/bin/python

from .config import job_config