ResourceMultiProc plugin and runtime profiler #1372

Merged
merged 94 commits into from
May 12, 2016

pintohutch
Contributor

  1. @carolFrohlich changed the MultiProc plugin to be the ResourceMultiProc plugin. This plugin enables multithreading during workflow execution and also ensures that the workflow will not use more system resources than the user specifies with the plugin arguments 'n_procs' and 'memory' at the workflow level. Node-level resource usage can be specified via the node._interface.num_threads and node._interface.estimated_memory parameters (defaults are 1 thread and 1 GB, respectively).
  2. @carolFrohlich added a callback logger that records estimated node resources and runtimes to a log file in JSON format.
  3. @carolFrohlich added a draw_gantt_chart module that creates an HTML Gantt chart of the resources used by each node over the course of workflow execution.
  4. @dclark87 added a runtime profiler using the psutil and memory_profiler Python packages. The profiler is enabled when those packages are installed and disabled otherwise. It records the actual number of threads and GB of RAM used by a node (as long as the node has a nipype interface; it will not work with util.function nodes). These values are stored as 'runtime_threads' and 'runtime_memory' in both the runtime Bunch object and the result dictionary returned from running the node interface. Because of this, they are also written to the provenance file if that is enabled. The new callback logger looks for these values directly and logs them as well.
  5. All new features include additional unit tests added to the test folders within each sub-package.
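To illustrate the resource-limiting idea in item 1, here is a minimal, self-contained sketch. It is not the ResourceMultiProc implementation and does not use nipype: `NodeRequest`, `schedule`, and the node names are hypothetical, and the sketch ignores node dependencies entirely. It only shows how per-node thread and memory estimates could be checked against workflow-level `n_procs` and `memory` limits before nodes are allowed to run together.

```python
from dataclasses import dataclass


@dataclass
class NodeRequest:
    """Hypothetical stand-in for a node's declared resource needs."""
    name: str
    num_threads: int = 1            # default of 1 thread, as in the description
    estimated_memory: float = 1.0   # default of 1 GB, as in the description


def schedule(nodes, n_procs, memory):
    """Greedily group nodes into batches that fit within the limits.

    Each returned batch is a list of node names whose combined thread count
    and estimated memory (GB) do not exceed n_procs and memory.
    """
    pending = list(nodes)
    batches = []
    while pending:
        free_threads, free_memory = n_procs, memory
        batch, deferred = [], []
        for node in pending:
            # A node larger than the whole budget could never be scheduled.
            if node.num_threads > n_procs or node.estimated_memory > memory:
                raise ValueError(f"{node.name} can never fit within the limits")
            if (node.num_threads <= free_threads
                    and node.estimated_memory <= free_memory):
                batch.append(node.name)
                free_threads -= node.num_threads
                free_memory -= node.estimated_memory
            else:
                deferred.append(node)  # wait for a later batch
        batches.append(batch)
        pending = deferred
    return batches


# Illustrative node names and estimates (assumptions, not from the PR).
nodes = [
    NodeRequest("skullstrip", num_threads=2, estimated_memory=3.0),
    NodeRequest("register", num_threads=4, estimated_memory=6.0),
    NodeRequest("smooth"),
    NodeRequest("segment", num_threads=2, estimated_memory=4.0),
]
batches = schedule(nodes, n_procs=4, memory=8.0)
print(batches)
```

With a budget of 4 threads and 8 GB, the 4-thread node cannot share a batch with anything else, so it is deferred until the smaller nodes ahead of it have been placed.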

Thanks to @ccraddock for design and debugging help.

pintohutch and others added 30 commits January 13, 2016 14:25
Re-basing code with nipype master branch
…e loop in memory_profiler was executing node twice when it didnt finish running the first time
estimated_memory_gb = 1.0
try:
runtime_threads = float(node['runtime_threads'])
except:
Member

KeyError class?

@chrisgorgo
Member

OK, I finished reading the code. Overall this is a great contribution. Apart from the things I mentioned above, the resource_sched_profiler should be listed in https://raw.githubusercontent.com/nipy/nipype/master/doc/users/index.rst

Sorry for picking on the try-catch-all. Being specific about exception classes will pay off in the future. Silently dealing with all potential exceptions makes debugging incredibly hard (believe me, I learned this the hard way!).
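As a sketch of what catching a specific exception class could look like for the log-reading snippet under review: the helper name `read_runtime_threads`, the `default` fallback value, and the record fields other than 'runtime_threads' are assumptions for illustration, not code from this PR.

```python
import json


def read_runtime_threads(log_line, default=1.0):
    """Return 'runtime_threads' from one JSON log record, or default if absent."""
    node = json.loads(log_line)
    try:
        return float(node['runtime_threads'])
    except KeyError:
        # Expected case: the profiler was disabled, so the key was never logged.
        return default


# Hypothetical log records; only 'runtime_threads' comes from the PR text.
profiled = read_runtime_threads('{"id": "node_a", "runtime_threads": 2}')
unprofiled = read_runtime_threads('{"id": "node_b"}')
print(profiled, unprofiled)
```

Only the missing-key case is handled. A malformed value such as `"abc"` still raises `ValueError` instead of being silently replaced by the default, so a corrupted log surfaces immediately.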

@satra
Member

satra commented Apr 29, 2016

@dclark87 - one question, is there any reason this information is being kept separate from the provenance record that nipype produces?

@pintohutch
Contributor Author

pintohutch commented Apr 29, 2016

@satra - there is no reason this information is not in the provenance record. We currently store the runtime profiling info in the node's result.runtime as well as the callback logger, but it doesn't hurt to have it in more places. Where is the provenance code?

@coveralls

coveralls commented May 9, 2016

Coverage Status

Coverage decreased (-0.6%) to 71.842% when pulling 46f3275 on FCP-INDI:resource_multiproc into aeecd2f on nipy:master.

@coveralls

coveralls commented May 9, 2016

Coverage Status

Coverage decreased (-0.6%) to 71.842% when pulling 41c6928 on FCP-INDI:resource_multiproc into aeecd2f on nipy:master.

@coveralls

coveralls commented May 9, 2016

Coverage Status

Coverage decreased (-0.6%) to 71.852% when pulling f0a3889 on FCP-INDI:resource_multiproc into aeecd2f on nipy:master.

@coveralls

coveralls commented May 9, 2016

Coverage Status

Coverage decreased (-0.5%) to 72.006% when pulling b566b22 on FCP-INDI:resource_multiproc into aeecd2f on nipy:master.

@coveralls

coveralls commented May 9, 2016

Coverage Status

Coverage decreased (-0.4%) to 72.036% when pulling e7eac16 on FCP-INDI:resource_multiproc into aeecd2f on nipy:master.

generate_gantt_chart('/home/user/run_stats.log', cores=8)
# ...creates gantt chart in '/home/user/run_stats.log.html'

The `generate_gantt_chart`` function will create an html file that can be viewed
Member

Missing ` ?

@chrisgorgo
Member

LGTM!

@satra satra merged commit 9b0442a into nipy:master May 12, 2016

5 participants