ENH: Attempt to use hard links for data sink. #1161

Merged
merged 1 commit into from
Aug 5, 2015
Conversation

hjmjohnson
Contributor

In many cases on Unix, the data sink is on the same
filesystem (often the same physical drive) as the
internal nipype cache. In that case, we can use a
hard link to save both the time needed to duplicate
the data and the space needed to hold the same data
at two different inodes. Because a hard link is just
another name for the same inode, the cache directory
can be removed without modifying the results
directory.

In large analyses, this optimization can save
several terabytes of storage.
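
For readers skimming the thread, the idea can be sketched roughly as below. This is an illustration only, not the code in this PR: the helper name `link_or_copy` and the file names are hypothetical, and the choice to replace an existing destination first is an assumption made to keep the sketch simple.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hard-link src to dst if possible; otherwise fall back to a copy.

    Hypothetical helper for illustration. Returns "link" or "copy" to
    report which path was taken.
    """
    # Assumption: an existing destination is simply replaced.
    if os.path.lexists(dst):
        os.unlink(dst)
    try:
        # Fails with OSError when src and dst are on different filesystems
        # or when the filesystem does not support hard links.
        os.link(src, dst)
        return "link"
    except (OSError, AttributeError, NotImplementedError):
        shutil.copyfile(src, dst)
        return "copy"


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    cache_dir = os.path.join(base, "cache")
    sink_dir = os.path.join(base, "sink")
    os.makedirs(cache_dir)
    os.makedirs(sink_dir)

    cached = os.path.join(cache_dir, "result.txt")
    with open(cached, "w") as f:
        f.write("output data")

    sunk = os.path.join(sink_dir, "result.txt")
    print("method used:", link_or_copy(cached, sunk))

    # Removing the cache does not affect the sink copy: with a hard link,
    # the data stays on disk as long as one name still refers to the inode.
    shutil.rmtree(cache_dir)
    with open(sunk) as f:
        print("sink still readable:", f.read())

    shutil.rmtree(base)
```

Falling back to a plain copy keeps the behavior unchanged when the cache and the sink live on different filesystems, where a hard link cannot be created.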

@@ -32,6 +32,8 @@
from nipype.utils.misc import human_order_sorted
from nipype.external import six

from nipype.utils.config import NipypeConfig

Member

Use a relative import here:

from ... import config

and use config directly. There is no need to instantiate NipypeConfig.

Contributor Author

done

@mwaskom
Member

mwaskom commented Aug 4, 2015

Nice!

@hjmjohnson
Contributor Author

@mwaskom @satra

I tested this in a real pipeline analysis last night, and it seems to be working as expected. (i.e. saving a TON of disk space :)

I think this is ready to merge.

@satra
Member

satra commented Aug 5, 2015

@hjmjohnson - could we please add this to the CHANGES file?

@satra
Member

satra commented Aug 5, 2015

also add config option to config.rst

@hjmjohnson
Contributor Author

@satra Latest requested changes to config_file.rst and CHANGELOG committed.

satra added a commit that referenced this pull request Aug 5, 2015
ENH: Attempt to use hard links for data sink.
@satra satra merged commit 1c27198 into nipy:master Aug 5, 2015
@hjmjohnson hjmjohnson deleted the TryHardLinks branch January 19, 2016 14:49