Skip to content

Conversation

Copy link

Copilot AI commented Oct 1, 2025

Description

This PR addresses issue #[issue_number] by modifying run_quick.py to use a temporary directory for the workflow output, improving efficiency when using network drives for storage.

Changes

Previously, run_quick.py already used a temporary directory for creating the input BIDS dataset, but wrote workflow output directly to the final output directory. This could cause significant performance degradation when the output directory is on network storage (e.g., NFS, CIFS), as all intermediate files and workflow I/O would traverse the network.

Now, the workflow writes to a temporary output directory (a subdirectory within the same temp_dir used for input), and only the final subject results (hippunfold/sub-{subject}/) are copied to the final output location after successful completion.

Key Implementation Details

  1. Temporary output directory: Created as temp_dir/output/ alongside the temporary input BIDS dataset
  2. Workflow execution: Runs entirely in the temporary location (e.g., local disk)
  3. Result copy: After successful completion, copies only hippunfold/sub-{subject}/ to the final output directory
  4. Error handling: On workflow failure, no copy is performed, leaving the final output unchanged
  5. Overwrite behavior: If the subject directory already exists in the final output, it is replaced

Example

hippunfold-quick \
  --input /data/subject.nii.gz \
  --output /network/storage/results \
  --subject 001 \
  --modality T1w

Workflow execution (local disk):

/tmp/tmpXXXXX/
├── anat/sub-001/              # Input BIDS
│   └── sub-001_T1w.nii.gz
└── output/                    # All workflow I/O happens here
    ├── hippunfold/sub-001/    # Final results
    ├── work/                  # Intermediate files (not copied)
    ├── logs/                  # Logs (not copied)
    └── .snakemake/            # Metadata (not copied)

Final output (network storage):

/network/storage/results/
└── hippunfold/
    └── sub-001/               # Only this gets copied back
        ├── anat/
        ├── surf/
        ├── coords/
        └── qc/

Benefits

  • Performance: All workflow I/O happens on local disk, avoiding network overhead
  • Efficiency: Only final results are copied to network storage, not intermediate files (work/, logs/, .snakemake/)
  • Safety: Atomic updates - only successful runs update the final output
  • Compatibility: Fully backward compatible, no changes to CLI arguments or behavior

Testing

  • Code formatted with black and isort
  • Syntax validation passed
  • Manual testing of copy logic verified
  • No breaking changes to existing functionality
Original prompt

This section details on the original issue you should resolve

<issue_title>Use tempdir for writing output dir in run_quick.py</issue_title>
<issue_description>Using a tempdir (eg local disk) for writing to the output dir can improve efficiency when using network drives for storage otherwise. The run_quick.py console script already uses a tempdir to create the input bids dataset, we should also use a sub-directory in that folder as the hippunfold output dir, so the workflow gets run from there. Would need to then add a command to copy back from there to the final output dir (copying only the sub-{subject} dir) after successful completion.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #518

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

- Created temporary output directory within the same temp_dir used for input
- Modified workflow to write to temp output instead of final output
- Added logic to copy hippunfold/sub-{subject} directory to final output after successful completion
- Only copies results on successful completion (no copy on error)
- Handles overwrite case by removing existing subject directory before copy

Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
Copilot AI changed the title [WIP] Use tempdir for writing output dir in run_quick.py Use tempdir for output in run_quick.py to improve efficiency with network storage Oct 1, 2025
Copilot AI requested a review from akhanf October 1, 2025 14:34
Copy link
Collaborator

@jordandekraker jordandekraker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is using the hippunfold_v1.X output structure, i.e.
outdir/hippunfold/sub-01
, instead of version 2
outdir/sub-01

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants