Skip to content

[ENH] Add MRIsCombine to FreeSurfer utils. #1948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Apr 19, 2017
Merged

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Apr 9, 2017

MRIsCombine uses mris_convert to combine two surface files.

@codecov-io
Copy link

codecov-io commented Apr 9, 2017

Codecov Report

Merging #1948 into master will increase coverage by <.01%.
The diff coverage is 76.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1948      +/-   ##
=========================================
+ Coverage    72.5%   72.5%   +<.01%     
=========================================
  Files        1063    1064       +1     
  Lines       54189   54219      +30     
  Branches     7820    7825       +5     
=========================================
+ Hits        39288   39311      +23     
- Misses      13682   13688       +6     
- Partials     1219    1220       +1
Flag Coverage Δ
#smoketests 72.5% <76.66%> (ø) ⬆️
#unittests 70.04% <76.66%> (ø) ⬆️
Impacted Files Coverage Δ
nipype/interfaces/freesurfer/__init__.py 100% <ø> (ø) ⬆️
nipype/interfaces/freesurfer/utils.py 62.39% <68.75%> (+0.07%) ⬆️
...terfaces/freesurfer/tests/test_auto_MRIsCombine.py 85.71% <85.71%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c15df99...36fd9c1. Read the comment docs.

@codecov-io
Copy link

codecov-io commented Apr 9, 2017

Codecov Report

Merging #1948 into master will decrease coverage by 0.01%.
The diff coverage is 61.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1948      +/-   ##
==========================================
- Coverage    72.5%   72.49%   -0.02%     
==========================================
  Files        1063     1065       +2     
  Lines       54209    54268      +59     
  Branches     7825     7836      +11     
==========================================
+ Hits        39305    39341      +36     
- Misses      13682    13704      +22     
- Partials     1222     1223       +1
Flag Coverage Δ
#smoketests 72.49% <61.66%> (-0.02%) ⬇️
#unittests 70.04% <61.66%> (-0.01%) ⬇️
Impacted Files Coverage Δ
nipype/interfaces/freesurfer/__init__.py 100% <ø> (ø) ⬆️
...ces/freesurfer/tests/test_auto_FSSurfaceCommand.py 100% <100%> (ø)
nipype/interfaces/freesurfer/base.py 66.66% <28.57%> (-5.34%) ⬇️
nipype/interfaces/freesurfer/utils.py 62.13% <54.16%> (-0.18%) ⬇️
...terfaces/freesurfer/tests/test_auto_MRIsCombine.py 85.71% <85.71%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 482ac61...7c3125d. Read the comment docs.

desc='File to be combined with in_file2')
in_file2 = File(exists=True, mandatory=True, position=2,
argstr='%s',
desc='File to be combined with in_file1')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you specifically want in_file1 and in_file2 ports? This could be made a list like so:

in_files = List(File(Exists=True), maxlen=2, minlen=2,
                mandatory=True, position=1, argstr='--combinesurfs %s', desc="...")

(I think. I'd check this syntax.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize traits.List allowed for maxlen and minlen. I definitely prefer it your way. Just tested it and it works, so I'll commit it in a second.

if any(self.inputs.out_file.startswith(pre) for pre in ['lh.', 'rh.']):
outputs['out_file'] = self.inputs.out_file
else:
outputs['out_file'] = 'lh.' + self.inputs.out_file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For your doctest example, you're going to get lh.out.stl. Is that what mris_convert produces? And if so, is that the same whether you pass lh.pial or rh.pial as in_file1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I've seen:

  • mris_convert --combinesurfs lh.pial rh.pial out.stl creates lh.out.stl
  • mris_convert --combinesurfs rh.pial lh.pial out.stl creates lh.out.stl
  • mris_convert --combinesurfs rh.pial rh.smoothwm out.stl creates lh.out.stl
  • mris_convert --combinesurfs lh.pial rh.pial rh.out.stl creates rh.out.stl

So it doesn't seem to matter what order you pass in the files or even if both files are from the right hemisphere. As long as the output file doesn't start with lh. or rh., it will prepend lh. to the output filename.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha. That's awful. Can you add a short comment explaining that this is a FreeSurfer behavior you're accommodating?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an explanation to the docstring for _list_outputs in a7bafda. Should be clear now.

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable. Will merge after tests pass, unless there are any objections.

@effigies
Copy link
Member

Actually, I'm going to be obnoxious. That lh.* behavior is so wildly unexpected, it should be in the docstring for the entire interface so it shows up in the generated documentation.

Also, can you see if out_file="./bh.pial" gives different behavior? From my experience with FreeSurfer, I suspect they'll interpret a name with path information in it literally.

@tsalo
Copy link
Member Author

tsalo commented Apr 10, 2017

Putting in the path information worked. It wrote out the file given, with no changes. MRIMarchingCubes deals with this by using ./%s as the out_file argstr, but that seems to cause problems if you put in an out_file with an absolute path (like /home/tsalo/out.stl, because then it tries to combine them as .//home/tsalo/out.stl and then it fails.

Rather than putting the path in the argstr, is there a way to reformat it in something like _list_outputs, but for inputs? Is that a thing?

Whatever the fix is, it'll probably need to be implemented in other FreeSurfer functions- especially MRIMarchingCubes.

@effigies
Copy link
Member

Yeah, I wouldn't do it in the argstr, and I have no doubt that lots of these interfaces need clearing up. Doing something in _list_outputs is a little late, to my mind, especially if mris_convert likes, as other functions do, to put outputs in the same directory as the inputs, not in the current working directory. (If it outputs to the current working directory, you can pick out a better outfile name, rename and return the better name.)

MRIsExpand does a bit of hackery in _get_filecopy_info, which is only run when in a Node context, giving the advantage the the command is run as advertised outside a Node, and runs properly isolated inside. You could do something similar:

def _get_filecopy_info(self):
    super(MRIsCombine, self)._get_filecopy_info()
    if isdefined(self.inputs.out_file):
        self.inputs.out_file = os.path.abspath(self.inputs.out_file)

It's not elegant, but it is the main pre-execution hook that doesn't also get called when running an interface directly.

We should probably decide on a single hack, and apply it to all the interfaces, or better yet, update the FreeSurferCommand interface to automatically (possibly triggered by a trait attribute) accommodate these naming conventions when run in a Node.

@tsalo
Copy link
Member Author

tsalo commented Apr 11, 2017

I don't know how pervasive the renaming behavior is, but I think it probably only applies to functions that work on surface files. What about a new class inheriting from FSCommand for surface-related interfaces. Something like FSSurfaceCommand. I can add your version of _get_filecopy_info to that and have the interfaces we currently know are affected (MRIsCombine, MRIsConvert, and MRIMarchingCubes) inherit from that instead of FSCommand.

@tsalo
Copy link
Member Author

tsalo commented Apr 13, 2017

I took a look at the different ways different interfaces are handling the same issue.

  • Some (e.g., VolumeMask, Aparc2Aseg, Contrast) output everything to the subject directory using a function called copy_to_subjdir. I'm not sure it's best to force a specific output directory.
  • MRIsExpand checks the beginning of out_file. If the file doesn't have either prefix (lh. or rh.), it takes the prefix from in_file and uses that. It's sort of similar to what I did.
  • Several others (e.g., Apas2Aseg, RelabelHypointensities, MRIsCalc, Jacobian) just set the outputs by applying abspath to the inputs in _list_outputs. I tested out Jacobian, and I don't think this works. It'll run, but shouldn't work in a Node context or if you're trying to use outputs.out_file at all. See below for my process.
  • A few (e.g., SmoothTessellation, MRIMarchingCubes) do something similar to the simple _list_outputs approach, but have _gen_filename and _gen_outfilename to edit the filenames a bit more. If the simple _list_outputs way doesn't work, I don't think these can either. I might be missing something though.

To check the _list_outputs method, I just ran Jacobian:

import nipype.interfaces.freesurfer as fs
jacobian = fs.Jacobian()
jacobian.inputs.in_origsurf = 'lh.pial'
jacobian.inputs.in_mappedsurf = 'lh.pial'
jacobian.inputs.out_file = 'outfile.pial'
res = jacobian.run()

Checking the inputs and outputs:

jacobian.cmdline
'mris_jacobian lh.pial lh.pial outfile.pial'

res.inputs['out_file']
'outfile.pial'

res.outputs['out_file']
'/home/data/nbc/sub-01/surf/outfile.pial'

But the file that is actually created is still lh.outfile.pial.

@effigies
Copy link
Member

Cool. Thanks for rounding up these cases.

Some (e.g., VolumeMask, Aparc2Aseg, Contrast) output everything to the subject directory using a function called copy_to_subjdir. I'm not sure it's best to force a specific output directory.

I believe these were added as part of the reconall_workflow effort. And it actually seems to do something similar to MRIsExpand, which is, in a node context, copy a bunch of implicit dependencies to the working directory. (It can also be used to copy everything to a subject directory, which is probably how it's used in the workflow, which tries to replicate the recon-all script in detail.)

I agree that this probably isn't an appropriate solution to interfaces that aren't used in recon-all or other pipeline scripts.

MRIsExpand checks the beginning of out_file. If the file doesn't have either prefix (lh. or rh.), it takes the prefix from in_file and uses that. It's sort of similar to what I did.

Yes, it's basically what you did, except the naming behavior of mris_expand isn't super surprising. It's really just partially replicating MRISbuildFileName(). There may be naming edge cases that aren't being captured, since we're working from the filename and not its contents.

Several others (e.g., Apas2Aseg, RelabelHypointensities, MRIsCalc, Jacobian) just set the outputs by applying abspath to the inputs in _list_outputs. I tested out Jacobian, and I don't think this works. It'll run, but shouldn't work in a Node context or if you're trying to use outputs.out_file at all. See below for my process.

Yeah, this is definitely a bug. It works under the assumption that nobody actually changes out_file. I suspect this one could use the MRIsExpand._associated_file pretty straightforwardly.

A few (e.g., SmoothTessellation, MRIMarchingCubes) do something similar to the simple _list_outputs approach, but have _gen_filename and _gen_outfilename to edit the filenames a bit more. If the simple _list_outputs way doesn't work, I don't think these can either. I might be missing something though.

These happen before it runs, so I think it should actually be fine, especially if they're generating absolute paths (which I'm pretty sure they should). But again, this correct behavior is predicated on users not specifying out_file manually.


I like the FSSurfaceCommand option. Possibly moving MRIsExpand._associated_file into FSSurfaceCommand would be a good start. And we could try to separate renaming and similar options from _get_filecopy_info by adding an extra hook:

class FSSurfaceCommand(FSCommand):
    ...
    def _get_filecopy_info(self):
        self._normalize_filenames()
        return super(FSSurfaceCommand, self)._get_filecopy_info()

    def _normalize_filenames(self):
        """Filename normalization routine to perform only when run in Node context"""
        pass

Then I'd rewrite MRIsExpand:

class MRIsExpand(FSSurfaceCommand):
    _cmd = 'mris_expand'
    input_spec = MRIsExpandInputSpec
    output_spec = MRIsExpandOutputSpec

    def _list_outputs(self):
        outputs = self._outputs().get()
        outputs['out_file'] = self._associated_file(self.inputs.in_file,
                                                    self.inputs.out_name)
        return outputs

    def _normalize_filenames(self):
        in_file = self.inputs.in_file

        pial = self.inputs.pial
        if not isdefined(pial):
            pial = 'pial'
        self.inputs.pial = self._associated_file(in_file, pial)

        if isdefined(self.inputs.thickness) and self.inputs.thickness:
            thickness_name = self.inputs.thickness_name
            if not isdefined(thickness_name):
                thickness_name = 'thickness'
            self.inputs.thickness_name = self._associated_file(in_file,
                                                               thickness_name)

        self.inputs.sphere = self._associated_file(in_file, self.inputs.sphere)

        super(MRIsExpand, self)._normalize_filenames()

WDYT?

@tsalo
Copy link
Member Author

tsalo commented Apr 14, 2017

Should _normalize_filenames just pass in FSSurfaceCommand or should it try to rename out_file, since that's the most common output? E.g.,

class FSSurfaceCommand(FSCommand):
    ...
    def _get_filecopy_info(self):
        self._normalize_filenames()
        return super(FSSurfaceCommand, self)._get_filecopy_info()

    def _normalize_filenames(self):
        """Filename normalization routine to perform only when run in Node context"""
        if isdefined(self.inputs.out_file):
            self.inputs.out_file = os.path.abspath(self.inputs.out_file) 

If _normalize_filenames deals with inputs.out_file, then there's no need to do anything with _list_outputs, right?
Also, if out_file (or whatever other outputs) has the full path, then is _associated_file necessary? As long as FreeSurfer doesn't add the hemisphere prefix to any output file arguments that include the path, the interface shouldn't need to add the prefix from the input file to the output files, unless that's something it assumes users want.

@effigies
Copy link
Member

This comment gets into the weeds a bit, so tl;dr:

  1. interface.run() should not change the inputs, but Node(interface).run() can.
  2. interface._list_outputs() must account for both contexts, that is, it should describe the behavior of the wrapped program on any input.
  3. I'd be hesitant to make too many assumptions about user's desires. If they want to pass an argument with path information, they can.
  4. If you want to just document the surprising behavior and save refactoring FSSurfaceCommand for another PR, I'm totally fine with that. (Likewise, I'm fine with hashing out all of these issues here; but this does threaten to become a little bigger than perhaps you originally intended.)

To address your comment in detail:

Should _normalize_filenames just pass in FSSurfaceCommand or should it try to rename out_file, since that's the most common output?

I wouldn't do that, because there's no guarantee that all FSSurfaceCommands will have an out_file parameter (even if all of them do now). isdefined only works on existing entries in the spec.

If _normalize_filenames deals with inputs.out_file, then there's no need to do anything with _list_outputs, right?

No, because _list_outputs needs to be agnostic of whether it was run in a Node or not; _normalize_filenames only applies to Nodes.

Also, if out_file (or whatever other outputs) has the full path, then is _associated_file necessary?

_associated_file is also useful for describing where to find inputs that FreeSurfer is expecting to find in the same directory. See MRIsExpand._get_filecopy_info where implicit inputs are defined so they can be copied over. If there's no immediate need to put it in FSSurfaceCommand, don't worry about it. That can go in a later refactor, if it ever becomes relevant.

As long as FreeSurfer doesn't add the hemisphere prefix to any output file arguments that include the path, the interface shouldn't need to add the prefix from the input file to the output files, unless that's something it assumes users want.

But sometimes we (or at least I) want it to add the hemisphere prefix, and I'll signal that by using an out_file specification that FreeSurfer will add a prefix to.

Consider the following cases for mris_expand (where in_file = <indir>/lh.smoothwm):

Input Node output Non-Node output
out_file='graymid' <workdir>/lh.graymid <indir>/lh.graymid
out_file='./graymid' <workdir>/graymid <workdir>/graymid

The right column is determined by how the underlying program treats the inputs. The bottom row is inflexible because the user gave path information. So our only real question is how to treat the top left corner. If we always do out_file = os.path.abspath(out_file) in Nodes, then we get:

Input Node output Non-Node output
out_file='graymid' <workdir>/graymid <indir>/lh.graymid
out_file='./graymid' <workdir>/graymid <workdir>/graymid

Which, in this context, is probably not what the user wanted. Now in your case, I think you want to actually adjust the behavior to be less surprising, like:

Input Node output Non-Node output
out_file='out.stl' <workdir>/out.stl <indir>/lh.out.stl
out_file='./out.stl' <workdir>/out.stl <workdir>/out.stl

This seems completely reasonable. The FSSurfaceCommand approach I proposed gives each interface the flexibility to decide on what the "expected" behavior is.

@tsalo
Copy link
Member Author

tsalo commented Apr 17, 2017

Thanks for the detailed reply. I'll make the minimal changes and leave the full refactoring for later/someone with more knowledge. I think I just have one question left. When you say that interface.run() shouldn't change the inputs, do you mean that running:

import nipype.interfaces.freesurfer as fs
mris = fs.MRIsCombine()
mris.inputs.in_files = ['lh.pial', 'rh.pial']
mris.inputs.out_file = 'out.stl'
mris.cmdline

should give you 'mris_convert --combinesurfs lh.pial rh.pial out.stl' but create lh.out.stl, so that your inputs are untouched in the command? Or, should cmdline give you 'mris_convert --combinesurfs lh.pial rh.pial lh.out.stl' so it creates what your cmdline says it will create?

@effigies
Copy link
Member

The former. mris.inputs.out_file = 'out.stl' should result in out.stl being specified on the command line. But then mris.run().outputs.out_file should be the full path to lh.out.stl. So you should also run your out_file through os.path.abspath before setting it in _list_outputs.

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good. Just a couple minor changes.

By including the full path in the filename, we can also avoid this behavior.
"""
def __init__(self, **inputs):
super(FSSurfaceCommand, self).__init__(**inputs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for an __init__ if all it calls is super.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So go straight to _get_filecopy_info?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup.

def _get_filecopy_info(self):
"""Filename normalization routine to perform only when run in Node
context
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment belongs in _normalize_filenames.

"""
path, base = os.path.split(out_name)
if path == '':
_, in_file = os.path.split(in_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be path, in_file = .... The logic here is "if out_file doesn't have path information, use in_file's path".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that's my mistake. With mris_convert --combinesurfs (MRIsCombine) it outputs to the working directory, but I didn't realize that that wasn't how regular calls to mris_convert worked. Will fix.

"""
path, base = os.path.split(out_name)
if path == '':
_, in_file = os.path.split(in_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using _ here instead of replacing path means that mris_convert outputs to the working directory by default. Is this correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MRIsCombine, it outputs to the working directory if path information isn't provided. So I think that's what we want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. You can just remove this line, then, since in_file isn't used at all otherwise. Actually, the whoe thing could be shortened to:

path, base = os.path.split(out_name)
if path == '' and base[:3] not in ('lh.', 'rh.'):
    base = 'lh.' + base
return os.path.abspath(os.path.join(path, base))

And if you prefer, you could just do this directly in _list_outputs, rather than override _associated_file. (That's up to you, though.)


For complete details, see the `mris_convert Documentation.
<https://surfer.nmr.mgh.harvard.edu/fswiki/mris_convert>`_

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here might be a good place to add a comment about outputs (so it shows up in the docs):

If given an out_file that does not begin with 'lh.' or 'rh.', mris_convert will prepend 'lh.'
to the file name.
To avoid this behavior, consider setting out_file = './<filename>', or leaving out_file blank.

@effigies
Copy link
Member

@satra would you mind having a look at this one? Here's another case where we need to do a little pre-run fiddling to get FreeSurfer commands to run nicely in a Node, so I've suggested injecting a hook for FSSurfaceCommands that goes just before _get_filecopy_info.

tsalo added 2 commits April 18, 2017 11:27
In freesurfer/base/FSSurfaceCommand:
- Remove unnecessary __init__
- Move _get_filecopy_info docstring to _normalize_filenames
- Add to _associated_files docstring
- Output files with no path info go to in_file’s directory, not working
directory.

In freesurfer/utils/MRIsCombine:
- Add to MRIsCombine dosctring
@effigies
Copy link
Member

LGTM. Assuming tests pass, and there are no objections, I'll merge tonight or in the morning.

Thanks for your patience.

base = 'lh.' + base
return os.path.abspath(os.path.join(path, base))

def _normalize_filenames(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it says this function is used in the Node context, but where is it used inside Node?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks.

@satra
Copy link
Member

satra commented Apr 19, 2017

@effigies , @tsalo - in general it looks good - i left a question about Node context.

@satra satra merged commit 378e295 into nipy:master Apr 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants