Skip to content

WIP: loadsave refactoring #319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed

WIP: loadsave refactoring #319

wants to merge 8 commits into from

Conversation

effigies
Copy link
Member

A first pass at #317, because apparently I don't have real work to do.

This so far only addresses the load half of load/save, but puts into practice some of the ideas we talked about, e.g. the decorator. (At first I tried just using klass.files_types, but that broke .mgz.)

One thing this demonstrates is that the checks in guessed_image_type were contextual; moving these checks into the klass.is_image files doesn't produce standalone functions that you'd probably expect.

A possibility is to make @valid_exts more forceful and add an assertion around is_image.

I killed off some tests along with their functions, so those'll need replacing.

@bcipolli
Copy link
Contributor

bcipolli commented Jul 9, 2015

@effigies - one design concern that I came across while looking at implementing this myself:

valid_exts is actually already defined in the files_types within a class. We have a master list of classes within .imageclasses. So, we can use that to create the IMAGE_MAP variable without the decorator.

I prefer that design, as it reuses the existing data rather than duplicates the extension information.

What do you think about migrating to that design?

@matthew-brett
Copy link
Member

Actually, files_types doesn't define all valid extensions, it only defines the 'characteristic' extensions for the files that make up the image. That's why the MGHFormat is a bit confusing, because the characteristic extension is '.mgh', present in files_types, but .mgz is also acceptable as an extension.

@effigies
Copy link
Member Author

effigies commented Jul 9, 2015

Tried that first, but the .mgz convinced me valid_exts was easier as a first pass. To use files_types, you'll need to account for that quirk.

Frankly not married to it, so do what you will.

On July 9, 2015 12:20:15 PM EDT, Ben Cipollini notifications@github.com wrote:

@effigies - one design concern that I came across while looking at
implementing this myself:

valid_exts is actually already defined in the files_types within a
class. We have a master list of classes within .imageclasses. So, we
can use that to create the IMAGE_MAP variable without the decorator.

I prefer that design, as it reuses the existing data rather than
duplicates the extension information.

What do you think about migrating to that design?


Reply to this email directly or view it on GitHub:
#319 (comment)

Chris Markiewicz

@bcipolli
Copy link
Contributor

bcipolli commented Jul 9, 2015

Is there any reason not to enumerate all valid file types in files_types? "Characteristic" can just mean... the first entry of each type (image, header) in the list.

@matthew-brett
Copy link
Member

files_types is also used to construct list of files that comprise the image. For example, when using the to_filename method of the image, we use files_types to generate the img and hdr filenames from the original passed filename. If you add another entry, you'll get another file saved with the given extension, when saving an image of that type.

@effigies
Copy link
Member Author

effigies commented Jul 9, 2015

One possibility is reducing valid_exts to additional extensions (so just .mgz) and constructing most of IMAGE_MAP from files_types.

The current construction is more explicit and eliminates the need for each image type to be imported into loadsave, but we could remove the redundancy which is really for just the one case.

On July 9, 2015 12:33:57 PM EDT, Matthew Brett notifications@github.com wrote:

files_types is also used to construct list of files that comprise the
image. For example, when using the to_filename method of the image,
we use files_types to generate the img and hdr filenames from the
original passed filename. If you add another entry, you'll get
another file saved with the given extension, when saving an image of
that type.


Reply to this email directly or view it on GitHub:
#319 (comment)

Chris Markiewicz

@matthew-brett
Copy link
Member

The special-casing for .mgz is horrible, it would good to find a better way. For example, at the moment, we need a special hack in the BinOpener class to recognize the .mgz extension as being a compressed image, and this is not even in the mghformat module.

@bcipolli
Copy link
Contributor

bcipolli commented Jul 9, 2015

We could just have files_types enumerate all, and have a function on SpatialImage that returns the "characteristic type" (finds all unique types in files_types, and returns the first extension for each).

Then just tweak to_filename to use that "characteristic" function.

For example, at the moment, we need a special hack in the BinOpener class to recognize the .mgz extension as being a compressed image, and this is not even in the mghformat module.

I also see that compression extensions are defined in quite a few places (in loadsave.py and on a few of the image classes, and in the Opener class). Perhaps a design motivated by the mgz issue could apply more broadly to deal with that as well.

Do image types that have separate header files allow compressed files? Perhaps they compress the image only, eh...

I will think about this.

@matthew-brett
Copy link
Member

Image types that allow compressed files compress both header and image.

@matthew-brett
Copy link
Member

Summary of files_types for my own benefit.

files_types is a tuple of (key, value) string pairs, where

  • key is the an image file type (just 'image' or 'header' or 'mat' so far);
  • value is the characteristic extension for the image file type.

Used for:

  • Characteristic extension for image class via SpatialImage.get_filename();
  • Sequence of filenames corresponding to a single image, given input single file name, via SpatialImage.filespec_to_filemap;
  • Sequence of filenames or file objects corresponding to a single image, given optional mapping from file type to file-name / file object via SpatialImage.make_file_map.

@matthew-brett
Copy link
Member

By the way, you may know that '.mgz' files can also have extension '.mgh.gz' - nibabel will open either:

https://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/MghFormat

@bcipolli
Copy link
Contributor

at the moment, we need a special hack in the BinOpener class to recognize the .mgz extension as being a compressed image, and this is not even in the mghformat module.

I am looking at this now. @matthew-brett Why is BinOpener separate from Opener... and not in openers.py?

Also, any concerns if we (secretly) allow reading of .mgz.gz extensions, rather than explicitly rejecting it?

@matthew-brett
Copy link
Member

It's OK to read 'mgz.gz' - if a little weird.

BinOpener, as you see, is just a hack allowing '.mgz' as an extension indicating that the file is zip compressed - so it's in volumeutils only because the hack is not intrinsic to openers, but to images. It should probably go in the freesurfer modules, now I think about it (and the other classes use Opener)

@bcipolli
Copy link
Contributor

OK that's good to hear. I agree, and I think not too much work will clean things up nicely. I'll try it out now.

return False, sniff

fname = froot + ftypes['header'] if 'header' in ftypes else filename
if not sniff:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also check the size of the sniff

@bcipolli
Copy link
Contributor

@effigies I went through this today, seeing how things work if taking out the IMAGE_MAP and instead looping over the image classes / files_types (shortens the code, and avoids duplicating extension associations).

Found some things I left comments on here; please let me know if you think I'm off on any of the comments. I plan to try and work through those, and the other changes discussed, so no need to change them yourself. But ideas / feedback welcome!

@effigies
Copy link
Member Author

Glad to see you say that. Your comments made me worry you were treating this as the canonical PR for #317, while I've been operating under the assumption you have your own fork going on the issue. I've done a little more work trying to make is_image() stand on its own as a function, but that's not really in a great place.

I'll have a look and make comments.

@bcipolli
Copy link
Contributor

Thanks @effigies . I have a branch that should be pretty close today (though beefing up the tests would be good). Do you prefer I do a PR against your branch, or submit an independent PR here? The design is different enough that I think doing an independent PR makes sense, but thought I'd check with you since you did significant work on it too!

In case it helps:
https://github.com/nipy/nibabel/compare/master...bcipolli:reload?expand=1

@effigies
Copy link
Member Author

A new PR is fine with me.

matthew-brett added a commit that referenced this pull request Aug 27, 2015
MRG: More graceful handling of compressed file Opening

This PR deprecates BinOpener, instead exposing a decorator for creating arbitrary file extension / opener methods (similar to that proposed in #317 and created in #319).
@effigies effigies closed this Sep 1, 2015
matthew-brett added a commit that referenced this pull request Oct 6, 2015
MRG: Migrate load, save, class_map, ext_map to class properties and methods

This is to address #317 and #323 (and aims to supercede PR #319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:
* Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions.
* Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.
grlee77 pushed a commit to grlee77/nibabel that referenced this pull request Mar 15, 2016
MRG: More graceful handling of compressed file Opening

This PR deprecates BinOpener, instead exposing a decorator for creating arbitrary file extension / opener methods (similar to that proposed in nipy#317 and created in nipy#319).
grlee77 pushed a commit to grlee77/nibabel that referenced this pull request Mar 15, 2016
MRG: Migrate load, save, class_map, ext_map to class properties and methods

This is to address nipy#317 and nipy#323 (and aims to supercede PR nipy#319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:
* Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions.
* Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants