Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reloading cnmf object returns bytes rather than string #1264

Open
JohnStout opened this issue Feb 1, 2024 · 8 comments
Open

Reloading cnmf object returns bytes rather than string #1264

JohnStout opened this issue Feb 1, 2024 · 8 comments

Comments

@JohnStout
Copy link

  1. Operating System (Linux, MacOS, Windows):
  2. Hardware type (x86, ARM..) and RAM:
  3. Python Version (3.11):
  4. Caiman version (1.9.16):
  5. How you installed Caiman (pure conda, conda + compile, colab, ..): conda and mamba
  6. Details:

When reloading the CNMFE model using load_CNMF, the cnmfe_model.params.data[fnames] is of type numpy.bytes and requires a reset prior to changing parameters and evaluating components cnmfe_model.params.data['fnames'] = [cnmfe_model.params.data['fnames'][0].decode('UTF-8')]

@EricThomson
Copy link
Contributor

Thanks I'll check this out.

@EricThomson
Copy link
Contributor

I reproduced this : definitely a problem if you try to change params on loaded cnmf object it gets angry because of the way it's represented. Not sure when we'll fix this, but a fix is definitely needed, thanks for pointing it out.

@JohnStout
Copy link
Author

Thanks Eric!

@ethanbb
Copy link
Contributor

ethanbb commented May 16, 2024

Oh I just ran into something related to this because a parameter with the value None is saving as b'NoneType' rather than 'NoneType' and not being converted back to None when loading. It looks like this is handled correctly for some parameters, but not ones that the function tries to convert from NumPy arrays to tuples. I wrote a fix and will make a PR as soon as the tests finish.

@ethanbb
Copy link
Contributor

ethanbb commented May 16, 2024

Although actually looking at that PR #1305, if I'm modifying recursively_load_dict_contents_from_group anyway maybe it would make sense to just convert any bytes objects to strings there?

@pgunn
Copy link
Member

pgunn commented May 16, 2024

Longer-term I think it'd be good to have all initialisers force everything to the exact type expected for whatever's being filled in, rather than whatever the serialisation expects. This will be a lot of work, but it will eliminate a source of unwanted variance in the code.

@ethanbb
Copy link
Contributor

ethanbb commented May 16, 2024

OK, I agree converting all the bytes arrays to strings would not work actually. Looking closer, I now see that it's specifically lists of strings or objects that can't be directly saved to an HDF5 dataset (such as slices), since scalar strings are stored as attributes. The one case I found where converting to strings would be incorrect is motion.indices, which has the following value after re-loading:

array([b'slice(None, None, None)', b'slice(None, None, None)', b'slice(None, None, None)'], dtype='|S32')

I guess the thing to do here would be to convert it back using tuple(eval(ind) for ind in indices), unless there's a less hacky way.

Also just want to note that check_consistency isn't currently called from load_CNMF - maybe that should be changed if we want to do validation and conversion within check_consistency (as in the last PR)?

@ethanbb
Copy link
Contributor

ethanbb commented May 16, 2024

Sorry I was wrong - I forgot that all strings are getting loaded as bytes, whether or not they're attributes. However they're mostly converted to strings in recursively_load_dict_contents_from_group, except for a specific set of keys, which is the bug I'm submitting this PR for (since that does not seem intentional).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants