Add IO tutorial #3098

Open
h-mayorquin opened this issue Jun 27, 2024 · 6 comments
Labels: discussion (General discussions and community feedback), documentation (Improvements or additions to documentation)

Comments

@h-mayorquin
Collaborator

h-mayorquin commented Jun 27, 2024

OK, I have been very busy but have been meaning to do this since #3053 and #2958.

We should add an IO tutorial where we explain the intended way to save spikeinterface objects.

This is a discussion post to discuss the details.

My opinion:

  • I think it should include a description of the two main saving functions, save_to_binary and save_to_zarr, and what kind of arguments each supports.
  • A link to the write-to-NWB how-to (slowly in the making by me!).
  • Their relationship to provenance and efficiency, plus a description of the formats. For the binary folder this should include the folder structure, and for zarr the equivalent tree (@alejoe91).
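One way to produce the folder-structure description mentioned in the last bullet is to generate the tree from a real saved folder rather than writing it by hand. The following is a minimal sketch using only the Python standard library; `format_tree` and `demo_tree` are hypothetical helpers, and the file names in the demo are placeholders, not the actual layout spikeinterface writes.

```python
import tempfile
from pathlib import Path


def format_tree(root: Path, prefix: str = "") -> list[str]:
    """Return the lines of an indented listing of everything under ``root``."""
    lines = []
    # Sort by name so the listing is stable between runs.
    entries = sorted(root.iterdir(), key=lambda p: p.name)
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "`-- " if is_last else "|-- "
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{connector}{entry.name}{suffix}")
        if entry.is_dir():
            # Children of the last entry need no vertical bar above them.
            extension = "    " if is_last else "|   "
            lines.extend(format_tree(entry, prefix + extension))
    return lines


def demo_tree() -> list[str]:
    """Build a throwaway folder (placeholder names) and return its tree."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "saved_object"
        (root / "metadata").mkdir(parents=True)
        (root / "data.raw").write_bytes(b"\x00" * 8)
        (root / "metadata" / "info.json").write_text("{}")
        return format_tree(root)


if __name__ == "__main__":
    print("\n".join(demo_tree()))
```

Pointing `format_tree` at a folder produced by an actual save call would give the listing to paste into the tutorial, so the documented structure cannot drift from what is written to disk.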

Probably some of this information is scattered across the modules documentation. I will need to fish out what is already there and just add structure to it.

@h-mayorquin h-mayorquin added the documentation Improvements or additions to documentation label Jun 27, 2024
@h-mayorquin h-mayorquin self-assigned this Jun 27, 2024
@h-mayorquin h-mayorquin added the discussion General discussions and community feedback label Jun 27, 2024
@alejoe91
Member

Also the structure of analyzers folders/part paths!

@JoeZiminski
Collaborator

This sounds great!! I am not so familiar with save_to_binary and save_to_zarr. Where does recording.save() fit in? Are there any other saving functions?

@h-mayorquin
Collaborator Author

save() is a convenience function router that ends up in one or the other through a rather complicated path that I aim to document at some point : )
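To make the "router" idea concrete, here is a minimal sketch of the pattern, assuming a toy class. `ToyRecording` is hypothetical and is not the spikeinterface implementation; the method names follow the ones used in this thread, while the `dtype` and `compressor` parameters are placeholders chosen only to show two writers with different keyword sets.

```python
class ToyRecording:
    """Stand-in object used to illustrate routing; NOT spikeinterface code."""

    def save_to_binary(self, folder, dtype="int16"):
        # Placeholder writer: only reports what it would do.
        return {"backend": "binary", "folder": folder, "dtype": dtype}

    def save_to_zarr(self, folder, compressor=None):
        # Placeholder writer with a different keyword set.
        return {"backend": "zarr", "folder": folder, "compressor": compressor}

    def save(self, format="binary", **kwargs):
        """Route to the writer for ``format``, forwarding the remaining kwargs."""
        writers = {"binary": self.save_to_binary, "zarr": self.save_to_zarr}
        if format not in writers:
            raise ValueError(
                f"Unknown format {format!r}; expected one of {sorted(writers)}"
            )
        return writers[format](**kwargs)


if __name__ == "__main__":
    recording = ToyRecording()
    print(recording.save(format="zarr", folder="out", compressor="blosc"))
```

Because `save()` forwards `**kwargs` blindly, its own signature cannot tell the user which arguments are valid for which format, which is the kind of gap a tutorial would need to fill.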

@zm711
Collaborator

zm711 commented Jun 28, 2024

I think someone (possibly @JoeZiminski) pointed out, though I forget where, that our docstring formatting injection really fails for save. Sometimes I try to remember which arguments I need for saving a sorting vs. saving a recording, and the docstring isn't perfect. So I really support an IO tutorial so that we at least lay it out! Thanks for writing this up @h-mayorquin!

@JoeZiminski
Collaborator

JoeZiminski commented Jul 1, 2024

Great so just to review, ATM there is:

  1. si.write_binary_recording (writes recording to a single .raw file with no spike-interface metadata).
  2. si.write_to_h5_dataset_format (similar to write_binary_recording but writes to an h5 file).
  3. recording.save_to_memory() I'm not so sure what this does but it looks very cool
  4. recording.save_to_binary() Saves to folder with data stored in binary + some spikeinterface metadata
  5. recording.save_to_zarr(). Same as above but with zarr
  6. the recording.save() frontend. Convenience function around the recording methods.
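The difference between items 1 and 4 (a bare binary file versus a folder holding binary data plus metadata) can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: `write_raw`, `write_folder`, `read_folder`, and the file names `traces.raw` and `metadata.json` are hypothetical and do not reflect what spikeinterface actually writes.

```python
import json
import tempfile
from array import array
from pathlib import Path


def write_raw(samples, path):
    """Write int16 samples as a bare binary file: no dtype, shape, or rate stored."""
    data = array("h", samples)  # "h" = signed 16-bit integers
    with open(path, "wb") as f:
        data.tofile(f)


def write_folder(samples, folder, sampling_frequency, num_channels):
    """Write the same bytes plus a JSON sidecar describing how to read them."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    write_raw(samples, folder / "traces.raw")
    metadata = {
        "dtype": "int16",
        "num_channels": num_channels,
        "sampling_frequency": sampling_frequency,
    }
    (folder / "metadata.json").write_text(json.dumps(metadata, indent=2))


def read_folder(folder):
    """Reload samples using only what the folder itself records."""
    folder = Path(folder)
    metadata = json.loads((folder / "metadata.json").read_text())
    data = array("h")
    data.frombytes((folder / "traces.raw").read_bytes())
    num_channels = metadata["num_channels"]
    # Regroup the flat sample stream into frames of num_channels values.
    frames = [
        list(data[i:i + num_channels]) for i in range(0, len(data), num_channels)
    ]
    return frames, metadata


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_folder([1, 2, 3, 4, 5, 6], Path(tmp) / "rec", 30000.0, 2)
        print(read_folder(Path(tmp) / "rec"))
```

The bare file from `write_raw` can only be read back by someone who already knows the dtype and channel count, whereas the folder is self-describing. That trade-off is presumably what the tutorial would spell out for the real functions.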

It's awesome that so many file writing methods are supported. I wonder if there is some room for API optimisation, although it is certainly not simple. It is complicated by the fact that there are different (all useful) ways to save the data, as a standalone file (binary, h5) or in "spikeinterface-format", and that these functions all require different kwarg sets. Initially I thought it would be nice to route everything through recording.save() and make everything else private, but the differing kwarg sets make this impossible.

Some ways to streamline might be: make a distinction between spikeinterface-style saving, e.g. save_as_spikeinterface_format(format="binary") (with a better name), and the standalone binary write_binary_recording, as it is easy to get confused between these. It might also be worth moving write_binary_recording and write_to_h5_dataset_format to the recording object so everything is in one place, and somehow incorporating these into the save() function? (These could be the front-end interface for the functions discussed in #2958.)

I'm not 100% sure about the above. The number one thing that will help make all this clear is this IO tutorial; it will be super useful!

@h-mayorquin
Collaborator Author

Related: #3111

4 participants