Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add max_channel column for phy interface #961

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Next Next commit
add max_channel column for phy interface
  • Loading branch information
bendichter committed Jul 14, 2024
commit bd0222a28807c80023a2db5ccd60512cd6119de8
51 changes: 49 additions & 2 deletions src/neuroconv/datainterfaces/ecephys/phy/phydatainterface.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,11 @@
from typing import Optional
from pathlib import Path
from typing import Optional, Literal

import numpy as np
from pynwb.file import NWBFile

from ..basesortingextractorinterface import BaseSortingExtractorInterface
from ....utils import FolderPathType
from ....utils import FolderPathType, DeepDict


class PhySortingInterface(BaseSortingExtractorInterface):
Expand All @@ -23,6 +27,23 @@ def get_source_schema(cls) -> dict:
] = "Path to the output Phy folder (containing the params.py)."
return source_schema

def get_max_channel(self):
folder_path = Path(self.source_data['folder_path'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring? It would be useful here to describe some of the Phy charateristics. Like, I read from the code below that the templates are stored unwithened, the axis operation would be clearer if you wrote the template shape somewhere (I think its shape is (num_templates, num_samples, num_channels), right? Plus, the definition of cluster id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this could use a docstring


templates = np.load(str(folder_path / 'templates.npy'))
channel_map = np.load(str(folder_path / 'channel_map.npy'))
whitening_mat_inv = np.load(str(folder_path / "whitening_mat_inv.npy"))
templates_unwh = templates @ whitening_mat_inv

cluster_ids = self.sorting_extractor.get_property('original_cluster_id')
templates = templates_unwh[cluster_ids]

max_over_time = np.max(templates, axis=1)
idx_max_channel = np.argmax(max_over_time, axis=1)
max_channel = channel_map[idx_max_channel].ravel()

return max_channel

Copy link
Collaborator

@h-mayorquin h-mayorquin Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, two other questions here:

  1. Aren't the templates scaled? I am thinking on the amplitudes.npy . This would mean the argmax operation might not be right.
  2. Also, is max_channel meanigful if spikes are negative? I think here it is consistent but aren't most people interested on the absolute value largest. On the templates data base we use "best channel" which did not have that load.

Now I am aware that you are using machinery that is already there in the add_units_table but I wanted to point this out.

https://phy.readthedocs.io/en/latest/sorting_user_guide/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Scaled how? In a way that would affect which channel is selected?
  2. yeah, good point, the argmax(abs()) might be better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I shared this with a friend that uses Phy and he flagged that. Reading the documentation I feel less certain.

We could do a roun-trip with spikeinterface artificial data and the export to phy functionality to see if your function gets it right.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some code by Nick Stteinmetz that calculates templates max without using the amplitudes (but they don't do the de-whitening step that you do)

https://github.com/cortex-lab/spikes/blob/master/analysis/templatePositionsAmplitudes.m

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think he does. Look at winv

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the block where the max channel is calculated take the temps as they come directly from the input:

https://github.com/cortex-lab/spikes/blob/fcea2b20e736b533e5baf612752b66121a691128/analysis/templatePositionsAmplitudes.m#L64-L71

Am I missing something?

def __init__(
self,
folder_path: FolderPathType,
Expand All @@ -41,3 +62,29 @@ def __init__(
verbose : bool, default: True
"""
super().__init__(folder_path=folder_path, exclude_cluster_groups=exclude_cluster_groups, verbose=verbose)

def add_to_nwbfile(
self,
nwbfile: NWBFile,
metadata: Optional[DeepDict] = None,
stub_test: bool = False,
write_ecephys_metadata: bool = False,
write_as: Literal["units", "processing"] = "units",
units_name: str = "units",
units_description: str = "Autogenerated by neuroconv.",
):

super().add_to_nwbfile(
nwbfile=nwbfile,
metadata=metadata,
stub_test=stub_test,
write_ecephys_metadata=write_ecephys_metadata,
write_as=write_as,
units_name=units_name,
units_description=units_description,
)

max_channel = self.get_max_channel()
nwbfile.units.add_column(name='max_channel', description='Channel with maximum amplitude', data=max_channel)

return nwbfile
Loading