Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import implementation of ExperimentAxisQuery. #78

Merged
merged 6 commits into from
Jan 17, 2023
Merged

Conversation

thetorpedodog
Copy link
Contributor

As discussed in #76, the full implementation of ExperientAxisQuery should be provided by the base SOMA implementation since it does not vary from storage engine to storage engine. This copies most of the implementation from the existing TileDB version, with some reorganization to make the main query class a little smaller:

  • Pulls more of the caching-specific behavior of join IDs into the _JoinIDCache class.
  • Pulls more indexing-related work into the _AxisIndexer class.
    • Makes formerly-public AxisIndexer private. (If this should be public this can be easily reverted.)
    • Renames methods to by_[axis].
    • Separates out index caching logic from main methods.
  • Uses an _Axis enum internally for axis selection.
  • Shortens AxisQueryResult name.


# Internals

def read(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend prefixing this method with an underscore as it is internals. I originally omitted the underscore, as I wasn't sure if it was internal or public API

I am already confident that I need to significantly restructure read() and how it is used by to_anndata() for performance reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefixed both this and AxisQueryResult with _s.


"""
column_names = column_names or AxisColumnNames(obs=None, var=None)
x_collection = self.experiment.ms[self.measurement_name].X
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use self._ms property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I missed this one. Thanks!



@attrs.define(frozen=True)
class AxisQueryResult:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (I believe) a private class and should probably have a underscore-prefixed name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ibid.

Copy link
Member

@bkmartinjr bkmartinjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest adding [lifecycle: experimental] tags to public methods (not just the top-level class)

Copy link
Member

@bkmartinjr bkmartinjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two changes suggested:

  • everything implementing the to_anndata code path (eg, read, AxisQueryResult, etc should have an under-score prefixed name as it is currently just private implementation, and very likely to change for performance reasons.
  • suggest pasting the lifecycle maturity tags on the public methods (eg, lifecycle: experimental)

Copy link
Contributor Author

@thetorpedodog thetorpedodog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest adding [lifecycle: experimental] tags to public methods (not just the top-level class)

Added.


# Internals

def read(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefixed both this and AxisQueryResult with _s.


"""
column_names = column_names or AxisColumnNames(obs=None, var=None)
x_collection = self.experiment.ms[self.measurement_name].X
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I missed this one. Thanks!



@attrs.define(frozen=True)
class AxisQueryResult:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ibid.

As discussed in #76,
the full implementation of `ExperientAxisQuery` should be provided by
the base SOMA implementation since it does not vary from storage engine
to storage engine. This copies most of the implementation from the
existing TileDB version, with some reorganization to make the main
query class a little smaller:

- Pulls more of the caching-specific behavior of join IDs into the
  `_JoinIDCache` class.
- Pulls more indexing-related work into the `_AxisIndexer` class.
  - Makes formerly-public `AxisIndexer` private.
    (If this should be public this can be easily reverted.)
  - Renames methods to `by_[axis]`.
  - Separates out index caching logic from main methods.
- Uses an `_Axis` enum internally for axis selection.
- Shortens `AxisQueryResult` name.
- Removes defaults from `_read` so there is only one place where
  optional parameters are set.
- Fix up last use of `self.experiment.ms[self.measurement_name]`.
`contextlib.ContextManager` does not annotate its return type, so
callers of `with ExperimentAxisQuery(...) as q:` will see it as an Any
in type-checking. Instead we just write the `__enter__ return self`
ourselves.
@thetorpedodog thetorpedodog merged commit 9b8591a into main Jan 17, 2023
@thetorpedodog thetorpedodog deleted the experiment branch January 17, 2023 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants