Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DataContainer superclass #275

Merged
merged 3 commits into from
Jul 17, 2024

Conversation

psomhorst
Copy link
Contributor

@psomhorst psomhorst commented Jul 12, 2024

Closes #272.

In this implementation, the superclass is not very functional. It only inherits from Equivalence, removing the requirement therefore from the subclasses.

Ideally, the superclass would take define some common fields (attributes). However, any field (attribute) defined in the superclass would be first in de init function. label and name have default values in EITData, which would not work with non-default fields like path. Also, path would no longer be the first argument, potentially breaking existing code. To move fields toDataContainer, they would have to be the first arguments, or we would have to switch to keyword-only arguments.

This PR also removes superfluous inheritance from Sequence.

@psomhorst psomhorst changed the base branch from main to develop July 12, 2024 12:36
@psomhorst psomhorst requested a review from DaniBodor July 12, 2024 12:50
@psomhorst psomhorst added this to the Next release v1.2.0 milestone Jul 12, 2024
@psomhorst
Copy link
Contributor Author

@DaniBodor This is a super simple PR. I'd like not to complicate this PR, so we can use this superclass for #269.

Copy link
Member

@DaniBodor DaniBodor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. This is a great starting point from where the DataContainer class can be extended in the future.
In my mind, the methods from the current mixin-classes can ultimately be added to DataContainer rather than inheriting them one by one. But that can be done in a future PR

@DaniBodor
Copy link
Member

DaniBodor commented Jul 17, 2024

Ideally, the superclass would take define some common fields (attributes). However, any field (attribute) defined in the superclass would be first in de init function. label and name have default values in EITData, which would not work with non-default fields like path. Also, path would no longer be the first argument, potentially breaking existing code. To move fields toDataContainer, they would have to be the first arguments, or we would have to switch to keyword-only arguments.

I agree with this, and I think this should be implemented. I also think that the first fields should be identical for each container, and then the specific arguments per container should follow after that. Also, I think that nomenclature should be aligned more. For example, vendor and pixel_impedance in EITData in my mind is the equivalent of category and values (respectively) in the other containers.

Regarding breaking existing code, this is an argument that you have made a few times now (e.g. in issue #262). I don't think we have a large enough user base at this point to make that a prohibitive reason not to change things. We are very much still in the developmental phase of this project, and it's more important to "get things right", even if that means breaking changes, than ensuring backwards compatibility.

I think these kinds of changes can be implemented in a separate PR though, so that this one can stay light, as you suggest.

EDIT: I will copy these points to a new issue (#278), so that we can keep track more easily once this PR is closed. Let's try to discuss further there.

@@ -19,7 +19,7 @@ class Interval(NamedTuple):


@dataclass(eq=False)
class IntervalData(Equivalence, SelectByIndex, HasTimeIndexer):
class IntervalData(DataContainer, SelectByIndex, HasTimeIndexer):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, it would make sense to add SelectByTime to DataContainer as well. If selecting by time is not possible for IntervalData (it should be, though, shouldn't it?*), then you can overwrite the method here to return an error.

*: It's still not clear to me why the HasTimeIndexer object is separate from the TimeIndexer itself. Also, given that the object inherits both SelectByIndex and HasTimeIndexer, I think this means that it may as well inherit SelectByTime.

Copy link
Contributor Author

@psomhorst psomhorst Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree we should have a class inherit functionality to then disable it again. It's possible, but it makes the code less readable. I'll look into it when looking into homogenisation (#279).

If a class inherits from HasTimeIndexer it gets an attribute that is an instance of TimeIndexer. You can't inherit from a class and while inheriting an instance of itself as an attribute.

@psomhorst
Copy link
Contributor Author

I agree with this, and I think this should be implemented. I also think that the first fields should be identical for each container, and then the specific arguments per container should follow after that.

I created an issue for this. We can discuss how to approach this in #279.

Also, I think that nomenclature should be aligned more. For example, vendor and pixel_impedance in EITData in my mind is the equivalent of category and values (respectively) in the other containers.

category is going to point to what type of data is stored in an object (e.g. impedance, pressure or breaths), so algorithms can check whether they support this type of data. I'm working on a proposal for a proper implementation of categories. A vendor is not a data type, and should not be a category.

We decided to explicitly make EITData.pixel_impedance different from values because it is not immediately apparent what the values are you get from EITData.values. Is it pixel impedance, or functional impedance, or...? I think one advantage of data-specific classes is that you can customize things like this.

Regarding breaking existing code, this is an argument that you have made a few times now (e.g. in issue #262). I don't think we have a large enough user base at this point to make that a prohibitive reason not to change things. We are very much still in the developmental phase of this project, and it's more important to "get things right", even if that means breaking changes, than ensuring backwards compatibility.

I agree the user base is not large, but it is annoying enough for a few people here working on multiple projects at the same time. I think we should at least be more clear about what breaks and how to fix it in release notes, and minimise the number of times things break.

@psomhorst psomhorst merged commit c12120c into develop Jul 17, 2024
5 checks passed
@psomhorst psomhorst deleted the 272_datacontainer_superclass_psomhorst branch July 17, 2024 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add data container superclass
2 participants