Skip to content
This repository was archived by the owner on Oct 24, 2024. It is now read-only.
This repository was archived by the owner on Oct 24, 2024. It is now read-only.

API for filtering / subsetting #79

Closed
@TomNicholas

Description

@TomNicholas

So far we've only really implemented dictionary-like get/setitem syntax, but we should add a variety of other ways to select nodes from a tree too. Here are some suggestions:

class DataTree:
    ...

    def __getitem__(self, key: str) -> DataTree | DataArray:
        """
        Accepts node/variable names, or file-like paths to nodes/variables (inc. '../var').

        (Also needs to accommodate indexing somehow.)
        """
        ...

    def subset(self, keys: Sequence[str]) -> DataTree:
        """
        Return new tree containing only nodes with names matching keys.

        (Could probably be combined with `__getitem__`. 
        Also unsure what the return type should be.)
        """
        ...

    @property
    def subtree(self) -> Iterator[DataTree]:
        """An iterator over all nodes in this tree, including both self and all descendants."""
        ...

    def filter(self, filterfunc: Callable) -> Iterator[DataTree]:
        """Filters subtree by returning only nodes for which `filterfunc(node)` is True."""
        ...

Are there other types of access that we're missing here? Filtering by regex match? Getting nodes where at least one part of the path matches ("tag-like" access)? Glob?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions