This repository was archived by the owner on Oct 24, 2024. It is now read-only.
API for filtering / subsetting #79
Closed
Description
So far we've only really implemented dictionary-like `__getitem__`/`__setitem__`
syntax, but we should add a variety of other ways to select nodes from a tree too. Here are some suggestions:
    from __future__ import annotations

    from collections.abc import Callable, Iterator, Sequence

    from xarray import DataArray


    class DataTree:
        ...

        def __getitem__(self, key: str) -> DataTree | DataArray:
            """
            Accepts node/variable names, or file-like paths to nodes/variables (inc. '../var').

            (Also needs to accommodate indexing somehow.)
            """
            ...

        def subset(self, keys: Sequence[str]) -> DataTree:
            """
            Return new tree containing only nodes with names matching keys.

            (Could probably be combined with `__getitem__`.
            Also unsure what the return type should be.)
            """
            ...

        @property
        def subtree(self) -> Iterator[DataTree]:
            """An iterator over all nodes in this tree, including both self and all descendants."""
            ...

        def filter(self, filterfunc: Callable[[DataTree], bool]) -> Iterator[DataTree]:
            """Filters subtree by returning only nodes for which `filterfunc(node)` is True."""
            ...
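
To make the proposed semantics concrete, here is a minimal sketch of how `subtree`, `filter` and `subset` could fit together. It uses a hypothetical `Node` class as a stand-in (it is not the real `DataTree` and holds no data), and it assumes depth-first iteration order and a flat list as the return type of `subset`, both of which are open questions above.

```python
from __future__ import annotations

from collections.abc import Callable, Iterator, Sequence
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; not the real DataTree class."""

    name: str
    children: dict[str, Node] = field(default_factory=dict)

    def add(self, child: Node) -> Node:
        """Attach a child under its own name and return it."""
        self.children[child.name] = child
        return child

    @property
    def subtree(self) -> Iterator[Node]:
        """Yield self, then all descendants, depth-first (assumed order)."""
        yield self
        for child in self.children.values():
            yield from child.subtree

    def filter(self, filterfunc: Callable[[Node], bool]) -> Iterator[Node]:
        """Yield only the nodes in the subtree for which filterfunc(node) is true."""
        return (node for node in self.subtree if filterfunc(node))

    def subset(self, keys: Sequence[str]) -> list[Node]:
        """Return nodes whose name is in keys, as a flat list (one possible return type)."""
        wanted = set(keys)
        return list(self.filter(lambda node: node.name in wanted))


# Build a small tree: root -> a -> temp, root -> b -> temp
root = Node("root")
a = root.add(Node("a"))
a.add(Node("temp"))
b = root.add(Node("b"))
b.add(Node("temp"))

print([n.name for n in root.subtree])
print([n.name for n in root.filter(lambda n: not n.children)])  # leaf nodes only
print([n.name for n in root.subset(["temp"])])
```

Building `subset` on top of `filter`, and `filter` on top of `subtree`, keeps a single traversal implementation. Returning a new tree from `subset` instead would additionally require deciding what to do with the unmatched ancestors of matched nodes.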
Are there other types of access that we're missing here? Filtering by regex match? Getting nodes where at least one part of the path matches ("tag-like" access)? Glob?
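
As a rough illustration of how the three matching styles would differ, the sketch below applies glob, regex and "tag-like" matching to plain path strings using only the standard library. The function names (`match_glob`, `match_regex`, `match_tag`) and the example paths are hypothetical and not part of any existing API.

```python
import fnmatch
import re
from pathlib import PurePosixPath

# Hypothetical node paths, as they might be produced by walking a tree.
paths = ["/", "/ocean", "/ocean/temp", "/atmos", "/atmos/temp", "/atmos/wind"]


def match_glob(paths, pattern):
    """Keep paths matching a shell-style pattern (note: fnmatch's '*' also matches '/')."""
    return [p for p in paths if fnmatch.fnmatchcase(p, pattern)]


def match_regex(paths, pattern):
    """Keep paths in which the regular expression matches anywhere."""
    rx = re.compile(pattern)
    return [p for p in paths if rx.search(p)]


def match_tag(paths, tag):
    """Keep paths where at least one path component equals tag exactly."""
    return [p for p in paths if tag in PurePosixPath(p).parts]


print(match_glob(paths, "*/temp"))
print(match_regex(paths, r"^/atmos/"))
print(match_tag(paths, "atmos"))
```

Each of these can be expressed as a predicate passed to a generic `filter`, so they may not need dedicated methods. One caveat is that `fnmatch` does not treat `/` specially, so a glob that should stop at node boundaries would need different matching rules.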