Filter Operations on Label2DModel and Shape #946

selmanozleyen · 2025-06-30T13:22:13Z

I am also working on this PR and thought it would be sensible to support the code I use there here as well, since it seems useful at first glance. So I wrote and tested subset_sdata_by_table_mask in this PR.

@LucaMarconato I can't request reviews from anyone other than @ilan-gold. Do you know why?

Some notes:

For PointsModel (GeoDataFrame), it assumes the index is always the instance_id.
For Label2DModel, the image values themselves are assumed to be the instance_ids.
For Label2DModel, when the element is an xr.DataTree, it assumes the keys are the different scales.
I didn't use the relational operations to get the shapes and points, because they assume the merge is on the element indices themselves.
I didn't add support for AnnData, because we would then also have to document the return types for AnnData input, which doesn't seem reasonable to me if it would just be a one-to-one copy of scanpy.pp.filter_cells.
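The Label2DModel assumptions above can be illustrated with a minimal NumPy sketch. It is not the PR's implementation: keep_label_ids and keep_label_ids_multiscale are hypothetical helpers, plain arrays stand in for the label image, and a dict keyed by scale name stands in for the xr.DataTree.

```python
import numpy as np


def keep_label_ids(labels: np.ndarray, ids_to_keep: set[int]) -> np.ndarray:
    """Return a copy where every pixel whose id is not kept becomes 0 (background)."""
    keep = np.isin(labels, list(ids_to_keep))
    return np.where(keep, labels, 0)


def keep_label_ids_multiscale(
    scales: dict[str, np.ndarray], ids_to_keep: set[int]
) -> dict[str, np.ndarray]:
    """Apply the same filter to every scale (assumption: keys are scale names)."""
    return {name: keep_label_ids(image, ids_to_keep) for name, image in scales.items()}


# Toy label image: pixel values are the instance ids, 0 is background.
labels = np.array([[3, 1, 1], [2, 0, 3], [0, 3, 2]])
filtered = keep_label_ids(labels, {3})
remaining_ids = set(np.unique(filtered).tolist()) - {0}

# Toy multiscale stand-in: the second scale is a strided view of the first.
multiscale = {"scale0": labels, "scale1": labels[::2, ::2]}
filtered_ms = keep_label_ids_multiscale(multiscale, {3})
```

Because np.where builds a new array, the original label image is left untouched, which is the property the tests in this PR check for the real elements.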

Code excerpt from tests to demonstrate the usage:

sdata = concatenate(
    {
        "labels": blobs_annotating_element("blobs_labels"),
        "shapes": blobs_annotating_element("blobs_circles"),
        "points": blobs_annotating_element("blobs_points"),
        "multiscale_labels": blobs_annotating_element("blobs_multiscale_labels"),
    },
    concatenate_tables=True,
)
third_elems = sdata.tables["table"].obs["instance_id"] == 3
subset_sdata = subset_sdata_by_table_mask(sdata, "table", third_elems)

labels_remaining_ids = set(np.unique(subset_sdata.labels["blobs_labels-labels"].data.compute())) - {0}
assert labels_remaining_ids == {3}

for scale in subset_sdata.labels["blobs_multiscale_labels-multiscale_labels"]:
    ms_labels_remaining_ids = set(
        np.unique(subset_sdata.labels["blobs_multiscale_labels-multiscale_labels"][scale].image.compute())
    ) - {0}
    assert ms_labels_remaining_ids == {3}

points_remaining_ids = set(np.unique(subset_sdata.points["blobs_points-points"]["instance_id"].compute())) - {0}
assert points_remaining_ids == {3}

shapes_remaining_ids = set(np.unique(subset_sdata.shapes["blobs_circles-shapes"].index)) - {0}
assert shapes_remaining_ids == {3}
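As a rough illustration of what a boolean table mask like third_elems selects, the sketch below turns a mask over a toy obs table into the set of instance ids to keep per annotated element. This is a sketch under assumptions, not the code in this PR: the "region" column name and the ids_per_element variable are made up for the example, and a plain pandas DataFrame stands in for the table's obs.

```python
import pandas as pd

# Toy stand-in for table.obs: one row per annotated instance.
obs = pd.DataFrame(
    {
        "region": ["blobs_labels", "blobs_labels", "blobs_circles", "blobs_circles"],
        "instance_id": [2, 3, 2, 3],
    }
)

# Boolean mask over the table rows, as in the excerpt above.
mask = obs["instance_id"] == 3
kept = obs.loc[mask]

# Group the surviving rows by the element they annotate.
ids_per_element = {
    region: set(group["instance_id"].tolist())
    for region, group in kept.groupby("region")
}
```

Each element can then be filtered down to its own set of ids, whatever the element-specific filtering looks like.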

codecov · 2025-06-30T13:28:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.44%. Comparing base (3f01688) to head (b4901cb).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #946      +/-   ##
==========================================
+ Coverage   92.36%   92.44%   +0.08%     
==========================================
  Files          48       48              
  Lines        7416     7470      +54     
==========================================
+ Hits         6850     6906      +56     
+ Misses        566      564       -2

Files with missing lines	Coverage Δ
src/spatialdata/__init__.py	`96.42% <ø> (ø)`
src/spatialdata/_core/query/relational_query.py	`92.44% <100.00%> (+1.31%)`	⬆️

timtreis

Can this PR make use of match_sdata_to_table (spatialdata/src/spatialdata/_core/query/relational_query.py, line 787 in 7604a3d) and match_element_to_table (same file, line 757 in 7604a3d)?

src/spatialdata/_core/query/relational_query.py

timtreis · 2025-07-10T15:38:04Z

tests/core/query/test_relational_query_subset_sdata_by_table_mask.py

+from spatialdata.datasets import blobs_annotating_element
+
+
+def test_filter_labels2dmodel_by_instance_ids():


Parametrise, don't loop over inputs

selmanozleyen: The loop here is over the multiple scales. Or did you mean something else?

timtreis · 2025-07-10T15:38:18Z

tests/core/query/test_relational_query_subset_sdata_by_table_mask.py

+        preserved_ids = np.unique(labels_element[scale].image.compute())
+        assert filtered_ids == (set(all_instance_ids) - {2, 3})
+        # check if there is modification of the original labels
+        assert set(preserved_ids) == set(all_instance_ids) | {0}


Use np.testing instead

selmanozleyen: But these are all simple set comparisons. I would have to sort and then compare the arrays, which seems convoluted. If you still prefer np.testing, I can do it.
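For reference, a small sketch of the two assertion styles discussed in this thread, using a made-up label array rather than the PR's test data. np.unique returns its result sorted, so the array-based comparison needs no explicit ordering step as long as the expected ids are listed in ascending order.

```python
import numpy as np

# Made-up filtered label image; 0 is background.
filtered = np.array([[0, 1, 1], [4, 0, 5]])
unique_ids = np.unique(filtered)

# Style 1: set comparison, independent of order.
assert set(unique_ids.tolist()) - {0} == {1, 4, 5}

# Style 2: np.testing, relying on np.unique returning sorted values.
np.testing.assert_array_equal(unique_ids, np.array([0, 1, 4, 5]))
```

The np.testing variant prints the mismatching arrays on failure, while the set variant tolerates any ordering of the expected ids.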

timtreis · 2025-07-10T15:39:55Z

src/spatialdata/_core/query/relational_query.py

+
+@_filter_by_instance_ids.register(DataArray)
+def _(element: DataArray, ids_to_remove: list[int], instance_key: str) -> DataArray:
+    del instance_key


selmanozleyen: You mean why del instance_key? It is there to make explicit that this dispatch does not use it.
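The pattern in the diff above can be sketched with functools.singledispatch from the standard library. This is an illustrative stand-in, not the PR's code: filter_by_instance_ids is a hypothetical name, and a dict and a list replace the real table-like and label-like element types.

```python
from functools import singledispatch


@singledispatch
def filter_by_instance_ids(element, ids_to_remove: list[int], instance_key: str):
    # Fallback for element types without a registered implementation.
    raise NotImplementedError(f"Unsupported element type: {type(element).__name__}")


@filter_by_instance_ids.register(dict)
def _(element: dict, ids_to_remove: list[int], instance_key: str) -> dict:
    # Table-like stand-in (column name -> values): the key names the id column.
    keep = [value not in ids_to_remove for value in element[instance_key]]
    return {col: [v for v, k in zip(values, keep) if k] for col, values in element.items()}


@filter_by_instance_ids.register(list)
def _(element: list, ids_to_remove: list[int], instance_key: str) -> list:
    # Label-like stand-in: the values are the ids, so the key is not needed.
    del instance_key  # signals that the shared signature's argument is unused here
    return [0 if value in ids_to_remove else value for value in element]


table_like = {"instance_id": [1, 2, 3], "radius": [0.5, 1.0, 1.5]}
label_like = [0, 1, 2, 2, 3]
filtered_table = filter_by_instance_ids(table_like, [2, 3], "instance_id")
filtered_labels = filter_by_instance_ids(label_like, [2, 3], "instance_id")
```

All registered implementations share one signature so callers can pass the same arguments regardless of element type; deleting the unused argument documents that choice.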

selmanozleyen · 2025-07-10T16:04:38Z

@timtreis Regarding match_sdata_to_table and match_element_to_table: I noticed that for shapes the index is assumed to be the instance_id, but this doesn't match how the blobs are filled, so it would fail in the tests. It was documented that way, so I assumed it was intentional, but match_sdata_to_table wouldn't (and didn't) pass the current tests because it assumes the element index is the instance_id.
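The mismatch described here can be reproduced in miniature with pandas. The frame below is a made-up stand-in for a shapes element whose index does not coincide with its instance_id column; it is not the blobs dataset.

```python
import pandas as pd

# Made-up shapes-like frame: the index is positional, the ids live in a column.
shapes = pd.DataFrame(
    {"instance_id": [3, 1, 2], "radius": [0.5, 1.0, 1.5]},
    index=[0, 1, 2],
)
table_instance_ids = [3]

# Matching on the index finds nothing, because no row has index 3.
by_index = shapes.loc[shapes.index.isin(table_instance_ids)]

# Matching on the instance_id column finds the intended row.
by_column = shapes.loc[shapes["instance_id"].isin(table_instance_ids)]
```

Whether the index or a column carries the instance ids therefore has to be agreed on before a table can be matched to an element.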

selmanozleyen · 2025-07-14T13:48:52Z

@timtreis I now make use of match_element_to_table. Other than one comment, everything seems resolved.

selmanozleyen and others added 2 commits June 30, 2025 15:17

init

a28c7c9

[pre-commit.ci] auto fixes from pre-commit.com hooks

225d593

for more information, see https://pre-commit.ci

fix mypy linterrors

e549b4b

ilan-gold reviewed Jun 30, 2025

src/spatialdata/_core/query/masking.py (outdated, resolved)

selmanozleyen and others added 6 commits July 3, 2025 16:33

update the location and the design

2aad72b

[pre-commit.ci] auto fixes from pre-commit.com hooks

ef74057

for more information, see https://pre-commit.ci

update docs

d6e22cb

Merge branch 'feature/filter_operations_on_label' of https://github.com/selmanozleyen/spatialdata into feature/filter_operations_on_label

46c41db

make coverage 100/100 because why not

80d95a2

[pre-commit.ci] auto fixes from pre-commit.com hooks

4438605

for more information, see https://pre-commit.ci

selmanozleyen mentioned this pull request Jul 9, 2025

sq.pp.filter_cells for SpatialData scverse/squidpy#1011

Open

1 task

timtreis requested changes Jul 10, 2025

selmanozleyen added 2 commits July 10, 2025 17:48

fixed type annotation

4c927ee

dont compute eagerly use. delete other instance key for consistency

e9e0da2

update the tests and make sure we use match_element_to_table

7534c91

selmanozleyen requested a review from timtreis July 14, 2025 13:49

flying-sheep assigned selmanozleyen Aug 7, 2025

Merge branch 'main' into feature/filter_operations_on_label

b4901cb

selmanozleyen mentioned this pull request Aug 24, 2025

Some suggestions and proposals for annotations in SpatialData #975

Open
