DM-47475: Fix crash in some find_first queries in the new query system. #1115

Merged · 2 commits · Nov 11, 2024
1 change: 1 addition & 0 deletions doc/changes/DM-47475.bugfix.md
@@ -0,0 +1 @@
Fix a crash in the new Butler query system that occurred under some conditions when using the find-first option with multiple collections.
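For context, "find-first" resolution returns, for each data ID, the dataset from the earliest collection in the search path that contains one. A minimal pure-Python sketch of that semantic (illustrative only, not Butler code; the function and argument names are hypothetical):

```python
def find_first(collections, datasets_by_collection):
    """Resolve each data ID to the dataset from the first collection
    (in search order) that contains it.

    collections: ordered list of collection names.
    datasets_by_collection: dict mapping collection -> {data_id: dataset}.
    """
    resolved = {}
    for collection in collections:
        for data_id, dataset in datasets_by_collection.get(collection, {}).items():
            # setdefault keeps the first hit per data ID and ignores later ones.
            resolved.setdefault(data_id, dataset)
    return resolved
```

With duplicated data IDs across runs (as in the test added below), earlier collections shadow later ones.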
2 changes: 1 addition & 1 deletion python/lsst/daf/butler/_topology.py
@@ -76,7 +76,7 @@ class TopologicalFamily(ABC):
another's in a predefined way.

This hierarchy means that endpoints in the same family do not generally
- have to be have to be related using (e.g.) overlaps; instead, the regions
+ have to be related using (e.g.) overlaps; instead, the regions
from one "best" endpoint from each family are related to the best endpoint
from each other family in a query.

@@ -231,7 +231,7 @@ def into_from_builder(
self.select(postprocessing).cte() if cte else self.select(postprocessing).subquery()
)
return SqlJoinsBuilder(db=self.joins.db, from_clause=sql_from_clause).extract_columns(
- self.columns, special=self.joins.special.keys()
+ self.columns, postprocessing, special=self.joins.special.keys()
)
return self.joins
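The one-line change above passes `postprocessing` through to `extract_columns`, so columns that Python-side postprocessing needs later (e.g. region columns used for overlap filtering) are carried out of the subquery/CTE instead of being dropped. A rough sketch of the idea, with hypothetical class names rather than the real `daf_butler` internals:

```python
class Postprocessing:
    """Stand-in for post-query filtering that needs extra columns."""

    def __init__(self, required_columns):
        self.required_columns = set(required_columns)


class JoinsBuilder:
    """Stand-in for a builder that extracts columns out of a subquery."""

    def __init__(self, available):
        self.available = set(available)   # columns present in the subquery
        self.extracted = set()

    def extract_columns(self, columns, postprocessing=None):
        wanted = set(columns)
        if postprocessing is not None:
            # The fix: also extract the columns postprocessing will need,
            # otherwise referencing them later fails.
            wanted |= postprocessing.required_columns
        missing = wanted - self.available
        if missing:
            raise KeyError(f"columns not in subquery: {missing}")
        self.extracted = wanted
        return self
```

Before the fix, the equivalent of `wanted |= postprocessing.required_columns` never ran for materialized queries, so a later lookup of a postprocessing column crashed.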

37 changes: 37 additions & 0 deletions python/lsst/daf/butler/tests/butler_queries.py
@@ -1228,6 +1228,43 @@ def test_materialization(self) -> None:
[1, 2, 3],
)

def test_materialization_find_first(self) -> None:
"""Test querying for datasets with find_first against a materialized
query.
"""
butler = self.make_butler("ci_hsc-subset.yaml", "ci_hsc-subset-skymap.yaml")

run = "HSC/runs/ci_hsc/20240806T180642Z"
extra_run = "HSC/runs/ci_hsc/20240806T180642Z-extra"

# Find a few datasets to duplicate.
refs = butler.query_datasets("calexp", run, limit=3)
data_ids = [ref.dataId for ref in refs]

butler.collections.register(extra_run)
butler.registry.insertDatasets("calexp", data_ids, extra_run)

collections = [run, extra_run, "skymaps"]
with butler.query() as query:
query = query.join_dimensions(
[
"instrument",
"physical_filter",
"band",
"visit",
"detector",
"day_obs",
"skymap",
"tract",
]
)
query = query.join_dataset_search("skyMap", collections)
query = query.join_dataset_search("calexp", collections)
query = query.where({}, "instrument='HSC' AND skymap='discrete/ci_hsc'", bind=None)
m_query = query.materialize()
_ = list(m_query.datasets("skyMap", collections))
_ = list(m_query.datasets("calexp", collections))

def test_timespan_results(self) -> None:
"""Test returning dimension records that include timespans."""
butler = self.make_butler("base.yaml", "spatial.yaml")
45 changes: 45 additions & 0 deletions tests/data/registry/ci_hsc-subset-skymap.yaml
@@ -0,0 +1,45 @@
# Skymap definition from ci_hsc_gen3. For our tests we only need tracts;
# dimension records for patches are not included.
description: Butler Data Repository Export
version: 1.0.2
universe_version: 7
universe_namespace: daf_butler
data:
- type: dimension
element: skymap
records:
- name: discrete/ci_hsc
hash: !!binary |
Kt7swQ9e5zUif7fo1Onj3ezH+Ts=
tract_max: 1
patch_nx_max: 16
patch_ny_max: 16
- type: dimension
element: tract
records:
- skymap: discrete/ci_hsc
id: 0
region: !<lsst.sphgeom.ConvexPolygon>
encoded: 70cb48dd838019e83fe79b0e4ace0ae5bfcae4291ac61395bf4884794db329e93f68b2ba9327c2e3bfa603a921c61395bf0e047a6ce727e93fc46da7b1aac0e3bf8779dd59c921a03f513c3fa3b417e83f6ffa86675109e5bf06a12054c921a03f
- type: collection
collection_type: RUN
name: skymaps
host: null
timespan_begin: null
timespan_end: null
- type: dataset_type
name: skyMap
dimensions:
- skymap
storage_class: SkyMap
is_calibration: false
- type: dataset
dataset_type: skyMap
run: skymaps
records:
- dataset_id:
- !uuid '557e15c6-0529-4fc9-998b-b90a4750315e'
data_id:
- skymap: discrete/ci_hsc
path: skymaps/skyMap/skyMap_discrete_ci_hsc_skymaps.pickle
formatter: lsst.daf.butler.formatters.pickle.PickleFormatter