
Add image data representation for CNN backbones #892

Draft
sevmag wants to merge 1 commit into graphnet-team:main from sevmag:features/image-data-representation


@sevmag sevmag commented May 18, 2026

Summary

Second in the series of PRs being carved out of #813 (see the split tracking comment for the plan). This is the foundation for the CNN work — it adds an image-shaped data representation analogous to GraphDefinition, with no model code yet. PRs 3–4 (the actual CNN models) will build on this.

What's new

  • ImageRepresentation base class — analogue of GraphDefinition for image-shaped inputs. Builds per-channel image tensors from raw pulse data using a GridDefinition that maps each DOM to a voxel.
  • GridDefinition — abstract mapping from DOM string/dom_number to a 3D grid index. Two concrete subclasses:
    • IC86GridDefinition — IceCube IC86 array (main array + upper/lower DeepCore).
    • ExamplePrometheusGridDefinition — Prometheus example geometry.
      Mapping tables live in cnn_mapping_tables.py as Python literals (no parquet files).
  • IC86Image and ExamplePrometheusImage — concrete image representations bundling a detector and its grid definition.
  • New TEST_IMAGE_DIR / TEST_IC86*_IMAGE constants and .npy test fixtures for IC86 main array + DeepCore sub-arrays.
  • Unit tests for GridDefinition and ImageRepresentation (3 tests, all passing).
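To make the DOM-to-voxel idea above concrete, here is a minimal NumPy sketch. It is not the code from this PR: `TOY_GRID`, `TOY_GRID_SHAPE`, `build_image`, the pulse column layout, and the choice of channels (summed charge and earliest time) are all assumptions made for illustration only.

```python
import numpy as np

# Hypothetical mapping table: (string, dom_number) -> (i, j, k) voxel index.
# The real tables in this PR live in cnn_mapping_tables.py and are not shown here.
TOY_GRID = {
    (1, 1): (0, 0, 0),
    (1, 2): (0, 0, 1),
    (2, 1): (1, 0, 0),
    (2, 2): (1, 0, 1),
}
TOY_GRID_SHAPE = (2, 1, 2)


def build_image(pulses, grid, grid_shape):
    """Aggregate pulses into a (channels, X, Y, Z) image.

    `pulses` is assumed to have shape (n_pulses, 4) with columns
    string, dom_number, charge, time (an illustrative layout).
    Channel 0 holds the summed charge per DOM, channel 1 the earliest time.
    """
    image = np.zeros((2, *grid_shape), dtype=np.float32)
    seen = np.zeros(grid_shape, dtype=bool)
    for string, dom_number, charge, time in pulses:
        voxel = grid[(int(string), int(dom_number))]
        image[(0, *voxel)] += charge
        # Keep the earliest pulse time seen so far for this voxel.
        if not seen[voxel] or time < image[(1, *voxel)]:
            image[(1, *voxel)] = time
            seen[voxel] = True
    return image


pulses = np.array(
    [
        [1, 1, 0.5, 10.0],
        [1, 1, 1.5, 5.0],
        [2, 2, 2.0, 7.0],
    ]
)
image = build_image(pulses, TOY_GRID, TOY_GRID_SHAPE)
```

Voxels without any pulse stay at zero in both channels here; how empty DOMs are encoded in the actual representation is not stated in this description.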

What's not in this PR

  • No CNN model code — that's PR 3 (IceCubeDNN) and PR 4 (LCSC).
  • No training example — that's PR 5.

Notes for review

  • Almost everything is new files; only two existing files are touched (src/graphnet/constants.py, which gains 4 test-data path constants, and data_representation/__init__.py, which re-exports the new image classes).
  • The previous parquet-based mapping table approach was dropped in favour of inline Python literals (see cnn_mapping_tables.py).
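For reviewers who want a feel for what an inline mapping table can look like, the sketch below shows a literal dictionary plus a sanity check that needs no file I/O. The table contents, the names `TOY_MAPPING` and `validate_mapping`, and the checks themselves are hypothetical; the actual layout of cnn_mapping_tables.py may differ.

```python
from typing import Dict, Tuple

Key = Tuple[int, int]  # (string, dom_number)
Voxel = Tuple[int, int, int]  # (i, j, k) grid index

# Hypothetical inline table, written as a plain Python literal.
TOY_MAPPING: Dict[Key, Voxel] = {
    (1, 1): (0, 0, 0),
    (1, 2): (0, 0, 1),
    (2, 1): (1, 0, 0),
    (2, 2): (1, 0, 1),
}
TOY_GRID_SHAPE = (2, 1, 2)


def validate_mapping(mapping: Dict[Key, Voxel], grid_shape: Tuple[int, ...]) -> None:
    """Raise ValueError if a voxel is out of bounds or used by two DOMs."""
    seen: Dict[Voxel, Key] = {}
    for key, voxel in mapping.items():
        if len(voxel) != len(grid_shape):
            raise ValueError(f"{key}: expected {len(grid_shape)} indices, got {voxel}")
        if any(not 0 <= idx < size for idx, size in zip(voxel, grid_shape)):
            raise ValueError(f"{key}: voxel {voxel} outside grid {grid_shape}")
        if voxel in seen:
            raise ValueError(f"{key} and {seen[voxel]} both map to voxel {voxel}")
        seen[voxel] = key


validate_mapping(TOY_MAPPING, TOY_GRID_SHAPE)
```

A check of this kind could run in a unit test or at import time; whether the PR does either is not stated.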

Test plan

  • Pre-commit clean (black, flake8, docformatter, pydocstyle, mypy, EOL/whitespace)
  • pytest tests/models/test_grid_definition.py tests/models/test_image_representation.py — 3 passed
  • CI green
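As a rough illustration of how a `.npy` fixture can back a regression test, the sketch below saves a reference array to a temporary directory and compares a freshly built image against it. The helper `matches_fixture`, the file name, and the array contents are assumptions for illustration; the real tests in this PR may be structured differently.

```python
import tempfile
from pathlib import Path

import numpy as np


def matches_fixture(image, fixture_path, rtol=1e-6):
    """Return True if `image` has the same shape and values as the stored fixture."""
    expected = np.load(fixture_path)
    return bool(
        image.shape == expected.shape and np.allclose(image, expected, rtol=rtol)
    )


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical fixture; stands in for the real TEST_IC86*_IMAGE files.
    fixture = Path(tmp) / "toy_image.npy"
    reference = np.arange(8, dtype=np.float32).reshape(2, 2, 1, 2)
    np.save(fixture, reference)

    same = matches_fixture(reference.copy(), fixture)
    different = matches_fixture(reference + 1.0, fixture)
```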

Split from #813.

🤖 Generated with Claude Code

@sevmag sevmag mentioned this pull request May 18, 2026
@sevmag sevmag marked this pull request as draft May 18, 2026 22:37
Introduce a new `data_representation/images/` module that lets
detectors be represented as multi-channel 3D image tensors rather than
graphs, as required by CNN backbones.

Components:

- `ImageRepresentation` base class — analogue of `GraphDefinition`
  for image-shaped inputs. Builds per-channel image tensors from raw
  pulse data using a `GridDefinition` that maps each DOM to a voxel.
- `GridDefinition` — abstract mapping from DOM string/dom indices to a
  3D grid. `IC86GridDefinition` covers the IceCube IC86 array
  (main array + upper/lower DeepCore), `ExamplePrometheusGridDefinition`
  covers the Prometheus example geometry. Mapping tables live in
  `cnn_mapping_tables.py` as Python literals (no parquet files).
- `IC86Image` and `ExamplePrometheusImage` — concrete image
  representations bundling a detector and its grid definition.

Also adds:

- `TEST_IMAGE_DIR` and the three `TEST_IC86*_IMAGE` constants in
  `graphnet.constants` for the new test fixtures.
- `.npy` test fixtures for the IC86 main array and the two DeepCore
  sub-arrays.
- Unit tests for `GridDefinition` and `ImageRepresentation`.

No model code yet — that lands in follow-up PRs (`IceCubeDNN`, `LCSC`).
Split from graphnet-team#813.
@sevmag sevmag force-pushed the features/image-data-representation branch from fc40d44 to d97b394 Compare May 19, 2026 12:59