Define API for reader and image_mapper #143

TjarkMiener · 2024-09-12T15:48:48Z

This PR proposes APIs for the reader and image_mapper. It adopts the configuration system and component scheme of ctapipe and uses astropy tables for batch generation. Reading works in monoscopic and stereoscopic mode for DL1 images and R1 waveforms. The code correctly processes data with different array and telescope (divergent) pointings.

A subclass for the adv. trigger system, which processes R0 waveforms, will be added in a separate PR.

–––
Closes #31 #104

Mainly astropy table operations are now used to retrieve the example identifiers for the stereo reading mode. The code base is therefore heavily reduced and operations are more efficient. Moved parameter settings outside the dl1dh. The user can also request to read the dl1b parameters by passing a list of column names to batch_generation(). Removed the init skip when a pandas hdf5 with example identifiers is provided; it is no longer needed since astropy tables and their operations are fast and efficient. Split the transformation into sub-functions for better readability.

Everything related to the selection of the subarray is now done with ctapipe. Whenever a new file is processed, the consistency of its SubarrayDescription is checked against the reference, which is the first provided file; this ensures that all files have the same subarray.

review-notebook-app · 2024-09-12T15:48:54Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Pablitinho

First revision round 🙂

dl1_data_handler/image_mapper.py

Pablitinho · 2024-09-16T09:12:55Z

dl1_data_handler/image_mapper.py

-                    self.image_shapes[camtype][1] + self.default_pad * 4,
-                    self.image_shapes[camtype][2],
-                )
+    def _get_virtual_pixels(self, x_ticks, y_ticks, pix_x, pix_y):


General suggestion: add the meaning of the parameters to the docstrings. You can use tools to generate them automatically to speed up this tedious task. Nobody likes documentation 😂

I did not document the parameters of the internal functions of the ImageMapper class, since I thought it was not necessary; I only did so for the one (map_image()) that can be called from outside. If you insist, I can add them here as well ;-)

While you only need docstrings on the public functions for user documentation, the dev documentation will greatly benefit from having members documented. I'm with @Pablitinho on this. Just use copilot, you will be surprised how smart it can be ;)
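To make the request concrete, a numpydoc-style docstring for an internal helper could look like the sketch below. Only the name and the argument list (`x_ticks`, `y_ticks`, `pix_x`, `pix_y`) come from the diff; the function body, the behaviour it describes, and the standalone (non-method) form are assumptions made for illustration, not the code in this PR.

```python
import numpy as np


def get_virtual_pixels(x_ticks, y_ticks, pix_x, pix_y):
    """Return the grid points that are not occupied by a real pixel.

    NOTE: hypothetical stand-in for ``ImageMapper._get_virtual_pixels``.
    The body is an assumption that only gives the docstring something
    to describe.

    Parameters
    ----------
    x_ticks : list of float
        Unique x coordinates of the regular grid.
    y_ticks : list of float
        Unique y coordinates of the regular grid.
    pix_x : array-like of float
        x coordinates of the real camera pixels.
    pix_y : array-like of float
        y coordinates of the real camera pixels.

    Returns
    -------
    numpy.ndarray
        Array of shape ``(n_virtual, 2)`` holding the sorted ``(x, y)``
        positions of grid points without a real pixel.
    """
    grid = {(x, y) for x in x_ticks for y in y_ticks}
    real = set(zip(pix_x, pix_y))
    # reshape keeps the (n, 2) layout even when no virtual pixel exists
    return np.array(sorted(grid - real), dtype=float).reshape(-1, 2)
```

Documenting the return shape is the part that tends to help developers most, since it cannot be read off the signature.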

dl1_data_handler/image_mapper.py

dl1_data_handler/reader.py

removed magic numbers

Pablitinho

LGTM

mexanick

A few comments inline. It is quite difficult to comment on such a huge PR.
I suggest implementing a proper test suite instead of a test notebook in order to automate testing. Also, it would be great to see a class diagram for the refactored ImageMapper showing what functionality is inherited, what is overloaded and what is extended.

mexanick · 2024-09-24T08:51:19Z

dl1_data_handler/image_mapper.py

+        self.camera_type = self.geometry.name
+        self.n_pixels = self.geometry.n_pixels
+        # Rotate the pixel positions by the pixel to align
+        self.geometry.rotate(self.geometry.pix_rotation)


What is the purpose of this? If I understand correctly, pix_rotation is the angle by which every pixel is rotated, but not necessarily the camera, since for that there is cam_rotation. Judging from the comment above this line, perhaps cam_rotation should be used instead?

mexanick · 2024-09-24T08:55:30Z

dl1_data_handler/image_mapper.py

-                    self.image_shapes[camtype][1] + self.default_pad * 4,
-                    self.image_shapes[camtype][2],
-                )
+    def _get_virtual_pixels(self, x_ticks, y_ticks, pix_x, pix_y):


While you only need docstrings on the public functions for user documentation, the dev documentation will greatly benefit from having members documented. I'm with @Pablitinho on this. Just use copilot, you will be surprised how smart it can be ;)

mexanick · 2024-09-24T09:06:30Z

dl1_data_handler/image_mapper.py

+            self.geometry.pix_y.value, decimals=constants.decimal_precision
+        )
+
+        self.x_ticks = np.unique(self.pix_x).tolist()


Am I right that you assume regularly spaced pixels for any kind of camera geometry? If not, did you test any geometry with a shuffle step (e.g. square pixels where rows are shifted by, say, 25%)?
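The concern can be illustrated with a small NumPy sketch. It assumes, as the diff suggests, that ticks are obtained with `np.unique` on rounded pixel coordinates; the toy grids, the pitch, the rounding precision and the helper names are invented for illustration. On a regular square grid the number of x ticks equals the number of columns, whereas shifting every other row by 25% of the pitch doubles it, so half of the resulting grid points have no real pixel.

```python
import numpy as np

PITCH = 1.0    # hypothetical pixel pitch
DECIMALS = 3   # hypothetical rounding precision


def ticks(coords, decimals=DECIMALS):
    """Unique, rounded coordinates along one axis."""
    return np.unique(np.round(coords, decimals=decimals)).tolist()


def square_grid(n_rows, n_cols, row_shift=0.0):
    """Pixel centres of a square grid; odd rows are shifted in x by row_shift * PITCH."""
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    pix_x = cols * PITCH + (rows % 2) * row_shift * PITCH
    pix_y = rows * PITCH
    return pix_x.ravel(), pix_y.ravel()


regular_x, regular_y = square_grid(4, 4)
shifted_x, shifted_y = square_grid(4, 4, row_shift=0.25)

n_regular = len(ticks(regular_x))  # one x tick per column
n_shifted = len(ticks(shifted_x))  # two x ticks per column
```

With 4 rows and 4 columns, the shifted layout produces an 8 × 4 tick grid for only 16 real pixels.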

mexanick · 2024-09-24T09:10:00Z

dl1_data_handler/image_mapper.py

+            self.pix_x, self.x_ticks = self._smooth_ticks(self.pix_x, self.x_ticks)
+            self.pix_y, self.y_ticks = self._smooth_ticks(self.pix_y, self.y_ticks)
+
+        # At the edges of the cameras some mapping methods run into issues.


Is this because your "ticks" max out at the maxima of pix_x and pix_y and do not take into account the pixel's area (border)?

mexanick · 2024-09-24T09:11:46Z

dl1_data_handler/image_mapper.py

+    def _create_virtual_hex_pixels(
+        self, first_ticks, second_ticks, first_pos, second_pos
+    ):
+        """Create virtual hexagonal pixels outside of the camera."""


(even inline) will help

mexanick · 2024-09-24T09:22:27Z

dl1_data_handler/image_mapper.py

+            **kwargs,
+        )
+
+        if geometry.pix_type != PixelShape.HEXAGON:


Why can't one oversample a square pixel grid?
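As a point of reference for this question, oversampling a square grid is simple in isolation: each pixel is split into `factor` × `factor` sub-pixels that share its charge equally. The sketch below is a generic NumPy illustration with a hypothetical helper name. It is not the oversampling code of this PR, and it says nothing about why the PR restricts the method to hexagonal pixels.

```python
import numpy as np


def oversample_square(image, factor=2):
    """Split every pixel into factor x factor sub-pixels, conserving the total charge."""
    upsampled = np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)
    return upsampled / factor**2


image = np.array([[4.0, 8.0]])
fine = oversample_square(image, factor=2)
```

Dividing by `factor**2` keeps the summed charge of the image unchanged.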

mexanick · 2024-09-24T09:23:04Z

dl1_data_handler/image_mapper.py

+            **kwargs,
+        )
+
+        if geometry.pix_type != PixelShape.HEXAGON:


same question as above

mexanick · 2024-09-24T09:30:09Z

dl1_data_handler/reader.py

+    ----------
+    quality_query : TableQualityQuery
+        An instance of TableQualityQuery to apply quality criteria to the data.
+    files : OrderedDict


Why do you need an OrderedDict here? I don't see any use of its specific features in the code. On the other hand, since Python 3.6 a standard dict retains the original order of items.
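A short sketch of the point being made, using only the standard library; the file names are hypothetical. Insertion order was an implementation detail of CPython 3.6 and became a language guarantee in Python 3.7. `OrderedDict` still differs in ways that matter only if the code relies on them, such as order-sensitive equality and `move_to_end()`.

```python
from collections import OrderedDict

# Hypothetical input files, deliberately not in alphabetical order.
paths = ["gamma_run2.h5", "gamma_run1.h5", "proton_run1.h5"]

plain = {path: None for path in paths}
ordered = OrderedDict((path, None) for path in paths)

# Both containers iterate in insertion order.
plain_keys = list(plain)
ordered_keys = list(ordered)

# Equality is where they differ: plain dicts ignore order, OrderedDicts do not.
reversed_plain = {path: None for path in reversed(paths)}
reversed_ordered = OrderedDict((path, None) for path in reversed(paths))
plain_equal = plain == reversed_plain
ordered_equal = ordered == reversed_ordered
```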

mexanick · 2024-09-24T09:37:19Z

dl1_data_handler/reader.py

-                    }
-                )
+                def _multiplicity_cut_tel_type(table, key_colnames):
+                    self.min_telescopes_of_type.attach_subarray(self.subarray)


Bad design: here you modify objects that are out of scope for this function. Also, you don't use the key_colnames argument.

mexanick · 2024-09-24T09:38:10Z

dl1_data_handler/reader.py

+            events = events.group_by(["obs_id", "event_id"])
+
+            def _multiplicity_cut_subarray(table, key_colnames):
+                return len(table) >= self.min_telescopes


why do you need key_colnames?

TjarkMiener added 26 commits July 19, 2024 21:00

moved _get_image outside the dl1dh reader

1977662

For looping over a given dl1 table and a single dl1 event (charges, peak times and mask), we can now retrieve the 2D images (input of the CNNs) without initializing and running the dl1dh reader.

move _get_waveform() outside the dl1dh reader as well

144c5f5

also create separate function for the trigger patches on R0 data

removed flip for LST-1 real because it is not needed

4b2e0ef

also remove the apply IM functions because they can now be replaced by the internal .map_image() function

renamed get_mapped_trigger_patch to get_mapped_triggerpatch

c4888f6

remove trigger for stereo example description

e8e930f

remove prefix support for camera_frame

98821c0

If prefix camera_frame is in the file, the user should add this prefix to the config file.

rename and join parameter_selection and event_selection to quality_se…

fd5fc84

…lection

allow quality cuts for processing real data

3860d87

major refactoring for batch generation

f1a6fb1

Added batch generator. Removed dl1dh transform. dl1dh now provides a static batch, and we get the relevant information about the labels in the data loader of ctlearn.

removed redundant event and subarray info

679db24

edit path to pointing table

fc55628

now stored in dl0 monitoring tree

fix shape of trigger patch

a7755a3

last dimension of sample was missing

fix get trigger features

93a9c09

remove processor and transforms

b3ca9ed

replaced by new design

keep simulation info in an astropy table

8964328

This removes redundant code. astropy table operations should be used to retrieve sum(), min, max, etc.

removed v5.0.0 support for real data and images

f174575

define reading API

1e3a86a

It defines a reading API with two children for reading in mono and stereo mode. Then, the Image and Waveform children inherit from both child classes (mono and stereo). Finally, the trigger child only inherits from the mono child.
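The hierarchy described in this commit message can be sketched as follows. All class, method and column names here are hypothetical placeholders chosen for illustration; the actual names and method bodies in `dl1_data_handler/reader.py` may differ.

```python
class BaseReader:
    """Hypothetical base class that defines the reading API."""

    def __init__(self, mode="mono"):
        self.mode = mode

    def example_keys(self):
        """Columns that identify one example in the selected mode."""
        if self.mode == "mono":
            return self._mono_keys()
        return self._stereo_keys()


class MonoReader(BaseReader):
    """Child for monoscopic reading: one example per telescope event."""

    def _mono_keys(self):
        return ["obs_id", "event_id", "tel_id"]


class StereoReader(BaseReader):
    """Child for stereoscopic reading: one example per array event."""

    def _stereo_keys(self):
        return ["obs_id", "event_id"]


class ImageReader(MonoReader, StereoReader):
    """Reads DL1 images in either mode."""


class WaveformReader(MonoReader, StereoReader):
    """Reads R1 waveforms in either mode."""


class TriggerReader(MonoReader):
    """Reads R0 trigger patches; mono only."""

    def __init__(self):
        super().__init__(mode="mono")


# The method resolution order doubles as a minimal, text-only class diagram.
image_mro = [cls.__name__ for cls in ImageReader.__mro__]
```

Printing `image_mro` for each leaf class gives a quick view of what is inherited from where, which may serve until a proper class diagram exists.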

added classes into __all__

98862ef

pass batch to _get_features() and return feature dict and batch Table

c7f525e

This is somehow needed because we modify the batch Table for the Trigger subclass. Fix trigger subclass.

remove redundant flip

cdb4674

make image mapper methods as API

7bacf27

simplify map_image()

d32d503

make reader as API

0e33f24

polish docstrings

ce5ee63

TjarkMiener added enhancement ctapipe Compatibility with ctapipe ready for review labels Sep 12, 2024

TjarkMiener requested a review from maxnoe September 12, 2024 15:48

TjarkMiener requested review from mexanick, Pablitinho, kosack and nietootein September 12, 2024 15:48

TjarkMiener self-assigned this Sep 12, 2024

TjarkMiener requested a review from BastienLacave September 12, 2024 15:50

removed multiple locks

fbdcb80

Pablitinho reviewed Sep 16, 2024

TjarkMiener added 5 commits September 18, 2024 15:32

remove magic numbers by constants

ec97818

calculate class weights with floats

c17ba56

get default image shape from the data

4d30599

removed magic numbers

use f-string

a99bd9b

make process type an enum

cf12cd0

TjarkMiener requested a review from Pablitinho September 19, 2024 08:01

Pablitinho approved these changes Sep 19, 2024

TjarkMiener mentioned this pull request Sep 20, 2024

(Implementation) ParticleNet model ctlearn-project/ctlearn#207

Open

mexanick reviewed Sep 24, 2024
