
Conversation


@dpole dpole commented Jun 30, 2020

This PR implements the data layout proposed in #35.

Summary

  • an Observation contains multiple detectors
  • information about the detectors
    • becomes an attribute of the Observation, if it is shared by all the detectors
    • is otherwise stacked in an array, which becomes an attribute of the Observation. Notably, obs.tod is a 2D array with shape (n_det, n_sample)
  • Detector is not needed anymore
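As a rough sketch of this layout (attribute names other than obs.tod are made up for illustration):

```python
import numpy as np

n_det, n_sample = 3, 1000

class Observation:
    pass

obs = Observation()
# Per-detector, per-sample data is stacked into arrays, one row per detector:
obs.tod = np.zeros((n_det, n_sample))        # the time-ordered data
# Hypothetical per-detector metadata, stacked the same way:
obs.net_ukrts = np.array([50.0, 55.0, 48.0])
# Information shared by all detectors becomes a plain attribute:
obs.sampling_rate_hz = 19.0

print(obs.tod.shape)  # (3, 1000)
```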

Motivation

Stacking data that is likely to interact into regular data structures typically allows easier APIs (and sometimes faster computation). E.g., obs.tod.mean(0) computes the common mode; C.dot(obs.tod) applies the cross-talk matrix C using threaded, highly optimized math libraries like MKL.
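For instance (a self-contained sketch, with random data standing in for obs.tod):

```python
import numpy as np

n_det, n_sample = 4, 1000
rng = np.random.default_rng(0)
tod = rng.standard_normal((n_det, n_sample))  # stands in for obs.tod

# Common mode across detectors: a single vectorized reduction
common_mode = tod.mean(0)   # shape (n_sample,)

# Applying a cross-talk matrix is a single matmul, dispatched by NumPy
# to the underlying BLAS (MKL, OpenBLAS, ...)
C = np.eye(n_det) + 0.01 * rng.standard_normal((n_det, n_det))
mixed = C.dot(tod)          # shape (n_det, n_sample)
```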

Perspective

Prefer shared-memory parallelism over MPI parallelism: the former typically has no or minimal API overhead compared to serial code (and sometimes better performance than in-node MPI parallelization). From this perspective, it is better to have as much data as possible packed in a compact data structure.

Next steps (maps data)

In this perspective, operations on an Observation should avoid MPI parallelization, and therefore TOD-to-map accumulation should be a serial operation (maps are not distributed; every MPI rank holds an entire map). This should be possible, as LiteBIRD maps should take only a limited fraction of any foreseen node size.

@dpole dpole requested a review from ziotom78 June 30, 2020 15:08

dpole commented Jun 30, 2020

@ziotom78 I think I need feedback to make sure I'm not breaking your pieces of code, in addition to general feedback on the model.


@ziotom78 ziotom78 left a comment


Hi @dpole, sorry for the delay in reviewing this. There are a few problems in the code; I have tried to be as clear as possible in detailing what's probably causing them.

@mreineck
Collaborator

From your presentation at the telecon I got the impression that you expect TOD to be processed in chunks of several hours or so. Maybe it's not really a concern, but this is somewhat at odds with the way that TOD are simulated from 4pi convolved data cubes.

The 4pi convolver has the following aspects:

  1. it takes the sky and beam a_lm as input
  2. from those it generates the interpolation data cube (fairly expensive operation). The size of the data cube is 10-100 times the size of a sky map
  3. given a set of (theta, phi, psi) pointings, it computes the TOD values; this operation is quick and scales with the length of the TOD. This can be called repeatedly with different pointing arrays.

As you see, there is a fairly large starting overhead and memory cost associated with this kind of TOD generation. It will definitely not be economical to set up and destroy the interpolator object for a few hours' worth of TOD.
On the other hand, given the size of the data cubes, it will not be possible either to hold all data cubes for all detectors in the memory of a single node. So, it will not be possible to do a full focal plane simulation on a single node, no matter how finely the TOD is chunked.

@mreineck
Collaborator

Sorry for not raising this point earlier ... I only realized the intended simulation strategy when I heard "we can fit several hours of full focal plane TOD and hundreds of maps onto a single compute node".


dpole commented Jul 23, 2020

Hi Martin, no problem. The important thing is that we caught this point before making a decision. I think that your application isn't necessarily in conflict with the current setup.

Just to clarify one thing: nothing prevents a specific application from having observations with a single detector in the TOD. The constraint that this PR adds is:

  • if many detectors are in the same node, you can assume that they are organized in a 2d array
  • codes that involve communications between detectors can assume shared memory (off-node communications between detectors or time chunks will be rare)

Even if we decide not to go for this, we have to be aware that we are implicitly agreeing that any interaction between detectors has to pass through one (or more) MPI calls.

Of course, if all of our applications involve communications across the full mission time (like your case), it is better to assume that a processor holds the full TOD of the mission. However, I think that interactions between detectors' TODs will be frequent, and therefore it makes sense to strive for the data model I propose.

Accommodating cases that work over the full TOD requires a bit of a stretch. In my mind, TOD arrays are blocks of a block-cyclic distributed matrix that can be reshuffled at runtime (even though this should be avoided whenever possible).
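As a toy illustration of that idea (pure NumPy, no actual MPI; the block shapes are made up): the same logical (n_det, n_sample) matrix can be split into blocks along either axis, and "reshuffling" just means redistributing those blocks among ranks:

```python
import numpy as np

n_det, n_sample = 4, 8
full = np.arange(n_det * n_sample).reshape(n_det, n_sample)

# Detector/time-blocked distribution: a 2x2 grid of (2, 4) blocks,
# one per (hypothetical) MPI rank
det_time_blocks = [b for row in np.vsplit(full, 2) for b in np.hsplit(row, 2)]

# Reshuffled, time-major distribution: each rank now holds all
# detectors for a shorter time span
time_blocks = np.hsplit(full, 4)

assert det_time_blocks[0].shape == (2, 4)
assert time_blocks[0].shape == (4, 2)
```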

What do you think?


mreineck commented Aug 4, 2020

It's hard for me to judge how simulations will be carried out for LiteBIRD (e.g. how many detectors will be processed simultaneously, how long the chunks of simulated TOD are, etc.).
What I can say is the following: if we assume that the "source" of the TOD in a pipeline is the 4pi convolver, and if we ignore complicated MPI communications for a moment, then it will not be possible to process more than, say, 100 detectors on a single node, even if that node is quite big. There is no real way around this: if you want to be able to ask "give me the TOD of this detector in the time range from A to B", then memory of roughly 1GB per detector is required on the node to compute this information, independent of the time range.
It's of course always possible to generate the TOD and write them to disk in a separate run; then they can be loaded in chunks as needed. But I think this is not the desired application scenario, since the data volumes will be quite big and I/O can be a bottleneck.


dpole commented Aug 4, 2020

Ok, 1 GB/detector is the requirement to keep in mind. I think it is now clear why 4pi convolution prefers to have the entire TOD of a detector in node memory.

I think (and please, anybody, feel free to object) that this is not incompatible with this PR. The proposal is not to force chunking TODs in time such that you have all the detectors in a single processor. The proposal is rather: if you do have more detectors in local memory, their TODs should be aligned in memory.

As I said before, if all applications were like the 4pi convolution, it wouldn't be worth bothering with this requirement. However, my impression is that this is not the case: only a few applications require a full-mission TOD in memory and the distribution of detectors across nodes.

My understanding is that only beam studies require 4pi convolution. Most of the other studies can assume symmetric beams, so glitches, calibration drifts, crosstalks, etc. don't require extremely long TOD chunks in local memory. I believe that the best way to go is to implement this data model and provide utilities to resize the chunks along the time and detector axes.

This adds some complication, but I still think it is less complicated than the alternative (i.e., requiring the time axis to be a very long chunk, potentially a single one, and each line of the detector axis to be distributed), which implies that any user attempting operations that involve multiple detectors has to use an MPI interface.


mreineck commented Aug 4, 2020

Ok, 1GB/detector is the requirement to keep in mind.

To be more specific, the rough formula is
lmax**2 * (2*beam_mmax+1) * 60 bytes
for maximum multipole moment lmax and maximum azimuthal beam moment beam_mmax.

Depending on the required resolution, this can vary quite a bit, and if one doesn't need high resolution, fairly many convolution data cubes will fit onto a node.
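As a quick sanity check against the ~1 GB/detector figure quoted above (the parameter values here are illustrative, not official LiteBIRD numbers):

```python
def convolver_cube_bytes(lmax, beam_mmax):
    # Rough formula quoted above for the size of the interpolation data cube
    return lmax**2 * (2 * beam_mmax + 1) * 60

# Illustrative values: lmax = 1000, beam_mmax = 8
size = convolver_cube_bytes(lmax=1000, beam_mmax=8)
print(f"{size / 1e9:.2f} GB per detector")  # 1.02 GB per detector
```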

I agree that this scenario is not incompatible with the PR.


giuspugl commented Aug 5, 2020

My understanding is that only beam studies require 4pi convolution. Most of the other studies can assume symmetric beams, so glitches, calibration drifts, crosstalks, etc. don't require extremely long TOD chunks in local memory. I believe that the best way to go is to implement this data model and provide utilities to resize the chunks along the time and detector axes.

What if we want to simulate all the systematic effects in one single simulation, i.e. combine beam asymmetries, HWP imperfections, bandpass mismatches, cosmic rays, etc.? My understanding is that this is something very much needed for LiteBIRD... I would opt for a data layout which could accommodate this option too.


ziotom78 commented Aug 5, 2020

What if we want to simulate all the systematic effects in one single simulation, i.e. combine beam asymmetries, HWP imperfections, bandpass mismatches, cosmic rays, etc.? My understanding is that this is something very much needed for LiteBIRD... I would opt for a data layout which could accommodate this option too.

True, we have the task to provide the code to run end-to-end simulations, and this issue is going to be relevant in that context. However, there are other points to consider:

  1. I don't expect that complete end-to-end simulations are going to be run often; in fact, I expect that 99% of the users in LiteBIRD will use our framework just to run specific kinds of simulations. I remember running end-to-end simulations for Planck/LFI, and it was a nightmare to understand the results… too many effects, too complex outputs; it was always very difficult to pinpoint the cause of problems in the final data products, and very often I resorted to re-running simulations with only a few systematic effects turned on. This led to a huge amount of wasted CPU time. I think we should carefully plan how often we're going to run end-to-end simulations in LiteBIRD, and be sure not to run them too often: ideally, only once we have run separate simulations for each effect and we're reasonably sure that the results are what we expect can we think of starting an end-to-end run.

  2. The two approaches to data layout have advantages and disadvantages. The one proposed in this PR is perfect for correlated effects (thermal drifts, cosmic rays, …), while the one in the master branch is better for beam convolution. I don't think we're going to find the best data model that fits every situation, so we should aim for the one that permits doing most things adequately. The advantage of this PR is that it can be used similarly to the approach followed in the master branch (if you use a single-detector matrix of samples in each observation, you can easily store a full-length TOD in memory), but it enables other use cases.


dpole commented Aug 10, 2020

I've created a page about this data model on the new readthedocs website: https://litebird-sim.readthedocs.io/en/multi_det_obs/data_layout.html

It is meant to stay there for reference if and when this PR is merged.


dpole commented Aug 10, 2020

The main (only?) missing piece in this PR is how quantities different from the TOD are loaded into the observations. I probably need help from @ziotom78 or @paganol.
I've prototyped something like

for field in fields:
    field_arr = np.array([some_method(field, d) for d in detectors])
    setattr(self, field, field_arr)

in the Observation constructor; I'm not sure how compatible this pattern is with the rest of the framework (and some_method is of course still to be written).

@ziotom78
Member

The main (only?) missing piece in this PR is how quantities different from the TOD are loaded into the observations. I probably need help from @ziotom78 or @paganol.
I've prototyped something like

for field in fields:
    field_arr = np.array([some_method(field, d) for d in detectors])
    setattr(self, field, field_arr)

in the Observation constructor, I'm not sure about how compatible this pattern is with the rest of the framework (and some_method is of course to be written).

For pointings (PR #48) I had to create this kind of «field»: because the PR introduces pointing information in the form of quaternions, I just initialize self.bore2ecliptic_quats to None in the constructor Observation.__init__ and then set it once Observation.generate_pointing_information is called.

The biggest advantage of this approach over setattr is clarity: the field names are defined as soon as the object is created, so that once you create an instance of Observation like

obs = Observation(…)

you can discover the name of the fields by writing obs. and hitting TAB in Jupyter, but you do not allocate memory for stuff you do not need. Moreover, you can search for these fields in the source code of the class, because they are always initialized explicitly (e.g., just search for self.bore2ecliptic_quats in the code to find where the quaternions are generated/used).
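A minimal sketch of this pattern (the constructor signature and the body of generate_pointing_information are illustrative; only the attribute name self.bore2ecliptic_quats is taken from the comment above):

```python
import numpy as np

class Observation:
    def __init__(self, n_detectors, n_samples):
        self.tod = np.zeros((n_detectors, n_samples), dtype=np.float32)
        # Declared up front but not allocated: the field shows up in
        # tab-completion and in the source, yet costs no memory until set
        self.bore2ecliptic_quats = None

    def generate_pointing_information(self, quats):
        # Filled in only when pointings are actually needed
        self.bore2ecliptic_quats = np.asarray(quats)

obs = Observation(4, 100)
assert obs.bore2ecliptic_quats is None   # visible but unallocated
obs.generate_pointing_information([[0.0, 0.0, 0.0, 1.0]])
assert obs.bore2ecliptic_quats.shape == (1, 4)
```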


dpole commented Aug 10, 2020

I see. What you are saying actually simplifies the role of this PR quite a bit: the data layout becomes a convention that users should follow, not something enforced by the Observation API.

I think that we should wrap this Observation pattern/API in a dedicated issue, as it is an important choice that applies to any information that we store in the Observation. I presume it is better if you do it, as you probably have a better overall view, but I'm happy to try to do it myself, if you prefer.

@ziotom78
Member

I see. What you are saying actually simplifies quite a bit the role of this PR: the data layout becomes a convention that users should follow and not something enforced by the Observation API.

Yes, that's right, I plan to use the same approach for pointings in PR #48 once this one is merged.

I think that we should wrap this Observation pattern/API in a dedicated issue, as it is an important choice that applies to any information that we store in the Observation. I presume it is better if you do it, as you probably have a better overall view, but I'm happy to try to do it myself, if you prefer.

Well, the most important piece of information in the Observation class is the TOD; any other timestream (quaternions, pixel indices, etc.) should follow the TOD. I believe it's ok if we continue the discussion in this PR: it has received so much advertisement in the last weeks that many people in the ST are probably used to checking this page.

The main purpose of Simulation.create_observations was to
scatter observations across processes.
Now this operation is done within the Observation initialization,
and create_observation would become just a wrapper around it.
@dpole dpole requested a review from ziotom78 September 8, 2020 15:36
@dpole dpole dismissed ziotom78’s stale review September 8, 2020 15:38

It would be great to have a new review on the updated code


dpole commented Sep 8, 2020

All the key elements of this PR are in place. We have to address the conflicts with master, but I think that the priority is to review the proposed API changes, in particular those to the Observation class.

https://litebird-sim.readthedocs.io/en/multi_det_obs/observations.html#litebird_sim.observations.Observation

Happy to add clarifications and/or examples either here or in the doc wherever needed.

test_mpi is now in the base test folder. The reason is that the
MPI tests should behave correctly also in serial runs. Therefore,
they are executed with all the other tests by pytest.
self, detector, start_time, sampling_rate_hz, nsamples, use_mjd=False
self, detectors, n_samples,
start_time, sampling_rate_hz, use_mjd=False, dtype_tod=np.float32,
n_blocks_det=1, n_blocks_time=1, comm=None, root=0

Add possibility to not allocate tod

assert len(info) == len(self.tod)
setattr(self, name, info)

def get_times(self):

Check that works also if tods are not allocated


dpole commented Sep 22, 2020

I think that the new layout can be merged and used. There is still some stuff to tidy up (e.g., naming conventions, extracting some methods from Observation, and maybe enhancing the doc), but we've decided to keep working on it later.
