
SQW compression methodology to speed up I/O #1740

Open
Tracked by #1744
cmarooney-stfc opened this issue Sep 17, 2024 · 0 comments
cmarooney-stfc commented Sep 17, 2024

Currently SQW objects write and store the computed pixel information, i.e. qx, qy, qz, E, which amounts to $\mathcal{O}\left(n_{\text{run}} n_{\text{Efix}} n_{\text{det}} n_{\text{Ebin}} \right) \times 4$ data values.

Instead, it is possible to store only the $E_{\text{fix}}$, detector and $E_{\text{bin}}$ arrays together with the mapping of each pixel to the requisite indices in those arrays (already stored as det_id, en_id, run_id), and to compute qx, qy, qz, E on-the-fly (these expansions are available in e.g. calculate_qw_pixels2). This reduces the stored coordinate data to $\mathcal{O}\left(n_{\text{run}} (n_{\text{Efix}} + n_{\text{det}} + n_{\text{Ebin}} ) \right)$ (with unique-detector compression it may be smaller still; signal, error and the *_id columns will still need to be stored at $\mathcal{O}\left(n_{\text{run}} n_{\text{Efix}} n_{\text{det}} n_{\text{Ebin}} \right)$).
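The storage trade-off above can be illustrated with a minimal NumPy sketch. All names, sizes and the simplified kinematics (proportionality constants dropped) are illustrative only, not the Horace API or the actual calculate_qw_pixels2 implementation:

```python
import numpy as np

# Hypothetical sizes for a small dataset (illustrative, not real instrument data).
n_run, n_det, n_ebin = 2, 4, 5
rng = np.random.default_rng(0)

# Small per-run lookup tables: what the compressed file would store.
efix = np.array([25.0, 50.0])             # one fixed energy per run (meV)
det_dir = rng.normal(size=(n_det, 3))     # unit vectors towards each detector
det_dir /= np.linalg.norm(det_dir, axis=1, keepdims=True)
ebin = np.linspace(0.0, 20.0, n_ebin)     # energy-transfer bin centres

# Per-pixel index columns (the run_id, det_id, en_id already stored per pixel).
run_id, det_id, en_id = np.meshgrid(
    np.arange(n_run), np.arange(n_det), np.arange(n_ebin), indexing="ij")
run_id, det_id, en_id = (a.ravel() for a in (run_id, det_id, en_id))

# On-the-fly expansion: reconstruct (qx, qy, qz, E) from the lookup tables.
# k is proportional to sqrt(E); constants are omitted in this sketch.
e_transfer = ebin[en_id]
ki = np.sqrt(efix[run_id])
kf = np.sqrt(efix[run_id] - e_transfer)
q = ki[:, None] * np.array([1.0, 0.0, 0.0]) - kf[:, None] * det_dir[det_id]

# Stored floats: lookup tables vs. the full 4-column pixel coordinate block.
stored = efix.size + det_dir.size + ebin.size
full = 4 * n_run * n_det * n_ebin
print(stored, full)
```

Even at this toy scale the tables (19 floats) are far smaller than the expanded coordinates (160 floats), and the gap grows with the product of the array lengths.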

Beyond this, it should be possible to eliminate zero-count pixels (possibly following a low-pass filter stage) from the storage by dropping these detector results from the mapping. This could be done by either:

  • splitting the detectors into "signalling" and "non-signalling" groups and dumping them as separate blocks
  • deliberately ordering the non-signalling detectors to the bottom of the data arrays

It would then be possible to drop the signal and error components for these pixels and assume any "empty" detector signal is 0, which will further reduce the volume of stored data.
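The second option (ordering empty pixels to the bottom and truncating) can be sketched as follows; the column names and data are hypothetical, not the actual SQW pixel block layout:

```python
import numpy as np

# Hypothetical pixel table: signal/error plus an index column (illustrative).
rng = np.random.default_rng(1)
signal = rng.poisson(0.3, size=1000).astype(float)   # mostly zero counts
error = np.sqrt(signal)
det_id = rng.integers(0, 64, size=1000)

# Stable partition: signalling pixels first, non-signalling last, then
# truncate. Empty pixels are implied to have signal 0 and need no storage.
order = np.argsort(signal == 0, kind="stable")       # False (nonzero) sorts first
keep = int(np.count_nonzero(signal))
packed = {"signal": signal[order][:keep],
          "error": error[order][:keep],
          "det_id": det_id[order][:keep]}

print(len(packed["signal"]), "of", len(signal), "pixels stored")
```

With sparse count data the truncated block is a small fraction of the original table; the dropped pixels are recoverable exactly because their signal is zero by construction.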

Initial estimates for standard data suggest that these changes could yield savings of roughly 5× relative to the current storage format.
