-
Notifications
You must be signed in to change notification settings - Fork 2
Restructure workflow for streaming #176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
RunType, | ||
SampleRun, | ||
ScaledCountsDspacing, | ||
UncertaintyBroadcastMode, | ||
VanadiumRun, | ||
WavelengthMonitor, | ||
) | ||
|
||
|
||
def normalize_by_monitor_histogram( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this now be called scale_by_monitor_histogram
? And instead of RunNormalization
, should it be Scaling
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. But maybe we should wait until we have (separately from this) had a wider look at naming? I tried to not go beyond the scope of this restructure, where sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any actual changes in this file or did you only reformat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The domain types have changed in some places.
src/ess/powder/correction.py
Outdated
det_coord = ( | ||
detector.coords[dim] if dim in detector.coords else detector.bins.coords[dim] | ||
) | ||
if dim not in detector.coords: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you change this to always use the event coord even if there is a bin coord for binned data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/ess/powder/grouping.py
Outdated
def _reconstruct_wavelength( | ||
dspacing_bins: DspacingBins, two_theta_bins: TwoThetaBins | ||
) -> sc.Variable: | ||
dspacing = sc.midpoints(dspacing_bins) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try keeping this as bin-edges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@dataclass(frozen=True) | ||
class KeepEvents(Generic[RunType]): | ||
""" | ||
Flag indicating whether the workflow should keep all events when focussing data. | ||
|
||
If False, data will be histogrammed when focussing. | ||
""" | ||
|
||
value: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the benefit of using a dataclass here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool
cannot be subclassed, so I got complaints when trying to use sciline.Scope
.
Co-authored-by: Jan-Lukas Wynen <jan-lukas.wynen@ess.eu>
Co-authored-by: Jan-Lukas Wynen <jan-lukas.wynen@ess.eu>
Co-authored-by: Jan-Lukas Wynen <jan-lukas.wynen@ess.eu>
edge_dims = [ | ||
dim | ||
for dim, size in counts.sizes.items() | ||
if size + 1 == det_coord.sizes[dim] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes that det_coord
has the same dims as counts
. Is this guaranteed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
counts
was made from detector
so det_coord
should have either the same length or be one longer, which is what we check here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the number of dimensions. This code iterates over the sizes of counts
and assumes that each dim
exists in det_coord
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice yes, since each bin has a different wavelength, i.e., this is not a 1-D coord.
This change is necessary to allow for accumulation of detector and monitor counts in stream processing, while avoiding excessive or unbounded use of memory.
The main change is that we normalize to monitor after reducing the "pixel" dimension. Since this normalization may require wavelength, we do so in a 2-D$(2\theta,d)$ grid, with a unique wavelength for each grid cell.
Detailed changes:
Domain Type Changes
Renamed Types:
DataWithScatteringCoordinates
→CountsWavelength
DspacingData
→CountsDspacing
NormalizedRunData
→ScaledCountsDspacing
New Types:
ReducedCountsDspacing
: Added to represent reduced counts in d-spacing after partial reduction over pixel dimensionKeepEvents
: New dataclass to control whether events are kept during data focusing or immediately histogrammedFunction Changes
Modified Functions:
normalize_by_monitor_histogram
: Now takesReducedCountsDspacing
instead ofDataWithScatteringCoordinates
and returnsScaledCountsDspacing
normalize_by_monitor_integrated
: Similar type signature changes as abovenormalize_by_proton_charge
: Similar type signature changes as aboveapply_masks
: Now takesCountsWavelength
instead ofNormalizedRunData
New or Significantly Modified Functions:
focus_data_dspacing_and_two_theta
: Expanded to handle event preservation based onKeepEvents
flagintegrate_two_theta
: New function to integrate over the two-theta dimensiongroup_two_theta
: New function for grouping data by two-theta binsWorkflow Changes
Event Preservation Control: Added the ability to control whether events are kept during data focusing through the
KeepEvents
parameter. Vanadium events are not kept by default, the others are.