WIP, ENH: DXT plot threshold #802

tylerjereddy · 2022-09-03T21:58:10Z

* Related to #729 ("ENH: use threshold to avoid plotting DXT heatmaps in logs with DXT and HEATMAP data") and #692 ("ENH, CI: e3sm_io_heatmap_and_dxt.darshan memory usage causes CI error").
* Draft infrastructure for skipping the processing of DXT data (for Python HTML report generation) when the compressed size of the DXT modules exceeds a threshold, in cases where runtime `HEATMAP` data is available.
* Note that this is of no help for the vast majority of log files that have been provided as problematic in this regard, including the large ones from NERSC, because they have no `HEATMAP` data to fall back on.
* For a case where this does help, `time python -m darshan summary e3sm_io_heatmap_and_dxt.darshan` reports `real 0m12.415s` on this branch vs. `real 0m47.470s` on `main`.
* So that's not a bad improvement (roughly 3.8x faster), but there are still many things to decide/do here:

- [ ] Test the size threshold empirically with more appropriate logs.
- [ ] Decide whether to keep the current approach of summing the sizes of the DXT modules together or switch to per-DXT-module thresholds.
- [ ] Decide whether we also want a way to disable DXT handling even if `HEATMAP` is not available (otherwise, all the sample NERSC logs will use > 100 GB of memory and be unusable with the current report generation machinery).
- [ ] Add a warning mechanism/message somewhere on the report when the threshold is reached and DXT parsing is disabled.
- [ ] Add a command line argument to force an override of the disable (e.g., if the user is working on a high-memory node and really wants to see DXT results).
- [ ] Add regression tests for the new machinery.
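
For readers skimming the description, the decision being drafted can be sketched roughly as follows. This is an illustration only, not the code on this branch: the function name `should_skip_dxt`, the `module_sizes` mapping (module name to compressed size in bytes), and the threshold passed in are all assumptions made for the example.

```python
# Hypothetical sketch: the helper name, the input mapping, and the threshold
# value are illustrative and are not taken from the PyDarshan code base.
DXT_MODULES = ("DXT_POSIX", "DXT_MPIIO")


def should_skip_dxt(module_sizes, threshold_bytes):
    """Return True when DXT processing should be skipped.

    module_sizes maps a module name to its compressed size in bytes.
    The DXT sizes are summed, and DXT is only skipped when HEATMAP data
    is present to fall back on.
    """
    dxt_total = sum(module_sizes.get(name, 0) for name in DXT_MODULES)
    has_heatmap = "HEATMAP" in module_sizes
    return has_heatmap and dxt_total > threshold_bytes
```

With this shape, a log that has large DXT modules but no `HEATMAP` entry is still processed in full, which is the limitation noted above for the NERSC logs.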


tylerjereddy · 2022-09-07T19:21:45Z

Feedback from meeting:

* Phil is in favor of checking the threshold on a per-module basis rather than summing.
* Phil is also in favor of not displaying heatmaps when the threshold is exceeded even if runtime `HEATMAP` is absent, but we should have an option to force generation of heatmaps at your own peril.
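
A rough sketch of what that feedback could look like, under stated assumptions: the helper `dxt_modules_to_skip`, the `--force-dxt` flag name, and the `module_sizes` mapping (module name to compressed size in bytes) are hypothetical and only meant to make the per-module check and the override concrete.

```python
import argparse

# Hypothetical sketch: the helper, the flag name, and the input mapping are
# illustrative and are not part of the existing darshan command line.
DXT_MODULES = ("DXT_POSIX", "DXT_MPIIO")


def dxt_modules_to_skip(module_sizes, threshold_bytes, force=False):
    """List the DXT modules whose own compressed size exceeds the threshold.

    Each module is checked separately (no summing), and the presence of
    HEATMAP data is not required. Passing force=True skips nothing.
    """
    if force:
        return []
    return [
        name for name in DXT_MODULES
        if module_sizes.get(name, 0) > threshold_bytes
    ]


def build_parser():
    """Build a toy parser with an opt-in override flag (assumed name)."""
    parser = argparse.ArgumentParser(prog="summary-sketch")
    parser.add_argument("log_path")
    parser.add_argument(
        "--force-dxt",
        action="store_true",
        help="process DXT data even above the size threshold",
    )
    return parser
```

The parsed `force_dxt` value would be passed through as `force`, so the default stays safe on low-memory machines and the override is an explicit choice.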

tylerjereddy added the enhancement and pydarshan labels · Sep 3, 2022
