- related to "ENH: use threshold to avoid plotting DXT heatmaps in logs with DXT and HEATMAP data" #729 and "ENH, CI: e3sm_io_heatmap_and_dxt.darshan memory usage causes CI error" #692
- draft infrastructure for skipping the processing of DXT data (for Python HTML report generation) above a certain module compressed size threshold, in cases where runtime HEATMAP data is available
- note that for the vast majority of log files that have been provided/problematic in this regard, including the large ones from NERSC, this is of no help, because there is no HEATMAP data to fall back on
- for a case where this does help, on this branch:
time python -m darshan summary e3sm_io_heatmap_and_dxt.darshan
real 0m12.415s
vs. main:
real 0m47.470s
so, that's not a bad improvement (roughly 3.8x faster), but there are still many things to decide/do here:
- whether to sum the sizes of each DXT module together vs. having per-DXT module thresholds
- what to do if HEATMAP is not available (otherwise, all the sample NERSC logs will use > 100 GB memory and be unusable with current report generation machinery)
- whether to allow an override when the threshold is reached to disable DXT parsing (i.e., if the user is working on a high memory node and really wants to see DXT results)
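
For discussion, here is a minimal sketch of the kind of decision logic the open questions above point at. It is not the code on this branch: `should_skip_dxt`, `threshold_bytes`, `per_module`, and `force_dxt` are hypothetical names, and the compressed module sizes are assumed to arrive as a plain mapping of module name to size in bytes rather than through any particular darshan call. It keeps the draft behavior of only skipping DXT when HEATMAP data is present.

```python
from typing import Mapping

# Module names whose compressed sizes are compared against the threshold.
DXT_MODULES = ("DXT_POSIX", "DXT_MPIIO")


def should_skip_dxt(
    compressed_sizes: Mapping[str, int],
    threshold_bytes: int,
    per_module: bool = False,
    force_dxt: bool = False,
) -> bool:
    """Return True if DXT processing should be skipped for the report.

    All parameter names are hypothetical; this is a sketch only.
    """
    # Hypothetical user override, e.g. on a high memory node.
    if force_dxt:
        return False
    # Without HEATMAP data there is nothing to fall back on, so this
    # sketch keeps DXT processing (that case is still an open question).
    if "HEATMAP" not in compressed_sizes:
        return False
    dxt_sizes = [compressed_sizes.get(mod, 0) for mod in DXT_MODULES]
    if per_module:
        # Per-DXT-module threshold: skip if any one module is too large.
        return any(size > threshold_bytes for size in dxt_sizes)
    # Combined threshold: skip if the DXT modules together are too large.
    return sum(dxt_sizes) > threshold_bytes
```

With sizes of 600 bytes for each DXT module and a 1000-byte threshold, the combined check would skip DXT while the per-module check would not, which is the trade-off the first item above is about.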