andli28
Contributor

@andli28 andli28 commented Sep 6, 2025

Overview

Closes #168

Description of changes

This commit adds a 'Show raw data' checkbox to the Explore... widgets in the seismometer package.

When the checkbox is enabled, the underlying pandas.DataFrame used to produce the current visualization is displayed. The raw data output updates reactively when any widget controls (e.g., dropdowns, sliders, filters) change.

To achieve this, the following changes were made:

  • The UpdatePlotWidget in src/seismometer/controls/explore.py was updated to include the 'Show raw data' checkbox.
  • The ExplorationWidget in the same file was modified to handle the display of the raw data.
  • The plot functions in src/seismometer/api/plots.py and src/seismometer/api/explore.py were updated to return a tuple of (HTML, pd.DataFrame).
  • A new caching decorator disk_cached_html_and_df_segment was created to handle caching both the HTML and the DataFrame.
  • Tests in tests/controls/test_explore.py were updated to reflect these changes.
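For readers unfamiliar with the pattern, here is a minimal sketch of a disk cache for functions that return a pair such as (HTML, table). It is not the `disk_cached_html_and_df_segment` decorator from this PR; the names `cached_html_and_table`, `CACHE_DIR`, and `plot_scores` are hypothetical, and a list of dicts stands in for the `pd.DataFrame` (which pickles the same way). It assumes the function arguments have stable `repr` values.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location: a fresh temp directory per process.
# A real package would use its own configured cache directory.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="html_and_table_cache_"))


def cached_html_and_table(func):
    """Cache a function's (html, table) result on disk, keyed by its arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build the key from the function name and a repr of the arguments.
        raw_key = repr((func.__qualname__, args, sorted(kwargs.items())))
        key = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
        path = CACHE_DIR / f"{key}.pkl"
        if path.exists():
            # Cache hit: both halves of the pair come back together.
            with path.open("rb") as fh:
                return pickle.load(fh)
        html, table = func(*args, **kwargs)
        with path.open("wb") as fh:
            pickle.dump((html, table), fh)
        return html, table

    return wrapper


CALLS = {"count": 0}  # only here to show when the function body actually runs


@cached_html_and_table
def plot_scores(cohort, threshold=0.5):
    """Stand-in plot function that returns (html, table)."""
    CALLS["count"] += 1
    table = [{"cohort": cohort, "threshold": threshold, "n": 3}]
    html = f"<p>{cohort} at {threshold}</p>"
    return html, table
```

Calling `plot_scores("A", threshold=0.3)` twice runs the body once; the second call reads the pickled pair back from disk. Storing both values in one file keeps the HTML and its data from drifting apart, at the cost of pickle's usual caveat that cache files must come from a trusted location.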

Author Checklist

  • Linting passes; run early with pre-commit hook.
  • Tests added for new code and issue being fixed.
  • Added type annotations and full numpy-style docstrings for new methods.
  • Draft your news fragment in new changelog/ISSUE.TYPE.rst files; see changelog/README.md.

@CLAassistant

CLAassistant commented Sep 6, 2025

CLA assistant check
All committers have signed the CLA.

@andli28 andli28 marked this pull request as draft September 6, 2025 01:13
@andli28 andli28 marked this pull request as ready for review September 6, 2025 23:28
@andli28 andli28 force-pushed the feature/show-raw-data branch 2 times, most recently from 202ec68 to 9b69db1 Compare September 6, 2025 23:40
This commit adds a 'Show raw data' checkbox to the `Explore...` widgets in the seismometer package.

When the checkbox is enabled, the underlying pandas.DataFrame used to produce the current visualization is displayed. The raw data output updates reactively when any widget controls (e.g., dropdowns, sliders, filters) change.

To achieve this, the following changes were made:
- The `UpdatePlotWidget` in `src/seismometer/controls/explore.py` was updated to include the 'Show raw data' checkbox.
- The `ExplorationWidget` in the same file was modified to handle the display of the raw data.
- The plot functions in `src/seismometer/api/plots.py` and `src/seismometer/api/explore.py` were updated to return a tuple of (HTML, pd.DataFrame).
- The `@disk_cached_html_segment` decorator was removed from the modified plot functions to avoid caching issues with the new return type.
- Tests in `tests/controls/test_explore.py` were updated to reflect these changes.
@andli28 andli28 force-pushed the feature/show-raw-data branch from 9b69db1 to fa253f1 Compare September 6, 2025 23:42
@diehlbw
Collaborator

diehlbw commented Sep 16, 2025

First, so sorry for the delay in reviewing!

I'm struggling a little to understand the goal of this MR and the gap it is meant to close.
What is the scenario where you'd want to check the "Show raw data" box?

A couple of the questions I'm trying to resolve:

  • is raw data displaying the correct information?
    • If you look at ExploreCohortEvaluation, you'll actually see a large frame of various metrics per threshold. Is this the correct "raw" data?
  • is it in a useful form?
    • If you look at ExploreModelEvaluation in the example dataset, it attempts to display a 99340x39 dataframe, with most rows and columns hidden. Did the displayed data (potentially PHI) answer the original question, or was the relevant part suppressed in the hidden rows and columns? A number of these columns may be untouched by the visual itself.
  • Should this option be a default for all explore controls, or is more targeted usage (and perhaps easy templated extension) more appropriate?
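
To make the "useful form" question concrete, here is a minimal sketch (not seismometer code) of trimming a frame to the columns a visual actually reads and capping the row count before display. The helper `preview_for_visual` and all column names are hypothetical.

```python
import pandas as pd


def preview_for_visual(df, used_columns, max_rows=20):
    """Return only the columns a visual used, capped at max_rows rows."""
    # Keep the requested order, skipping any names the frame lacks.
    present = [col for col in used_columns if col in df.columns]
    return df.loc[:, present].head(max_rows)


# Hypothetical wide frame standing in for the data behind a plot.
frame = pd.DataFrame(
    {
        "score": [0.1, 0.8, 0.4, 0.9],
        "outcome": [0, 1, 0, 1],
        "patient_id": ["a", "b", "c", "d"],  # identifier the plot never reads
        "zip_code": ["1", "2", "3", "4"],  # also unused by the plot
    }
)

preview = preview_for_visual(frame, ["score", "outcome"], max_rows=3)
```

Here `preview` holds three rows of `score` and `outcome` only, so identifier columns the plot never touched are not rendered.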

Alternate paths depending on the needs:

  • I'm wondering if something more like the "show code" output would be useful? The idea would be to return the data object so that it can be manipulated directly.
  • Would more tabular-focused controls be helpful? For example, expanding on ExploreAnalyticsTable (potentially broken in my naive rebuild of your change) and/or the Fairness analysis.
  • Is #149 ("Add info/debug logging to track how predictions/events data are changed") helping close this gap with its debug log on transformation and filtering? Could it be extended for your needs?

Development

Successfully merging this pull request may close these issues.

Add raw data checkbox option to Explore widgets to expose underlying data
3 participants