Recommendations for including data in WASM notebook #3194

Closed
@gabrielgrant

Description

Documentation is

  • Missing
  • Outdated
  • Confusing
  • Not sure?

Explain in Detail

The most common pattern in my notebooks is to read a file (CSV or JSON) into a pandas DataFrame and then do some manipulations. When the notebook is exported to WASM, this fails with a FileNotFoundError:

Traceback (most recent call last):
  File "/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 157, in execute_cell
    exec(cell.body, glbls)
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 2, in <module>
    mi_df = pd.read_json(e2e_comparison_dir + 'mi.json', orient="records", lines=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 791, in read_json
    json_reader = JsonReader(
                  ^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 904, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 960, in _get_data_from_filepath
    raise FileNotFoundError(f"File {filepath_or_buffer} does not exist")
FileNotFoundError: File mi.json does not exist

It's not so surprising that marimo has no notion of which external files the notebook depends on. Is there a recommended way to include data alongside the notebook when publishing? This seems like a pretty common need, but I don't see any mention of it in the export docs: https://docs.marimo.io/guides/exporting.html#export-to-wasm-powered-html
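For small datasets, one possible workaround is to embed the records in the notebook source itself so that nothing has to be read from disk in the browser. The following is only a sketch under that assumption, not a marimo-recommended approach; `MI_JSONL`, `parse_json_lines`, and the sample records are hypothetical stand-ins for the real contents of mi.json.

```python
import json

# Hypothetical sample records standing in for the real contents of mi.json.
MI_JSONL = """\
{"food": "apple", "kcal": 52}
{"food": "banana", "kcal": 89}
"""


def parse_json_lines(text):
    """Parse a JSON Lines string (one JSON object per line) into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_json_lines(MI_JSONL)
# With pandas available, pd.DataFrame(records) should give a frame comparable
# to the one pd.read_json(..., orient="records", lines=True) builds from a file.
```

This only scales to data small enough to live comfortably in the notebook file; larger files would still need some hosting or bundling mechanism.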

Your Suggestion for Changes

It would be amazing if this just worked out of the box (by watching for opened files and auto-including them as dependencies, I guess?), but even just having a recommended way to do this (with some extra work) would be nice. Maybe there's something I should be doing with marimo's built-in caching, for instance?
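To make the "watch for opened files" idea concrete, here is a minimal sketch of how file reads could be observed in plain CPython using `sys.addaudithook`. This is not how marimo works, and whether it behaves the same under Pyodide is an assumption I have not verified; `opened_files` and `_record_open` are hypothetical names. Note that audit hooks cannot be removed once installed.

```python
import os
import sys
import tempfile
from pathlib import Path

# Paths of files opened for reading since the hook was installed.
opened_files = set()


def _record_open(event, args):
    # The "open" audit event carries (path, mode, flags).
    if event != "open":
        return
    path, mode = args[0], args[1]
    # Ignore file descriptors and os.open calls (mode is None there).
    if isinstance(path, str) and isinstance(mode, str) and "r" in mode:
        opened_files.add(path)


sys.addaudithook(_record_open)

# Usage: read a temporary stand-in for mi.json and check it was recorded.
with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "mi.json"
    data_path.write_text('{"a": 1}\n')  # opened for writing, so not recorded
    with open(data_path) as f:
        f.read()
    recorded = os.fspath(data_path) in opened_files
```

An exporter could, in principle, run the notebook once with such a hook and bundle whatever ends up in `opened_files`, though that would miss files opened only on some code paths.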

I did try wrapping the read in a simple persistent cache:

with mo.persistent_cache(name="my_cache"):
    mi_df = pd.read_json('mi.json', orient="records", lines=True)

But it seems the cache isn't included in the WASM export assets, so the cell still fails trying to open the file (the same underlying error as without the cache):

marimo._save.cache.CacheException: Failure during save.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 157, in execute_cell
    exec(cell.body, glbls)
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 1, in <module>
    with mo.persistent_cache(name="e2e_comparison_cache"):
  File "/lib/python3.12/site-packages/marimo/_save/save.py", line 500, in __exit__
    raise instance from CacheException("Failure during save.")
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 3, in <module>
    mi_df = pd.read_json(e2e_comparison_dir + 'mi.json', orient="records", lines=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 791, in read_json
    json_reader = JsonReader(
                  ^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 904, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 960, in _get_data_from_filepath
    raise FileNotFoundError(f"File {filepath_or_buffer} does not exist")
FileNotFoundError: File mi.json does not exist
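The behavior above is what one would expect from any disk-backed cache whose files are not shipped with the export: every run in the browser is a cache miss, so the wrapped code still executes and still hits the missing file. The sketch below illustrates that with a simple pickle-based cache; it is an illustration only, not marimo's implementation, and `load_with_disk_cache` is a hypothetical helper.

```python
import pickle
import tempfile
from pathlib import Path


def load_with_disk_cache(cache_file, loader):
    """Return the pickled value in cache_file if present, else run loader and save it."""
    cache_file = Path(cache_file)
    if cache_file.is_file():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    # Cache miss: the loader runs, so a missing data file still raises here.
    value = loader()
    cache_file.write_bytes(pickle.dumps(value))
    return value


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache.pkl"  # not shipped, so it does not exist
    data_file = Path(tmp) / "mi.json"     # never created, like the missing file
    try:
        load_with_disk_cache(cache_file, data_file.read_text)
        outcome = "loaded"
    except FileNotFoundError:
        outcome = "FileNotFoundError"
    cache_written = cache_file.exists()
```

A cache could only help here if its stored entries were exported together with the notebook, which is the part that appears to be missing.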

Metadata

Labels: documentation (Improvements or additions to documentation)
