Description
Documentation is
- Missing
- Outdated
- Confusing
- Not sure?
Explain in Detail
The most common pattern in my notebooks is to read a file (CSV or JSON) into a pandas DataFrame and then do some manipulations. After exporting to WASM, this fails with a FileNotFoundError:
Traceback (most recent call last):
  File "/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 157, in execute_cell
    exec(cell.body, glbls)
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 2, in <module>
    mi_df = pd.read_json(e2e_comparison_dir + 'mi.json', orient="records", lines=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 791, in read_json
    json_reader = JsonReader(
                  ^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 904, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 960, in _get_data_from_filepath
    raise FileNotFoundError(f"File {filepath_or_buffer} does not exist")
FileNotFoundError: File mi.json does not exist
It's not surprising that marimo has no way of knowing which external files I depend on. Is there a recommended way to include data alongside the notebook when publishing? This seems like a pretty common need, but I don't see any mention of it in the export docs: https://docs.marimo.io/guides/exporting.html#export-to-wasm-powered-html
Your Suggestion for Changes
It would be amazing if this just worked out of the box (by watching for opened files and auto-including them as dependencies, I guess?), but just having a recommended way to do this (even with extra work) would be nice. Maybe there's something I should be doing with marimo's built-in caching, for instance?
I did try wrapping the read in a simple persistent cache:
with mo.persistent_cache(name="my_cache"):
    mi_df = pd.read_json('mi.json', orient="records", lines=True)
But it seems the cache doesn't get included in the WASM export assets, so it still fails trying to open the file (the same underlying FileNotFoundError as without the cache, now surfaced through a CacheException):
marimo._save.cache.CacheException: Failure during save.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 157, in execute_cell
    exec(cell.body, glbls)
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 1, in <module>
    with mo.persistent_cache(name="e2e_comparison_cache"):
  File "/lib/python3.12/site-packages/marimo/_save/save.py", line 500, in __exit__
    raise instance from CacheException("Failure during save.")
  Cell marimo:///home/gabriel/repos/rxfood/data-notebooks/gabriel/nutrient_estimation_evals/nutrient_generation_manual_eval.py#cell=cell-3, line 3, in <module>
    mi_df = pd.read_json(e2e_comparison_dir + 'mi.json', orient="records", lines=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 791, in read_json
    json_reader = JsonReader(
                  ^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 904, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.12/site-packages/pandas/io/json/_json.py", line 960, in _get_data_from_filepath
    raise FileNotFoundError(f"File {filepath_or_buffer} does not exist")
FileNotFoundError: File mi.json does not exist
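In the meantime, for small files, one stopgap I can think of is to paste the data into the notebook itself so there is nothing to look up on disk. This is only a minimal sketch, assuming the file is JSON Lines like mi.json. The sample records, MI_JSONL, and load_records are made up for illustration and are not part of marimo or pandas:

```python
import io
import json

# Hypothetical stand-in for the contents of mi.json (JSON Lines: one record per line).
# The real data would be pasted here instead of being read from disk.
MI_JSONL = """\
{"food": "apple", "kcal": 52}
{"food": "banana", "kcal": 89}
"""


def load_records(jsonl_text: str) -> list[dict]:
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in io.StringIO(jsonl_text) if line.strip()]


records = load_records(MI_JSONL)
print(records)

# With pandas available, the same in-memory text can be passed to read_json
# through a file-like buffer, so no path on disk is involved:
#   mi_df = pd.read_json(io.StringIO(MI_JSONL), orient="records", lines=True)
```

This obviously doesn't scale past small datasets, which is why a documented way to ship data files with the export would still help.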