nbdev_test parallel() process re-use results in hard to debug behaviour #1287

Open
@xl0

To reproduce, run nbdev_test --n_workers 1 --file_re "0[07].*" --do_print in the fastai repo (after installing the dev dependencies).
The error I'm getting:

xl0@vespa:~/work/code/fastai$ nbdev_test --n_workers 1  --file_re "0[70].*" --do_print
Starting /ssd/xl0/work/code/fastai/nbs/00_torch_core.ipynb
- Completed /ssd/xl0/work/code/fastai/nbs/00_torch_core.ipynb
Starting /ssd/xl0/work/code/fastai/nbs/07_vision.core.ipynb
AttributeError in /ssd/xl0/work/code/fastai/nbs/07_vision.core.ipynb:
===========================================================================

While Executing Cell #41:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 1
----> 1 test_fig_exists(ax)

File ~/mambaforge/envs/test/lib/python3.9/site-packages/fastcore/test.py:101, in test_fig_exists(ax)
     99 def test_fig_exists(ax):
    100     "Test there is a figure displayed in `ax`"
--> 101     assert ax and len(ax.figure.canvas.tostring_argb())

File ~/mambaforge/envs/test/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py:438, in FigureCanvasAgg.tostring_argb(self)
    431 def tostring_argb(self):
    432     """
    433     Get the image as ARGB `bytes`.
    434 
    435     `draw` must be called at least once before this function will work and
    436     to update the renderer for any subsequent changes to the Figure.
    437     """
--> 438     return self.renderer.tostring_argb()

AttributeError: 'FigureCanvasAgg' object has no attribute 'renderer'

- Completed /ssd/xl0/work/code/fastai/nbs/07_vision.core.ipynb

nbdev Tests Failed On The Following Notebooks:
==================================================
        07_vision.core.ipynb
(test) xl0@vespa:~/work/code/fastai$ 

--file_re "0[07].*" selects notebooks 00 and 07 for testing.
With --n_workers 1, nbdev_test uses the fastcore ProcessPoolExecutor with a single worker, which means the same worker process is re-used between notebooks.

I don't completely understand the source of the error: both notebooks 00 and 07 use test_fig_exists(ax), but only 07 fails. I'm pretty certain it comes from some internal state being shared between notebooks by the re-used process. No failure occurs if the notebooks are run separately, or with --n_workers 2, where each notebook gets a fresh worker process.

While the issue could possibly be fixed in fastai, I believe the behaviour is itself a bug: a test framework should not leak state between unrelated notebooks. It results in unexpected and frustrating failures. I ran into a similar issue earlier using JAX with nbdev: two notebooks initialized JAX with different parameters, but because the process is re-used, the second initialization in the same process is ignored.
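The kind of leak described above can be sketched outside nbdev with the standard library alone. This is a minimal illustration, not nbdev's or fastcore's actual code: `run_notebook`, `_STATE`, `run_shared` and `run_isolated` are hypothetical names, and the module-level dict stands in for any library whose first initialization wins (as in the JAX case). It assumes a POSIX system, since it explicitly requests the "fork" start method.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor

# Hypothetical module-level state, standing in for a library's global config.
_STATE = {"initialized_by": None}


def run_notebook(name):
    """Hypothetical stand-in for executing one notebook inside a worker."""
    if _STATE["initialized_by"] is None:
        # Only the first initialization in a given process takes effect.
        _STATE["initialized_by"] = name
    return name, _STATE["initialized_by"], os.getpid()


def run_shared(names):
    """One worker process re-used for every task (like --n_workers 1)."""
    ctx = mp.get_context("fork")  # assumption: POSIX only
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as ex:
        return list(ex.map(run_notebook, names))


def run_isolated(names):
    """A fresh pool, and therefore a fresh process, for every task."""
    ctx = mp.get_context("fork")  # assumption: POSIX only
    results = []
    for name in names:
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as ex:
            results.append(ex.submit(run_notebook, name).result())
    return results


if __name__ == "__main__":
    # Shared worker: "07" sees the state left behind by "00".
    print(run_shared(["00", "07"]))
    # Isolated workers: each task starts from clean state.
    print(run_isolated(["00", "07"]))
```

In the shared case, the second task reports the same PID as the first and finds the state already set by "00". In the isolated case, each task initializes its own state. Creating one pool per task is the simplest way to get isolation in this sketch, and it costs a process start-up for every task.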

Labels: bug (Something isn't working)