nbval and ipyparallel #119

Open
FabioLuporini opened this issue Jun 3, 2019 · 8 comments

@FabioLuporini

FabioLuporini commented Jun 3, 2019

Hi, I wonder whether it's possible to use nbval for a jupyter notebook that exploits ipyparallel in combination with MPI (mpi4py).

This is the notebook I'm talking about. It's nothing special -- you can stop reading at cell 2

  1. I'm seeing failures when running with nbval. I'm not sure yet whether the fault is mine; I still need to investigate properly (the error trace is here, starting at around line 300), so you can probably ignore this point for now.
  2. How should I use markers like #NBVAL_IGNORE_OUTPUT in combination with ipyparallel's %%px magic? Both are supposed to appear at the very top of a cell.

Thanks!

@takluyver
Member

I'm not so sure on the parallel stuff, but the marker comments for nbval can be anywhere in the cell. You can also use cell tags instead of comments: https://nbviewer.jupyter.org/github/computationalmodelling/nbval/blob/0.9.1/docs/source/index.ipynb#Using-tags-instead-of-comments
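To make the two options concrete, here is a minimal sketch that builds a notebook code cell as a plain dict in the nbformat v4 layout. The helper name `make_px_cell` is hypothetical, and the `%%px` arguments are simply the ones quoted later in this thread. In both variants `%%px` stays on the first line: one variant puts the nbval marker comment on the second line, the other uses the `nbval-ignore-output` cell tag instead.

```python
import json


def make_px_cell(body, use_tag=True):
    """Build a code cell dict (nbformat v4 layout) whose first line is %%px.

    Hypothetical helper for illustration; it is not part of nbval or ipyparallel.
    """
    lines = ["%%px --block --group-outputs=engine"]
    metadata = {}
    if use_tag:
        # Cell tag variant: nothing extra is needed in the cell source.
        metadata["tags"] = ["nbval-ignore-output"]
    else:
        # Comment variant: the marker sits below the cell magic.
        lines.append("#NBVAL_IGNORE_OUTPUT")
    lines.append(body)
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": metadata,
        "outputs": [],
        "source": "\n".join(lines),
    }


tagged = make_px_cell("u.data")
commented = make_px_cell("u.data", use_tag=False)
print(json.dumps(tagged, indent=1))
print(commented["source"])
```

The tag variant has the advantage that the cell source sent to the engines is left untouched.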

@FabioLuporini
Author

thanks. I'll try this and will keep digging. Gimme another couple of days before closing the issue alright? Maybe I can report more

@FabioLuporini
Author

I'm closing this for now. Thanks!

@FabioLuporini
Author

Sorry, I feel like I have to reopen this issue because I don't really know how to fix it

I keep seeing this kind of error from random cells:

Input:
%%px --block --group-outputs=engine
u.data[0, 1:-1, 1:-1] = 1.
u.data

Traceback:
Unexpected output fields from running code: {'stdout'}

Sometimes our CI is green, sometimes it's red because one random cell fails as shown above.
The traceback is always the same. This happens even in cells that are not supposed to print anything to stdout (e.g., cells that only change entries in a dictionary).

When exactly does nbval check the output of a cell? Is it possible that nbval performs the output check when one process has returned while the others have not yet, or something along these lines? I'm really at a loss. At this point, any information would be greatly appreciated.

@takluyver
Member

nbval checks the output when the cell has finished running. This usually means that the execute_reply message has been sent on the shell channel and an idle status message has been sent on the iopub channel. It doesn't know anything specific about ipyparallel - it sends that cell to the kernel, where ipyparallel processes the %%px cell magic and does whatever it needs to do with that.
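As a rough illustration of that rule, the sketch below walks a list of iopub messages and collects the outputs that belong to one execute request until the matching idle status arrives. It is a simplified model and not nbval's actual code: `collect_cell_outputs`, `make_msg`, the request id `req-1`, and the message texts are all made up for the example, while the message fields (`msg_type`, `parent_header`, `execution_state`) follow the Jupyter messaging protocol.

```python
# Message types that carry cell output on the iopub channel.
OUTPUT_TYPES = {"stream", "display_data", "execute_result", "error"}


def collect_cell_outputs(iopub_messages, msg_id):
    """Collect outputs for one execute request until the kernel reports idle."""
    outputs = []
    for msg in iopub_messages:
        if msg["parent_header"].get("msg_id") != msg_id:
            continue  # message belongs to a different request
        if msg["msg_type"] == "status":
            if msg["content"]["execution_state"] == "idle":
                break  # the cell counts as finished from here on
        elif msg["msg_type"] in OUTPUT_TYPES:
            outputs.append(msg)
    return outputs


def make_msg(msg_type, content, parent_id="req-1"):
    """Build a stripped-down message dict (hypothetical helper)."""
    return {
        "msg_type": msg_type,
        "content": content,
        "parent_header": {"msg_id": parent_id},
    }


messages = [
    make_msg("status", {"execution_state": "busy"}),
    make_msg("execute_input", {"code": "u.data"}),
    make_msg("stream", {"name": "stdout", "text": "placeholder text\n"}),
    make_msg("execute_result", {"data": {"text/plain": "placeholder repr"}}),
    make_msg("status", {"execution_state": "idle"}),
    make_msg("stream", {"name": "stdout", "text": "arrives after idle\n"}),
]

collected = collect_cell_outputs(messages, "req-1")
print([m["msg_type"] for m in collected])
```

In this model, anything the kernel process writes to stdout before the idle status is attributed to the cell, whichever code wrote it.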

I can't see any obvious reason why that cell would behave randomly. But I'm not super familiar with ipyparallel.

@FabioLuporini
Author

FabioLuporini commented Jun 18, 2019

I'm still investigating the issue.

After forking ipyparallel and nbval, I found that the (randomly) failing cell receives an unexpected message of type stream from the IPython kernel.

These are the messages received on iopub while processing the failing cell; the third one is the "unexpected" one.
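To show how a single extra stream message could lead to the traceback quoted earlier, here is a hedged sketch of a field-level comparison between stored and freshly produced outputs. It is an assumption about the general shape of such a check, not nbval's implementation; `output_fields` and `unexpected_fields` are hypothetical names, the output texts are placeholders, and the output dicts follow the nbformat v4 layout.

```python
def output_fields(outputs):
    """Return the set of field names present in a list of nbformat v4 outputs."""
    fields = set()
    for out in outputs:
        if out["output_type"] == "stream":
            fields.add(out["name"])  # "stdout" or "stderr"
        elif out["output_type"] in ("execute_result", "display_data"):
            fields.update(out["data"])  # MIME types such as "text/plain"
    return fields


def unexpected_fields(reference, observed):
    """Fields seen in the fresh run that the stored notebook does not have."""
    return output_fields(observed) - output_fields(reference)


# Stored outputs of the cell: only a result, nothing printed.
reference = [
    {"output_type": "execute_result", "data": {"text/plain": "placeholder repr"}},
]

# Fresh run: the same result plus one extra stream message on stdout.
observed = [
    {"output_type": "stream", "name": "stdout", "text": "placeholder text\n"},
    {"output_type": "execute_result", "data": {"text/plain": "placeholder repr"}},
]

extra = unexpected_fields(reference, observed)
print(extra)
```

Under this model the cell fails whenever anything at all reaches stdout during its execution, even if the cell's own code prints nothing.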

I have no idea why sometimes this bug appears and sometimes not.

I should add that the same cells always seem to cause the failure. What they have in common is that some custom __setitem__ is executed; see the 2nd message in the link above (note that u.data is not a numpy array, but rather a custom subclass).

Also, I can't reproduce this on my local machine (which makes debugging horribly painful); it only appears on our CI system (Azure Pipelines). I don't know whether there's a timing issue somehow.

EDIT: I wonder whether this might be relevant...

@takluyver
Member

That issue does look potentially relevant. "got unknown result" is a message from ipyparallel when it gets a reply to a message ID which is not in self.outstanding:

https://github.com/ipython/ipyparallel/blob/6.2.4/ipyparallel/client/client.py#L766
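A toy model of that bookkeeping may help to see why such a notice would show up as cell output. `ToyClient` below is a hypothetical stand-in, not ipyparallel's client: only the `outstanding` check is taken from the description above, and it is an assumption of this sketch that the notice is written with a plain `print`. A reply whose message ID is not tracked then produces text on stdout, which a notebook kernel would forward as a stream message.

```python
import contextlib
import io


class ToyClient:
    """Hypothetical stand-in that models only the `outstanding` bookkeeping."""

    def __init__(self):
        self.outstanding = set()  # message IDs we are still waiting for
        self.results = {}

    def submit(self, msg_id):
        self.outstanding.add(msg_id)

    def handle_reply(self, msg_id, result):
        if msg_id not in self.outstanding:
            # Assumption: the notice goes to stdout via print.
            print("got unknown result: %s" % msg_id)
            return
        self.outstanding.remove(msg_id)
        self.results[msg_id] = result


client = ToyClient()
client.submit("a")

# Capture stdout to see what a kernel would turn into a stream message.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    client.handle_reply("a", 1)  # expected reply: silent
    client.handle_reply("b", 2)  # never submitted: prints a notice

captured = buffer.getvalue()
print(repr(captured))
```

If the real client behaves like this, the open question in this thread becomes why a reply sometimes arrives for an ID that is not being tracked, which is consistent with a timing-dependent failure.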

@FabioLuporini
Author

yes I saw that. Just can't figure out why it sometimes appears, and sometimes not
