Testing structural similarity rather than string identity #174

Open
psychemedia opened this issue Oct 20, 2021 · 2 comments

Comments

@psychemedia

psychemedia commented Oct 20, 2021

In some cases, it may be that cells generate variable output of a consistent form, for example:

  • looping to produce N lines of printed output;
  • displaying a list containing M items;
  • displaying a dataframe (HTML table) with particular dimensions.

In such cases, it would be useful to be able to test for such structural similarity (nbval-check-linecount, nbval-check-list, nbval-check-table) even if the detail is different.
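As a rough illustration of what such checks might compare, here is a minimal sketch using only the Python standard library. None of this is nbval code: the function names `same_linecount`, `same_list_length` and `table_shape` are hypothetical, the list check assumes the output is a plain-text Python list repr, and the table check ignores nested tables and `colspan`.

```python
import ast
from html.parser import HTMLParser


def same_linecount(reference: str, actual: str) -> bool:
    """True when both stdout strings contain the same number of lines."""
    return len(reference.splitlines()) == len(actual.splitlines())


def same_list_length(reference: str, actual: str) -> bool:
    """True when both text/plain outputs are list reprs of equal length.

    Assumption: the outputs are Python literals such as "[1, 2, 3]".
    """
    return len(ast.literal_eval(reference)) == len(ast.literal_eval(actual))


class _TableShape(HTMLParser):
    """Count <tr> rows and the widest run of <td>/<th> cells in a row."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.cols = 0
        self._cells = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1
            self._cells = 0
        elif tag in ("td", "th"):
            self._cells += 1
            self.cols = max(self.cols, self._cells)


def table_shape(html: str) -> tuple:
    """Return (rows, columns) for the table markup in an HTML fragment."""
    parser = _TableShape()
    parser.feed(html)
    return parser.rows, parser.cols
```

Two dataframe outputs would then count as structurally similar when `table_shape(reference_html) == table_shape(actual_html)`, whatever the cell values are.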

@takluyver
Member

It's not really up to me, but my 2c is that this wouldn't be a good fit for nbval. I don't think the kind of comments & cell tags you specify in nbval are expressive enough to specify what you actually want. E.g. it's not at all obvious that 'check list' means 'check the list has exactly M items' rather than 'check that this produces a list'. And what counts as a list, anyway? HTML <ol> and <ul>? Plain text with - Bullet points? LaTeX?

Of course, you could make something where you write check code in comments:

# assert len(outputs.find('ul')[0]) == 7
list_days_of_week()

That might well be useful, but I'd say it's something quite different from nbval.
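To make the idea concrete, a tool along those lines could pull the `# assert ...` comment lines out of a cell's source and run them against the cell's outputs. The sketch below is an assumption about how that might look, not an existing tool: `run_comment_asserts` is a hypothetical name, `outputs` is whatever object the caller passes in (a plain list here, rather than the `outputs.find(...)` interface imagined above), and it uses `exec`, so it should only be pointed at trusted notebooks.

```python
def run_comment_asserts(source: str, outputs) -> int:
    """Run every '# assert ...' comment line found in a cell's source.

    Each assertion is executed with the name `outputs` bound to the
    object supplied by the caller. Returns the number of checks run and
    lets AssertionError propagate when a check fails.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("# assert "):
            # Drop the leading "# " and execute the remaining statement.
            exec(stripped[2:], {"outputs": outputs})
            count += 1
    return count


cell_source = "# assert len(outputs) == 7\nlist_days_of_week()"
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
checks_run = run_comment_asserts(cell_source, days)
```

Only the comment lines are executed; the cell's own code (`list_days_of_week()`) is left to the notebook runner.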

@psychemedia
Author

I've been sketching some ideas here: https://github.com/ouseful-PR/nbval/tree/table-test

Currently it contains a crude attempt at a dataframe comparison, a crude attempt at checking the linecount of stdout, a couple of CLI flag hacks to try to work around %%timeit cells (I did wonder whether hardwiring a regex template around that might be better), and a couple of extra tags that give a bit more variety in ignoring cell outputs from nbval's perspective (e.g. NB-VARIABLE-OUTPUT) and, more generally, add semantics (e.g. folium-map).
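For the %%timeit case, a regex template might look something like the sketch below. It is not taken from the branch linked above: the pattern assumes the usual IPython report line of the form "10.5 µs ± 226 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)", and `TIMEIT_RE`, `normalise_timeit` and the placeholder text are hypothetical names. Other report formats (for example, timings expressed in minutes) would need a wider pattern.

```python
import re

# Assumed shape of an IPython %%timeit report line; the units and the
# optional thousands separators are guesses at common variants.
_UNIT = r"(?:ns|µs|us|ms|s)"
TIMEIT_RE = re.compile(
    rf"[\d.]+ {_UNIT} ± [\d.]+ {_UNIT} per loop "
    r"\(mean ± std\. dev\. of \d+ runs?, [\d,]+ loops? each\)"
)


def normalise_timeit(text: str) -> str:
    """Replace any timeit report in `text` with a fixed placeholder."""
    return TIMEIT_RE.sub("<TIMEIT RESULT>", text)


reference = "10.5 µs ± 226 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)"
actual = "11.2 µs ± 301 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)"
outputs_match = normalise_timeit(reference) == normalise_timeit(actual)
```

Normalising both the stored and the freshly computed output before comparing them means the timings can vary from run to run while any other text in the cell output is still checked exactly.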

These are all just sketches around a particular use case (a set of educational material notebooks), hacked to work for that purpose rather than tested for general production use (which is beyond me). There are no tests either; my testing is informal, against particular notebooks I have available locally.
