Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test on MacOS and remove progress bar from example #159

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Yoshanuikabundi
Copy link
Contributor

@Yoshanuikabundi Yoshanuikabundi commented Nov 12, 2024

Fixes #133

Changes made in this Pull Request:

  • Adds MacOS back to examples testing matrix
  • Reduces amount of output sent during fit

The timeout that's being triggered is not the cell timeout (default about 30 minutes), but a hard coded one for the notebook kernel to transmit the cell's output to nbval. My hypothesis is that the fit takes serious time and the progress bar is constantly sending updates to the output, producing more output than can be processed in 5 seconds. Platform differences may reflect that the MacOS runners take more time (and therefore produce more progress bar updates), or something weird to do with how outputs are transmitted.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@Yoshanuikabundi
Copy link
Contributor Author

Oops didn't mean to reformat the workflow yaml :/ will fix once I know if this worked

@codecov-commenter
Copy link

codecov-commenter commented Nov 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.23%. Comparing base (98dd835) to head (8870d78).

Additional details and impacted files


fail-fast: false
matrix:
os: [macOS-12, ubuntu-latest]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macos-12 is being removed next month 😞

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is macos-latest appropriate?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think so, it's no longer intel but AmberTools doesn't work on macOS-13. Thanks so much for looking at this!

@Yoshanuikabundi
Copy link
Contributor Author

Hey look it seems to have worked! I've switched to macos-latest as Matt advises macos-12 is being removed, but the examples CI had already passed so I think I've correctly diagnosed and fixed the issue.

@Yoshanuikabundi Yoshanuikabundi marked this pull request as ready for review November 12, 2024 05:00
@Yoshanuikabundi
Copy link
Contributor Author

Ok looks like there are fun new errors to deal with. Not sure I have any insight on these ones, I've tried a few things and if this one doesn't work I'm out of ideas. If the current round of tests fail, I can revert to macos-12 and leave this for the future, or else I'm totally happy for you to hack on this branch :)

@lilyminium
Copy link
Collaborator

lilyminium commented Nov 12, 2024

Thanks so much for fixing the issue @Yoshanuikabundi, you are a wizard!!

RuntimeError: MPS backend out of memory (MPS allocated: 8.00 MB, other allocations: 16.00 KB, max allowed: 7.93 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

What an exotic new error I've never seen before, even when Mac examples was running, haha. It looks like this? actions/runner-images#9918 So not easily fixable with until macos-15 is out, and with macos-13 out of the ambertools picture, downgrading to macos-12 seems the only way for CI to pass (until macos-12 disappears.)

Edit: unless there's a way to tell torch not to use mps?

matrix:
os: [macOS-latest, ubuntu-latest]
python-version: ["3.11", "3.12"]
pydantic-version: ["2"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a bit of a nitpicker, but this could also go away (and below)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mac CI won't execute electric field output
4 participants