Upload tutorials output to a csv in 'artifacts' branch #1695
esantorella wants to merge 22 commits into main
Codecov Report
@@ Coverage Diff @@
## main #1695 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 169 169
Lines 14537 14537
=========================================
Hits 14537 14537
@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
saitcakmak left a comment
Thanks for putting this together. This lgtm as is, though I feel like it would be much easier to sift through the outputs if we were to append them to the same file rather than creating a new file each time. Do you think we could read the last file with pandas, append the new df and save that instead?
Yeah I think that's a good idea -- let me see if I can get it working. By the way, I'm working on a notebook to make it easier to visualize the data and provide examples for how to work with it: https://github.com/esantorella/botorch/blob/tutorials_analytics/notebooks/tutorials_performance_tracking.ipynb
The problem with the notebook is that with this setup, there's no branch that both contains the csv data and has up-to-date Python functionality. The notebook would be a lot easier to run if there were just one CSV and that could be grabbed from GitHub, so I like your idea a lot.
On second thought, I don't love the job of having a simple "concatenate csvs" line since that makes it hard to change the format of data in the future. I'm going to land this without the concatenation step, then in a subsequent PR have Pandas deal with merging data.
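For illustration only, here is a minimal sketch of how pandas could merge per-run CSVs whose format has changed between runs. It is not the code from this PR or the follow-up: the column names (`name`, `runtime`, `max_mem`), the `run` label, and the `merge_runs` helper are all hypothetical, and in-memory strings stand in for the real files.

```python
import io

import pandas as pd

# Hypothetical per-run outputs. The second run has an extra column,
# standing in for a later change to the data format.
run_1 = io.StringIO("name,runtime\ntutorial_a,12.5\ntutorial_b,30.0\n")
run_2 = io.StringIO("name,runtime,max_mem\ntutorial_a,11.0,512.0\n")


def merge_runs(sources):
    """Read each CSV, tag its rows with a run label, and stack them.

    pd.concat takes the union of the columns by default, so rows from
    files that lack a column get NaN there instead of raising an error.
    """
    frames = []
    for label, source in sources.items():
        frame = pd.read_csv(source)
        frame["run"] = label  # hypothetical label to tell runs apart
        frames.append(frame)
    return pd.concat(frames, ignore_index=True, sort=False)


merged = merge_runs({"run_1": run_1, "run_2": run_2})
print(merged)
```

Keeping one file per run and merging at read time means a new column only needs handling in the analysis code, whereas a plain line-wise concatenation would require every file to share one header.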
@esantorella merged this pull request in 6854751.
Summary:
## Motivation
#1695 and #1703 introduced logging of tutorials runtime and memory usage, and visualization of the results in [a notebook stored in the artifacts branch](https://github.com/pytorch/botorch/blob/artifacts/notebooks/tutorials_performance_tracking.ipynb). This information has been occasionally helpful for checking whether a tutorials timeout stemmed from a pervasive slowdown, a method-specific issue, or random chance. However, it has not been used often, increases the size of the repository, and has now stopped updating and generated [a failure in the nightly cron](https://github.com/pytorch/botorch/actions/runs/8704134925/job/23882395249#step:12:132).
### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?
Yes
Pull Request resolved: #2298
Test Plan:
[x] Run tutorials locally
[ ] Make sure tutorials action passes on PR
[ ] Nightly cron
## Related PRs
#1695, #1703
Reviewed By: saitcakmak
Differential Revision: D56192232
Pulled By: esantorella
fbshipit-source-id: 02b0c1c3702929ebbea2e2cb90e5669ac7040c44
Motivation
Writes the runtime and memory output we already produce to the 'artifacts' branch. The upload happens when the tutorials run on push and in the nightly cron. An example will appear here after the nightly cron finishes running: https://github.com/pytorch/botorch/tree/artifacts/tutorial_performance_data
Currently there are a couple of test files in there, which I plan to clean up.
Test Plan
[x] Check that the upload works "with smoke test" by setting it to run on a push to this branch: https://github.com/pytorch/botorch/actions/runs/4226339782
[x] Check that it runs in the nightly cron and that output looks as expected: https://github.com/pytorch/botorch/blob/artifacts/tutorial_performance_data/standard_590b6edd_2023-02-20%2019%3A09%3A44.557008.csv
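As a rough sketch, the file name in the link above can be decoded and split into its parts with the standard library. The `mode_commit_timestamp.csv` layout is an assumption inferred from that single file name, and `parse_output_name` is a hypothetical helper, not part of this PR.

```python
from datetime import datetime
from urllib.parse import unquote


def parse_output_name(encoded_name):
    """Split a URL-encoded output file name into its assumed parts.

    Assumes the layout "<mode>_<commit>_<timestamp>.csv", inferred from
    one example file name; the real naming scheme may differ.
    """
    name = unquote(encoded_name)  # "%20" becomes " ", "%3A" becomes ":"
    stem = name[: -len(".csv")] if name.endswith(".csv") else name
    # Split on the first two underscores only, leaving the timestamp whole.
    mode, commit, timestamp = stem.split("_", 2)
    return {
        "mode": mode,
        "commit": commit,
        "timestamp": datetime.fromisoformat(timestamp),
    }


info = parse_output_name("standard_590b6edd_2023-02-20%2019%3A09%3A44.557008.csv")
print(info)
```

Parsing the timestamp into a `datetime` lets the per-run files be sorted chronologically without relying on string ordering of the file names.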