Pytest extension hangs forever after initialization #113

Open
frgfm opened this issue Aug 22, 2024 · 4 comments

frgfm commented Aug 22, 2024

Hey there 👋

Thanks for the great work!
I recently tried to add CodSpeed to a project of mine, following the tutorial in the documentation. Unfortunately, even though I followed it closely, the job step times out.

Here is the PR: frgfm/torch-cam#270, and the failing job, which I just canceled to avoid eating all my CI minutes: https://github.com/frgfm/torch-cam/actions/runs/10514500036/job/29132464450?pr=270

The main edits I made:

GitHub workflow

  benchmarks:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest]
        python: [3.9]
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
          architecture: x64
      - name: Install dependencies
        run: |
          python -m pip install --upgrade uv
          uv pip install --system -e ".[test]"
      - name: Run benchmarks
        uses: CodSpeedHQ/action@v3
        with:
          token: ${{ secrets.CODSPEED_TOKEN }}
          run: pytest --codspeed tests/

and adding @pytest.mark.benchmark to a few tests that I already run for coverage in another job.
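
For reference, a marked test looks roughly like this (a minimal sketch with a placeholder workload, not one of the actual torch-cam tests):

  import math

  import pytest

  @pytest.mark.benchmark
  def test_softmax_bench():
      # Placeholder workload: any existing test body works the same way;
      # with `pytest --codspeed`, the pytest-codspeed plugin measures the
      # whole marked test.
      data = [float(i) for i in range(1_000)]
      m = max(data)
      exps = [math.exp(x - m) for x in data]
      total = sum(exps)
      assert abs(sum(e / total for e in exps) - 1.0) < 1e-6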

Any hint on why this times out? 🙏


frgfm commented Aug 27, 2024

@art049 @adriencaccia maybe?

adriencaccia (Member) commented

Hey @frgfm, can you try marking only a single test as a benchmark, preferably the fastest one to execute? That way we can make sure that it works with a simple benchmark.

In CodSpeed, each benchmark is executed only once, and the CPU behavior is simulated. That simulation is quite expensive, so long-running benchmarks can take a lot of time on the GitHub runner.
My guess is that there is a heavy benchmark that takes a while to execute.

If a simple benchmark completes in a reasonable amount of time, my advice is then to:

  • Only benchmark a given code path once. Since the measurement is precise, there is no need for multiple benchmarks measuring the same thing.
  • Reduce the size of the data in compute-heavy benchmarks, for the same reason (a sketch follows this list).
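
A rough illustration of the second point (hypothetical function and test names, not taken from the torch-cam suite):

  import math

  import pytest

  def run_workload(n: int) -> float:
      # Stand-in for an expensive code path exercised by the test suite.
      return sum(math.sqrt(i) for i in range(n))

  def test_workload_full():
      # Coverage test: keeps the realistic, larger input.
      assert run_workload(1_000_000) > 0

  @pytest.mark.benchmark
  def test_workload_bench_small():
      # Benchmark: same code path, much smaller input, so the single
      # instrumented run stays short on the CI runner.
      assert run_workload(10_000) > 0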


frgfm commented Aug 29, 2024

Hey @adriencaccia,

I've just tried narrowing it down to a single test with the decorator, one of the fastest. The problem is that this single test took 5 minutes, while my whole pytest suite for coverage takes 1 minute to run. Any suggestion on how to improve that?

5 minutes would be my longest CI job, and the corresponding decorated test is not the most useful to benchmark. Have you seen people using CodSpeed with PyTorch? (Perhaps, for some reason, some of the dependencies aren't faring well on your runners? 🤷‍♂️)

PS: to be more specific, the CI step of the action takes 5 minutes, but the GitHub app reports that the single test takes 2s. So I imagine it is run multiple times, but I don't understand how that adds up to 5 minutes of execution.

adriencaccia (Member) commented

The 2s shown in the CodSpeed GitHub comment and in the CodSpeed UI corresponds to the measured execution time of the benchmark, which is not the same as the wall-clock time it takes to run the benchmark with the CodSpeed instrumentation.

Each benchmark is run once with the instrumentation, which adds some overhead. Hence, for "macro-benchmarks" (benchmarks that take ~1s or more), the execution under the CodSpeed instrumentation can take several minutes.

If that test is not that useful to benchmark, I would recommend skipping it in favor of other, smaller, more relevant benchmarks.
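
For example, one way to follow this advice is to keep the marker off the heavy test and mark only a cheaper, more targeted one (a sketch with placeholder tests, not taken from the torch-cam suite):

  import pytest

  def test_heavy_end_to_end():
      # No benchmark marker: this stays in the regular coverage job and is
      # not measured as a CodSpeed benchmark.
      assert sum(range(5_000_000)) > 0

  @pytest.mark.benchmark
  def test_small_targeted_bench():
      # Marked: a smaller, more relevant code path tracked over time.
      assert sum(range(10_000)) > 0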
