test: deflake test-buffer-large-size #57789

Merged

Conversation

jakecastelli
Member

@jakecastelli jakecastelli commented Apr 8, 2025

The test failed 25+ times on April 7 and 24 times on April 8 in our CI.

This PR attempts to spread the tests across multiple files to reduce the flakiness.

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels Apr 8, 2025
@jakecastelli
Member Author

queued a stress test - https://ci.nodejs.org/job/node-stress-single-test/557/

@lpinca
Member

lpinca commented Apr 8, 2025

I'm not sure I understand the difference. Why would using a for loop reduce the flakiness?

@jakecastelli
Member Author

Currently a single test attempts to allocate 8 GB of memory. The purpose of the for loop is to split it into 4 separate tests. I think that could have two benefits:

  1. we would be able to tell which one timed out
  2. it potentially gives the GC more chances to run

@jakecastelli
Member Author

jakecastelli commented Apr 8, 2025

Actually, my stress tests have now started to fail on ubuntu2204-arm64 with the same timeout, so I think you might be right @lpinca 🤔 The for loop wouldn't do anything useful. Do you think this also relates to the deadlock? Maybe running a major GC is a better way to deflake the test 😅 WDYT?

@lpinca
Member

lpinca commented Apr 8, 2025

> Do you think this also relates to the deadlock?

Yes, probably.

@jakecastelli jakecastelli force-pushed the deflake-test-buffer-large-size branch from 58ebe5a to 5fae384 Compare April 8, 2025 13:43
@jakecastelli
Member Author

I've added a major GC after each allocation and queued another stress test (https://ci.nodejs.org/job/node-stress-single-test/558/). This test seems to have far too high a failure rate.

@RaisinTen
Member

Might be worth checking if distributing each of these into separate test files improves things

@nodejs-github-bot
Collaborator

@jakecastelli
Member Author

The stress tests seem OK; the CI failures relate to other flaky tests. Are we happy with running a major GC, or would it be better to separate the tests into different files?

@jakecastelli jakecastelli force-pushed the deflake-test-buffer-large-size branch from 5fae384 to 4e19317 Compare April 9, 2025 12:12
@jakecastelli
Member Author

Are there any further action items I should take?

  • Continue with landing this PR
  • Separate the tests into multiple files

Happy to take either path 👍

@lpinca
Member

lpinca commented Apr 9, 2025

I would try to separate into multiple files without manually triggering GC and see what happens. Manually calling the GC is not a fix.

@jakecastelli jakecastelli force-pushed the deflake-test-buffer-large-size branch from 4e19317 to 920046f Compare April 9, 2025 12:34
@jakecastelli jakecastelli force-pushed the deflake-test-buffer-large-size branch from 920046f to e6b2341 Compare April 9, 2025 12:38
@jakecastelli jakecastelli marked this pull request as ready for review April 9, 2025 12:46
@jakecastelli
Member Author

What are practical ways to verify whether things have improved?

  • have a few CI runs
  • 1000 stress tests

Any other suggestions?
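
One local approximation of a stress test is to run the test file many times in fresh processes and count failures and timeouts. The sketch below assumes a hypothetical `stress` helper; it is not the Jenkins stress job linked in this thread, only a way to get a rough failure rate on a single machine.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// Runs `node <args>` `runs` times, each in a fresh process, and counts
// how many runs exited non-zero and how many exceeded `timeoutMs`.
function stress(args, runs, timeoutMs) {
  let failures = 0;
  let timeouts = 0;
  for (let i = 0; i < runs; i++) {
    const result = spawnSync(process.execPath, args, { timeout: timeoutMs });
    if (result.error && result.error.code === 'ETIMEDOUT') {
      timeouts++;
    } else if (result.status !== 0) {
      failures++;
    }
  }
  return { runs, failures, timeouts };
}

// Example with a trivially passing inline script instead of a real test.
console.log(stress(['-e', 'process.exit(0)'], 3, 10000));
```

Replacing the `-e` script with the path to a test file and raising `runs` gives a before and after comparison. The result only reflects the machine it runs on, so a platform-specific flake such as the ubuntu2204-arm64 timeout may not reproduce locally.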

Member

@RaisinTen RaisinTen left a comment

LGTM
Added some optional suggestions to simplify the tests


codecov bot commented Apr 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.22%. Comparing base (1540fc6) to head (c76d93b).
Report is 242 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #57789      +/-   ##
==========================================
- Coverage   90.23%   90.22%   -0.02%     
==========================================
  Files         630      630              
  Lines      185288   185518     +230     
  Branches    36344    36380      +36     
==========================================
+ Hits       167203   167387     +184     
- Misses      11006    11027      +21     
- Partials     7079     7104      +25     

see 27 files with indirect coverage changes

@lpinca lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 9, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 9, 2025
@nodejs-github-bot
Collaborator

Member

@RaisinTen RaisinTen left a comment

LGTM, thanks

@jakecastelli jakecastelli added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 10, 2025
@jakecastelli jakecastelli added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Apr 10, 2025
@jakecastelli jakecastelli added the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2025
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2025
@nodejs-github-bot nodejs-github-bot merged commit 795dd8e into nodejs:main Apr 11, 2025
67 checks passed
@nodejs-github-bot
Collaborator

Landed in 795dd8e

JonasBa pushed a commit to JonasBa/node that referenced this pull request Apr 11, 2025
PR-URL: nodejs#57789
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
RafaelGSS pushed a commit that referenced this pull request May 1, 2025
PR-URL: #57789
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
RafaelGSS pushed a commit that referenced this pull request May 2, 2025
PR-URL: #57789
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
aduh95 pushed a commit that referenced this pull request May 6, 2025
PR-URL: #57789
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@aduh95 aduh95 added the backport-requested-v22.x PRs awaiting manual backport to the v22.x-staging branch. label May 6, 2025
@aduh95
Contributor

aduh95 commented May 6, 2025

Tests are failing on GHA with this change (https://github.com/nodejs/node/actions/runs/14865929007/job/41742567775). We'd need a manual backport PR if we want to port this change.

@jakecastelli
Member Author

I will look into the backport

@jakecastelli
Member Author

Hi @aduh95, I took a look and realised this PR fixes the flaky test introduced in #51821, which is a semver-major PR. So I think I should have added the dont-land-on labels.

@jakecastelli jakecastelli added dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. dont-land-on-v22.x PRs that should not land on the v22.x-staging branch and should not be released in v22.x. dont-land-on-v23.x PRs that should not land on the v23.x-staging branch and should not be released in v23.x. and removed backport-requested-v22.x PRs awaiting manual backport to the v22.x-staging branch. labels May 8, 2025
Labels
  • author ready: PRs that have at least one approval, no pending requests for changes, and a CI started.
  • commit-queue-squash: Add this label to instruct the Commit Queue to squash all the PR commits into the first one.
  • dont-land-on-v20.x: PRs that should not land on the v20.x-staging branch and should not be released in v20.x.
  • dont-land-on-v22.x: PRs that should not land on the v22.x-staging branch and should not be released in v22.x.
  • dont-land-on-v23.x: PRs that should not land on the v23.x-staging branch and should not be released in v23.x.
  • needs-ci: PRs that need a full CI run.
  • test: Issues and PRs related to the tests.
6 participants