Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test-net-bytes-per-incoming-chunk-overhead failing on Fedora 38 #48490

Closed
targos opened this issue Jun 18, 2023 · 21 comments · Fixed by #48811
Closed

test-net-bytes-per-incoming-chunk-overhead failing on Fedora 38 #48490

targos opened this issue Jun 18, 2023 · 21 comments · Fixed by #48811
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. net Issues and PRs related to the net subsystem. test Issues and PRs related to the tests.

Comments

@targos
Copy link
Member

targos commented Jun 18, 2023

$ out/Release/node --expose-gc /home/iojs/build/workspace/
node-test-commit-linux/test/pummel/test-net-bytes-per-incoming-chunk-overhead.js
node:assert:399
    throw err;
    ^

AssertionError [ERR_ASSERTION]: measured 4552.081408 bytes per chunk
    at process.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/test/pummel/test-net-bytes-per-incoming-chunk-overhead.js:49:3)
    at process.emit (node:events:523:35) {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: false,
  expected: true,
  operator: '=='
}

Node.js v21.0.0-pre

$ uname -a
Linux test-digitalocean-fedora38-x64-1 6.3.8-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 15 02:15:40 UTC 2023 x86_64 GNU/Linux

$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="38 (Cloud Edition)"
ID=fedora
VERSION_ID=38
VERSION_CODENAME=""
PLATFORM_ID="platform:f38"
PRETTY_NAME="Fedora Linux 38 (Cloud Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:38"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f38/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=38
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=38
SUPPORT_END=2024-05-14
VARIANT="Cloud Edition"
VARIANT_ID=cloud

$ gcc --version
gcc (GCC) 13.1.1 20230614 (Red Hat 13.1.1-4)

Refs: nodejs/build#3350 (comment)

@nodejs/net

@targos targos added net Issues and PRs related to the net subsystem. test Issues and PRs related to the tests. labels Jun 18, 2023
@ShogunPanda
Copy link
Contributor

@targos Do you mind checking this once #48464 lands?

@targos
Copy link
Member Author

targos commented Jun 19, 2023

@ShogunPanda I've applied the fix on top of main on the Fedora host and it still fails with the same error.

@ShogunPanda
Copy link
Contributor

I see. If you disable network family autoselection does it work?

@targos
Copy link
Member Author

targos commented Jun 20, 2023

It still doesn't work with --no-network-family-autoselection

@ShogunPanda
Copy link
Contributor

Ok, I see. At least I know it's unrelated to my changes.
Seems like the memory overhead has changed. Do we have a lead on why?

@lpinca lpinca added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Jun 30, 2023
@lpinca
Copy link
Member

lpinca commented Jun 30, 2023

This also failed on macOS 11. See https://ci.nodejs.org/job/node-test-pull-request/52510/.

not ok 3596 pummel/test-net-bytes-per-incoming-chunk-overhead
  ---
  duration_ms: 73725.34500
  severity: crashed
  exitcode: -6
  stack: |-
    Assertion failed: (timeout != -1), function uv__io_poll, file kqueue.c, line 290.

@targos
Copy link
Member Author

targos commented Jun 30, 2023

Not the same error, though.

@lpinca
Copy link
Member

lpinca commented Jun 30, 2023

cc: @nodejs/libuv

@bnoordhuis
Copy link
Member

bnoordhuis commented Jun 30, 2023

Not a libuv issue. The test makes a shaky assumption about RSS:

global.gc();
const bytesPerChunk =
(process.memoryUsage.rss() - baseRSS) / receivedChunks.length;
// We should always have less than one page (usually ~ 4 kB) per chunk.
assert(bytesPerChunk < 650, `measured ${bytesPerChunk} bytes per chunk`);

@lpinca
Copy link
Member

lpinca commented Jun 30, 2023

@bnoordhuis I pinged libuv maintainers for this failed assertion. I think it is not related to the test but I guess it shouldn't happen.

Assertion failed: (timeout != -1), function uv__io_poll, file kqueue.c, line 290

@bnoordhuis
Copy link
Member

Ah, right. That should have been fixed by #48078 but I guess not? cc @trevnorris

@trevnorris
Copy link
Contributor

trevnorris commented Jul 11, 2023

Want to confirm, does this still happen with the recent update to libuv v1.46.0? And does it happen consistently?

@lpinca
Copy link
Member

lpinca commented Jul 11, 2023

@trevnorris The last time seems to be Jun 27, 2023 as per https://ci.nodejs.org/job/node-test-commit-osx/52949/. I'm not sure if libuv v1.46.0 was already merged.

@lpinca
Copy link
Member

lpinca commented Jul 11, 2023

It seems libuv@1.46.0 landed on main on Jul 3, 2023 so it did not happen again.

@trevnorris
Copy link
Contributor

@lpinca I'm not convinced it's been fully fixed. Did that crash only happen with test-net-bytes-per-incoming-chunk-overhead?

trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 11, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the EINTR check was moved to match linux.c and an assert
that was added in 42cc412 has been moved to be called directly after
the EINTR check. Since it should always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
@lpinca
Copy link
Member

lpinca commented Jul 12, 2023

Did that crash only happen with test-net-bytes-per-incoming-chunk-overhead ?

I don't know, I only saw it with that test.

@trevnorris
Copy link
Contributor

trevnorris commented Jul 12, 2023

@lpinca do you know the version of FreeBSD macOS that was running when it failed?

trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 13, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the EINTR check was moved to match linux.c and an assert
that was added in 42cc412 has been moved to be called directly after
the EINTR check. Since it should always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
@lpinca
Copy link
Member

lpinca commented Jul 13, 2023

It was macOS 11: https://ci.nodejs.org/job/node-test-commit-osx/52949/nodes=osx11-x64/

trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 13, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the EINTR check was moved to match linux.c and an assert
that was added in 42cc412 has been moved to be called directly after
the EINTR check. Since it should always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 13, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check. Since it should
always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 13, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check. Since it should
always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
trevnorris added a commit to trevnorris/libuv that referenced this issue Jul 13, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check. Since it should
always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
@targos
Copy link
Member Author

targos commented Jul 17, 2023

@bnoordhuis Would you agree to just delete this test? I don't know what else to do with it.

@bnoordhuis
Copy link
Member

Yes. Bad assumptions are bad.

targos added a commit to targos/node that referenced this issue Jul 17, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
@targos
Copy link
Member Author

targos commented Jul 17, 2023

#48811

nodejs-github-bot pushed a commit that referenced this issue Jul 20, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: #48490
PR-URL: #48811
Fixes: #48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
targos added a commit that referenced this issue Jul 23, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: #48490
PR-URL: #48811
Fixes: #48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
targos added a commit that referenced this issue Jul 23, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: #48490
PR-URL: #48811
Fixes: #48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
trevnorris added a commit to trevnorris/libuv that referenced this issue Aug 4, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check. Since it should
always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
trevnorris added a commit to trevnorris/libuv that referenced this issue Aug 4, 2023
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment of why it's possible) and combining two if checks
into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check. Since it should
always be true regardless.

Ref: libuv#3893
Ref: nodejs/node#48490
pluris pushed a commit to pluris/node that referenced this issue Aug 6, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
PR-URL: nodejs#48811
Fixes: nodejs#48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
pluris pushed a commit to pluris/node that referenced this issue Aug 7, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
PR-URL: nodejs#48811
Fixes: nodejs#48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
PR-URL: nodejs#48811
Fixes: nodejs#48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
PR-URL: nodejs#48811
Fixes: nodejs#48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
UlisesGascon pushed a commit to UlisesGascon/node that referenced this issue Aug 14, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: nodejs#48490
PR-URL: nodejs#48811
Fixes: nodejs#48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
RafaelGSS pushed a commit that referenced this issue Aug 15, 2023
The test's assumptions about RSS are no longer valid, at least with
Fedora 38.

Closes: #48490
PR-URL: #48811
Fixes: #48490
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. net Issues and PRs related to the net subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants