Skip to content

Conversation

addaleax
Copy link
Member

@addaleax addaleax commented Feb 8, 2020

a555be2 re-assigned async_.data to indicate success
or failure of the constructor. As the HandleWrap implementation
uses that field to access the HandleWrap instance from the
libuv handle, this introduced two issues:

  • It implicitly assumed that casting
    MessagePort*void*HandleWrap* would be valid.
  • It made the HandleWrap::OnClose() function fail with a
    nullptr dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js crash at
least once in CI.

Since re-assigning async_.data isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a CHECK that provides better
error messages for this type of issue in the future.

Refs: #31605


Example failure: https://ci.nodejs.org/job/node-test-commit-linux-containered/17944/nodes=ubuntu1804_sharedlibs_openssl111_x64/testReport/junit/(root)/test/parallel_test_worker_cleanexit_with_moduleload/

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

a555be2 re-assigned `async_.data` to indicate success
or failure of the constructor. As the `HandleWrap` implementation
uses that field to access the `HandleWrap` instance from the
libuv handle, this introduced two issues:

- It implicitly assumed that casting
  `MessagePort*` → `void*` → `HandleWrap*` would be valid.
- It made the `HandleWrap::OnClose()` function fail with a
  `nullptr` dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js` crash at
least once in CI.

Since re-assigning `async_.data` isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a `CHECK` that provides better
error messages for this type of issue in the future.

Refs: nodejs#31605
@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Feb 8, 2020
@addaleax addaleax added the worker Issues and PRs related to Worker support. label Feb 8, 2020
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

nodejs-github-bot commented Feb 8, 2020

CI: https://ci.nodejs.org/job/node-test-pull-request/29024/ (:white_check_mark:)

@addaleax addaleax added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Feb 8, 2020
@addaleax
Copy link
Member Author

Landed in f938cbd

addaleax added a commit that referenced this pull request Feb 10, 2020
a555be2 re-assigned `async_.data` to indicate success
or failure of the constructor. As the `HandleWrap` implementation
uses that field to access the `HandleWrap` instance from the
libuv handle, this introduced two issues:

- It implicitly assumed that casting
  `MessagePort*` → `void*` → `HandleWrap*` would be valid.
- It made the `HandleWrap::OnClose()` function fail with a
  `nullptr` dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js` crash at
least once in CI.

Since re-assigning `async_.data` isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a `CHECK` that provides better
error messages for this type of issue in the future.

Refs: #31605

PR-URL: #31696
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@addaleax addaleax closed this Feb 10, 2020
@addaleax addaleax deleted the fix-messageport-dtor branch February 10, 2020 16:29
codebytere pushed a commit that referenced this pull request Feb 17, 2020
a555be2 re-assigned `async_.data` to indicate success
or failure of the constructor. As the `HandleWrap` implementation
uses that field to access the `HandleWrap` instance from the
libuv handle, this introduced two issues:

- It implicitly assumed that casting
  `MessagePort*` → `void*` → `HandleWrap*` would be valid.
- It made the `HandleWrap::OnClose()` function fail with a
  `nullptr` dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js` crash at
least once in CI.

Since re-assigning `async_.data` isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a `CHECK` that provides better
error messages for this type of issue in the future.

Refs: #31605

PR-URL: #31696
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@codebytere codebytere mentioned this pull request Feb 17, 2020
@codebytere
Copy link
Member

@addaleax if this is something that should be in v12.x, could you please backport it there? There are a nest of conflicts and i want to be extra careful 😅

@targos targos removed author ready PRs that have at least one approval, no pending requests for changes, and a CI started. backport-requested-v12.x labels Apr 25, 2020
targos pushed a commit to targos/node that referenced this pull request Apr 25, 2020
a555be2 re-assigned `async_.data` to indicate success
or failure of the constructor. As the `HandleWrap` implementation
uses that field to access the `HandleWrap` instance from the
libuv handle, this introduced two issues:

- It implicitly assumed that casting
  `MessagePort*` → `void*` → `HandleWrap*` would be valid.
- It made the `HandleWrap::OnClose()` function fail with a
  `nullptr` dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js` crash at
least once in CI.

Since re-assigning `async_.data` isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a `CHECK` that provides better
error messages for this type of issue in the future.

Refs: nodejs#31605

PR-URL: nodejs#31696
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this pull request Apr 28, 2020
a555be2 re-assigned `async_.data` to indicate success
or failure of the constructor. As the `HandleWrap` implementation
uses that field to access the `HandleWrap` instance from the
libuv handle, this introduced two issues:

- It implicitly assumed that casting
  `MessagePort*` → `void*` → `HandleWrap*` would be valid.
- It made the `HandleWrap::OnClose()` function fail with a
  `nullptr` dereference if the constructor did fail.

In particular, the second issue made
test/parallel/test-worker-cleanexit-with-moduleload.js` crash at
least once in CI.

Since re-assigning `async_.data` isn’t actually necessary here
(only a leftover from earlier versions of that commit), fix this by
using a local variable instead, and add a `CHECK` that provides better
error messages for this type of issue in the future.

Refs: #31605

PR-URL: #31696
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@addaleax
Copy link
Member Author

@targos … this seems to have ended up on v12.x-staging before the v12.16.3 release, but not in the release itself. It fixes a segfault that is now occurring with v12.16.3 – is there anything I could have done differently here to prevent that?

(I do know that I kind of dropped the ball on backporting this, but as you noticed, that was only necessary because other commits hadn’t landed yet, so I don’t think it would have made a difference?)

@targos
Copy link
Member

targos commented Apr 28, 2020

@addaleax this wasn't on v12.x-staging before the release. I just updated the branch right after it.

is there anything I could have done differently here to prevent that?

I think the most reliable way would have been to label the original PR as backport-requested with a message indicating that it had to land with this one.

I'm definitely open to other ideas. The real issue is that we don't have a documented and easy way to express relations between pull requests.

@addaleax
Copy link
Member Author

@targos Yeah, I’d be a fan of having something more specific than Refs:, e.g. Must-Be-Released-With: <pr-url> that would indicate that PRs should have to land together… I’ll put this on my list of things that I can try to implement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. worker Issues and PRs related to Worker support.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants