Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Official 5.6.0 binaries crash on Ubuntu Trust 14.04 x64 #5185

Closed
cjthompson opened this issue Feb 10, 2016 · 21 comments
Closed

Official 5.6.0 binaries crash on Ubuntu Trust 14.04 x64 #5185

cjthompson opened this issue Feb 10, 2016 · 21 comments
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. dns Issues and PRs related to the dns subsystem.

Comments

@cjthompson
Copy link

I have an amazon instance running Ubuntu 14.04 Trusty x86_64. When I download the "Linux Binaries" from the nodejs.org homepage, the node binary will crash just trying to open the REPL.

If I run node -v I get 5.6.0. If I run just node I get

node: ../src/cares_wrap.cc:1262: void node::cares_wrap::Initialize(v8::Local<v8::Object>, v8::Local<v8::Value>, v8::Local<v8::Context>): Assertion `(r) == (0)' failed.
Aborted (core dumped)
@mscdex
Copy link
Contributor

mscdex commented Feb 10, 2016

Does the official v5.5.0 binary work?

@mscdex mscdex added dns Issues and PRs related to the dns subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. labels Feb 10, 2016
@mscdex
Copy link
Contributor

mscdex commented Feb 10, 2016

/cc @indutny

@bnoordhuis
Copy link
Member

Can't reproduce with the x86 and x64 binaries on FC23 but f1a0827 is the obvious suspect.

@cjthompson Can you try building from source with and without that commit?

@indutny
Copy link
Member

indutny commented Feb 10, 2016

This is very interesting crash, especially considering that we smoke test the binaries.

@cjthompson may I ask you to share it /etc/resolv.conf with us, if it doesn't contain any private information? Additionally, may I ask you to run it with strace node and gist logs?

Thank you so much for reporting this.

@cjthompson
Copy link
Author

The issue appears to be related to /etc/nsswitch.conf not being world-readable. Can anyone confirm if this makes sense? The 5.5.0 binary on the same server worked just fine.

EDIT: Might be related to this issue: c-ares/c-ares#33

@cjthompson
Copy link
Author

strace log: https://gist.github.com/cjthompson/5570c04744715f96dcb5

it's pretty clear at the bottom of the log there's a call to open /etc/nsswitch.conf and permission is denied at which point node promptly crashes.

@indutny
Copy link
Member

indutny commented Feb 10, 2016

@cjthompson it looks like there were no lookup entries in /etc/resolv.conf and because of this is cares tried to read nsswitch.conf.

@indutny
Copy link
Member

indutny commented Feb 10, 2016

@cjthompson and thank you very much for strace log, it really helped!

@cjthompson
Copy link
Author

At this point I think we've narrowed down the exact cause and the fact that it was a change from 5.5.0 to 5.6.0.

Let me know if there's anything more you need me to debug.

@bnoordhuis
Copy link
Member

@indutny Want to revert for now? I believe the next v5 release is next week and it'd be a shame to have it broken in two subsequent releases. (I'm assuming we won't have a fix upstreamed in that time.)

@indutny
Copy link
Member

indutny commented Feb 10, 2016

@bnoordhuis let's revert it until resolving this.

@cjthompson may I ask you to do strace thing with 5.5.0 too? Many thanks in advance.

@cjthompson
Copy link
Author

strace of 5.6.0 with nsswitch.conf permissions set to 644 https://gist.github.com/cjthompson/6243cfc5ff1063928c90

strace of 5.5.0 with nsswitch.conf permissions set to 600
https://gist.github.com/cjthompson/2889a83e10e1b0d3ad2e

@indutny
Copy link
Member

indutny commented Feb 10, 2016

Whoa, finally found the reason why behavior is different:

status2 = init_by_defaults(channel);
if (status2 != ARES_SUCCESS) {
DEBUGF(fprintf(stderr, "Error: init_by_defaults failed: %s\n",
ares_strerror(status)));
if (status == ARES_SUCCESS)
status = status2;
}
/* Generate random key */
if (status == ARES_SUCCESS) {
status = init_id_key(&channel->id_key, ARES_ID_KEY_LEN);
if (status == ARES_SUCCESS)
channel->next_id = ares__generate_new_id(&channel->id_key);
else
DEBUGF(fprintf(stderr, "Error: init_id_key failed: %s\n",
ares_strerror(status)));
}

Basically, 5.5.0 is swallowing that lookup init error. @cjthompson may I ask you to run this js snippet:

require('dns').reverse('8.8.8.8', function() {console.log(arguments)})

on v5.5.0 with strict permissions on nsswitch.conf (the ones on which v5.6.0 crashes)? I expect that it will either error or (more likely) crash.

@indutny
Copy link
Member

indutny commented Feb 10, 2016

@bnoordhuis please disregard my comment about reverting it, looks like this behavior change is actually a pretty important bug fix. At least let's wait for results of dns.reverse testing that I have just asked @cjthompson about.

@cjthompson
Copy link
Author

Node 5.5.0 with nsswitch.conf permissions as 600:

QueryReqWrap {
  bindingName: 'getHostByAddr',
  callback: { [Function: asyncCallback] immediately: true },
  hostname: '8.8.8.8',
  oncomplete: [Function: onresolve] }
{ '0':
   { [Error: getHostByAddr ENOTFOUND 8.8.8.8]
     code: 'ENOTFOUND',
     errno: 'ENOTFOUND',
     syscall: 'getHostByAddr',
     hostname: '8.8.8.8' } }

node 5.5.0 and 5.6.0 with nsswitch.conf set to 644:

QueryReqWrap {
  bindingName: 'getHostByAddr',
  callback: { [Function: asyncCallback] immediately: true },
  hostname: '8.8.8.8',
  oncomplete: [Function: onresolve] }
{ '0': null, '1': [ 'google-public-dns-a.google.com' ] }

@indutny
Copy link
Member

indutny commented Feb 10, 2016

Ah, this is actually better than I expected. Thank you so much, @cjthompson !

@bnoordhuis looks like we will need to revert it after all. Sounds like a plain bug in c-ares. It is caused by c-ares/c-ares@46bb820 , and proposed fix is: c-ares/c-ares#39 .

We can always re-land the c-ares update after c-ares/c-ares#39 will land.

@indutny
Copy link
Member

indutny commented Feb 11, 2016

The patch has been landed c-ares/c-ares@e514088 . I'm going to wait for reply from c-ares team with regards to use of c-ares master branch in our repo, and depending on it we will be able to decide what to do next.

@indutny
Copy link
Member

indutny commented Feb 11, 2016

Alright, I have a green light from c-ares team. @bnoordhuis I'm going to submit PR for cherry-picking that commit as soon as I will wake up (/me is sleeptyping at the moment).

@indutny
Copy link
Member

indutny commented Feb 11, 2016

Should be fixed by #5199

bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Feb 15, 2016
This reverts commit 35c3832.

See [0] and [1] for background.  Let's hold off on upgrading c-ares
until upstream makes an official release.

[0] nodejs#5185
[1] nodejs#5199

PR-URL: nodejs#5199
Reviewed-By: Fedor Indutny <fedor@indutny.com>
@jasnell
Copy link
Member

jasnell commented Apr 19, 2016

Is this still an issue? Do we need to keep this open?

@bnoordhuis
Copy link
Member

I'll close. I forgot to after landing the revert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. dns Issues and PRs related to the dns subsystem.
Projects
None yet
Development

No branches or pull requests

5 participants