-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segmentation fault after small patch #650
Comments
Update: I've updated my system, rebooted my computer and done a full rebuild. The segfault has changed slightly: now I occasionally get a message like:
and the stacktrace has changed
|
Hi @derekdreery! Thanks for the bug report! What OS do you have? I have no problem running You might have some problem with curl on your OS. Maybe @alexcrichton has some ideas? |
Yeah this unfortunately looks like an issue with the build process or the built binary, probably unrelated to wasm-pack itself. The prebuilt binaries should likely work for you though. |
The interesting thing is that the program works fine without the changes I made. So it seems that the changes introduce the error. I've just done a sys upgrade so I will re-test to see if the problem has gone. |
@derekdreery are you still running into this? let us know! |
Hey, I'll have another go! :) |
I've still got the issue, but I'm also seeing it with the master branch, so it's unrelated to my patch. It's probably a version mismatch somewhere between a library and its bindings. |
Just to follow up, it's fine to close this issue. I'd still personally like to get to the bottom of what's happening, but I lack the skills |
I'm seeing what's probably the same segfault. It looks like a race condition within openssl triggered by exiting (running atexit handlers) while the thread spawned by Even if openssl handled shutdown mid-initialization, I suspect you'd still be in trouble if the version check thread gets as far as actually using openssl while the main thread exits the process. It'd be nice if wasm-pack could wait for that other thread before exiting, but unless it can interrupt the check in flight I'm guessing that'd be a bit slow... Haven't investigated what it would take to interrupt the check and exit cleanly. I'm also seeing this with a locally built wasm-pack (from revision 30a95a4, built with Rust 1.35.0, with a local change that shouldn't matter if we exit this early). I haven't investigated if official builds of wasm-pack are less crashy (or why that might be). #653 (and any other changes that rate-limit the version check) might mostly hide this issue. The exact stack of the crashing thread varies, but usually looks like:
There's another thread that looks like:
or:
Looking at openssl (I'm using 1.1.0k), it uses pthread_once (or equivalent) to run its initialization function once, and registers an atexit handler that tears everything down quite early in that initialization. We're crashing trying to use a static lock in o_names.c that got nulled out by The backtraces from @derekdreery also contain openssl initialization functions and also occur when exiting immediately, so I think it's the same crash. The curl-sys assert was curl failing to initialize. I haven't confirmed it has a similar race (and handles it better by failing initialization instead of crashing) but it seems plausible. |
@marienz great work, I didn't know enough about ssl/curl to work out what was going on. |
I've now recreated this with the pre-built binary. |
So basically, if you initialize openssl in a thread that's not the main thread, you can get a race condition between an atexit handler on the process, and the initialization function. So you must run @marienz does that sound right? |
I don't think this is limited to initialization. I think if we run atexit handlers on the main thread while another thread is still using openssl (either initializing it or actually using it to talk to something over the network) we'll potentially crash. Moving openssl initialization to the main thread might end up making this harder to hit, because anything involving the network is most likely slower than running atexit handlers and exiting. If the version-checking thread is blocked in a network-related syscall while we exit, we shouldn't crash. But I'd argue the "proper" fix here is to just join the version-checking thread from the main thread, after making sure wasm-pack does not do its version check more often than necessary (#653 and related issues). This will make wasm-pack a little slower to exit when it decides it has to do the version check and it doesn't have to do any work at the same time, but it should be pretty rare to hit both of those conditions simultaneously (for example: you'd never see this twice in a row). And if wasm-pack doing actual work is faster than the version check, waiting for the version check seems like a good idea (we do want that check to go through every now and then... if wasm-pack is consistently "too fast" the version check will never happen). |
Hi everyone! I ran into a similar issue with 0.9.1. The race condition surfaces when My configuration:
|
I'm happy to report that with LibreSSL 3.0.2, the race conditions I experienced are gone in both release and debug modes. |
@dhl thanks for letting us know!! do you think we should document this somewhere? |
I would love to see this documented somewhere, but maybe it's a bit too early to say LibreSSL fixes the segfaults before others could confirm the same? Thanks for the great work on wasm-pack, by the way! ❤️ |
I had a quick look to see if there's an obvious difference between LibreSSL and OpenSSL, or if it's just more timing changes hiding the problem. There is a real difference, but I don't know if it's intentional or not. Comparing OpenSSL 1.1.1d and LibreSSL 3.0.2, the important difference is that I see no atexit() handlers in LibreSSL. However, that might be because LibreSSL currently does not need them, rather than by design. The lock whose cleanup triggered the segfault I saw before is also not present in libressl. It looks like this lock was introduced in openssl relatively recently (openssl/openssl#3525), and (assuming https://github.com/libressl-portable/openbsd/commits/master/src/lib/libcrypto/objects/o_names.c is current) libressl has not yet picked up those changes. So it's possible libressl still has that race condition, and that if they choose to fix it the same way openssl did it'll start crashing wasm-pack the same way. (Or if they start using atexit() for cleanup for other reasons.) Unless someone confirms libressl intentionally does not use atexit handlers I wouldn't recommend LibreSSL as the fix for this one. |
I don't know if this is the same issue, but I also just got a segfault when running wasm-pack with no arguments:
This was on a fresh install; I've never used wasm-pack before, and I literally just did |
Hi, I tried cloning this repo and compiling from master, then running I'm running Garuda, Arch-based distro: This was working before when I was using the version that comes with Arch: https://archlinux.org/packages/community/x86_64/wasm-pack/ It goes back to working when I do a |
If you're segfaulting only on specific Please share a backtrace for the segfault (and if it doesn't look like this bug, probably best to file a new one and CC me if you want me to look at the backtraces). |
I was able to boil the segfault down to this minimal repro: # Cargo.toml
[package]
name = "wasm-pack-segfault-repro"
version = "0.1.0"
edition = "2021"
[dependencies]
curl = "0.4.44"
openssl-sys = { version = "0.9.80", features = ["vendored"] }
reqwest = { version = "0.11.14", features = ["blocking"] } // main.rs
use curl::easy;
fn main() {
// This code comes from `wasm_pack::manifest::Crate::check_wasm_pack_latest_version`.
struct NoopHandler;
impl easy::Handler for NoopHandler {
fn write(&mut self, data: &[u8]) -> Result<usize, easy::WriteError> {
Ok(data.len())
}
}
let mut easy = easy::Easy2::new(NoopHandler);
easy.url("https://example.com").unwrap();
easy.get(true).unwrap();
// The segfault happens here.
easy.perform().unwrap();
// This code comes from `wasm_pack::install::krate::Krate::new`.
// It's never reached, but the segfault doesn't happen if it's not here.
let _ = reqwest::Client::builder();
} This is the backtrace:
I'm still not really sure why it's happening, though, and particularly why including the bit of code referencing |
The segfault only happens when vendored OpenSSL is enabled, though, so running |
Apparently this is only a problem on the |
Your example does not crash for me on Fedora 37. You're using vendored OpenSSL, but it looks like you're using your system's libcurl, which seems to be linked against your system's OpenSSL's libcrypto. Mixing vendored and system OpenSSL might be a contributing factor to your crash. (Looks like the curl crate uses system libcurl if available and silently falls back to a vendored copy of curl if not... I installed libcurl-devel and confirmed I then pull in system libcurl and libcrypto, but still didn't crash). |
I can confirm using @Liamolucko's MRE this also segfaults on my system, with both |
Issue is described in: - <rustwasm/wasm-pack#650> - <rustwasm/wasm-pack#1234>
🐛 Bug description
I'm getting a segfault when I run
wasm-pack
with no arguments. I've made some small non-unsafe changes to the codebase.Sometimes, instead of a segfault, I get a failed assertion in curl-sys:
which is here: curl assert fail.
Here is the stack trace from the segfault core dump
👟 Steps to reproduce
wasm-pack
(e.g.cargo run
)🌍 Your environment
wasm-pack version: latest git
rustc version: 1.34.2
The text was updated successfully, but these errors were encountered: