Description
LLVM will lower some of its intrinsics to function calls at runtime. For example, this program:
#![crate_type = "lib"]
pub fn foo(a: f32) -> f32 {
    a.log2()
}
generates IR that looks like:
define float @foo(float %a) unnamed_addr #0 {
start:
  %0 = tail call float @llvm.log2.f32(float %a) #2
  ret float %0
}
Note the lack of a call to a function named `log2`; it's just an LLVM intrinsic. On Linux, however, this intrinsic is compiled to a call to the symbol `log2f`, usually found in libm.
In some situations it can be important to use the intrinsics rather than the explicit functions, as LLVM may optimize them better. At other times LLVM may optimize a program into using an intrinsic even though the original source never mentions it. In other words, these function calls run the risk of being inserted even if the source never uses them!
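To make this concrete, here's a small sketch of ordinary-looking float code with no explicit libm calls in the source. The function names are made up for illustration, and the symbol names in the comments (`fmodf`, `log2f`, `powf`) are my understanding of what LLVM typically picks when the target has no native instruction for the operation; the exact lowering depends on the target.

```rust
// Ordinary float code; no libm function is named anywhere in the source.
// Assumption: on targets without native instructions for these operations,
// each one may end up as a call to a C-named symbol.

/// Float remainder. For `f32` this typically becomes a call to `fmodf`.
pub fn remainder(a: f32, b: f32) -> f32 {
    a % b
}

/// Same body as `foo` above: emits `llvm.log2.f32`, which may become `log2f`.
pub fn log_base_2(a: f32) -> f32 {
    a.log2()
}

/// Emits `llvm.pow.f32`, which may become a call to `powf`.
pub fn power(a: f32, b: f32) -> f32 {
    a.powf(b)
}
```

Compiling something like this for a wasm target and inspecting the imports of the resulting module is one way to check which symbols actually get pulled in.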
Currently LLVM will translate this in wasm as an import from the `env` module with the same name as these functions have in C. Additionally, the standard library has a set of symbols that it imports and uses (if the associated functions are called).
Today projects like wasm-bindgen will automatically fill in these imports so users don't have to worry about them, but that's not necessarily great! The method in which these intrinsics are exposed today may not be stable and could change tomorrow.
Ok, so that brings us to the question: what do we do about these math-related intrinsics? Some options:
- We could prevent any access at all to these intrinsics. The downside of this approach is that simple operations like `a % b` wouldn't work for floats. Additionally, I'm not sure that forbidding access would actually work: are we sure that LLVM won't optimize otherwise normal code into having these imports?
- Define these functions for bundlers/wasm instantiators. We could simply define that these functions may be imported, and we'd also define "here's the symbol they're gonna use and here's the expected behavior". The good news is that everything here is, I believe, pretty well defined in terms of what it's supposed to do (aka C header files and whatnot). The downside is that there are a lot of these to work with (see wasm-bindgen's list).
- Polyfill everything in rustc as a postprocessing step over the wasm. This would entail having a software solution available for all of these intrinsics (not that I have any idea how to write `atan`) and rustc would inject an implementation into the wasm. The good news here is that everything's "always taken care of"; the bad news is we're probably filling in a much worse implementation of `sin` than `Math.sin`, not to mention that it's probably pretty big code-size-wise.
I'm curious to hear what others think! I'm sort of tempted to take the second route here, defining these functions for bundlers/wasm instantiators. We could, for example, define that these functions will always be imported from something like `__rust_math` (requiring rustc to postprocess the wasm file) and that they all have their C-like names (reverting the libstd-specific names).
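As a rough picture of what that import shape would look like, here's a sketch written from the Rust side using the existing `wasm_import_module` link attribute. This is only an illustration: the proposal is for rustc to rewrite the imports itself, `__rust_math` is just the placeholder name from above, and the wrapper function names are hypothetical. The attribute is gated on `target_arch = "wasm32"` so the same code still links against the platform's libm elsewhere.

```rust
// Hypothetical sketch: `__rust_math` is the placeholder module name from the
// proposal, not something rustc emits today. On wasm32 the two functions below
// become imports from that module; on other targets they resolve to libm.
#[cfg_attr(target_arch = "wasm32", link(wasm_import_module = "__rust_math"))]
unsafe extern "C" {
    // C-like names, matching what the C headers define.
    fn log2f(x: f32) -> f32;
    fn fmodf(x: f32, y: f32) -> f32;
}

/// Calls the imported `log2f` (illustrative wrapper name).
pub fn log2_via_import(x: f32) -> f32 {
    unsafe { log2f(x) }
}

/// Calls the imported `fmodf` (illustrative wrapper name).
pub fn rem_via_import(a: f32, b: f32) -> f32 {
    unsafe { fmodf(a, b) }
}
```

A bundler or instantiator would then only need to supply a single `__rust_math` module whose exports follow the C definitions of these functions.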