kripken edited this page Jun 19, 2012 · 99 revisions

#FAQ

General

  • Q. What does Emscripten do?

    A. Emscripten compiles LLVM bytecode into JavaScript, which then allows:

    • Compiling C/C++ and other code that can be translated into LLVM, directly into JavaScript.
    • Compiling the C/C++ runtimes of other languages into JavaScript, and then running code in those other languages in an indirect way. This works for languages like Python and Lua.
  • Q. Why are you doing this?

    A. The web is standards-based, cross-platform, runs everywhere from PCs to iPads, and has numerous independent compatible implementations. It's arguably the best platform to develop for, for those reasons. But it could be even more developer-friendly: While JavaScript (when used well!) is an excellent language, lots of people want to code in other languages. By compiling to JavaScript, everyone is happy.

  • Q. What is the status of Emscripten?

    A. Emscripten is mature and has been used to port a very long list of real-world codebases to JavaScript, including large projects like CPython, Poppler and Bullet. You can see some demos here.

    However, there are some unavoidable limitations, since JavaScript is not native code; see CodeGuidelinesAndLimitations.

  • Q. How fast will the compiled code be?

    A. Right now generated code is around 3-4 times slower than gcc -O3. Note, however, that there are substantial differences between benchmarks, and in some cases JS engine bugs cause significantly poorer performance. See http://syntensity.com/static/splashpres.pdf (slides from a SPLASH 2011 presentation) for more details. docs/paper.pdf also has some numbers, but they are out of date.

    Emscripten-compiled code can run at similar speeds to handwritten JavaScript code in many benchmarks. While hand-crafted code can in theory do everything Emscripten does and more, in practice such code is written for clarity. Compilers, like Emscripten or gcc, emit fast code which is not necessarily easy to read, hence in some cases compiled code can be faster than handwritten code.

    To run the Emscripten benchmarks yourself, run python tests/runner.py benchmark.

  • Q. How big will the compiled code be?

    A. The effective size of the code will be about the same as native code. That is, if you gzip your code, it will be about the same size as gzipped native code. For more, see this blog post.
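
    A rough way to check this for your own build is to compare gzipped sizes directly. The sketch below uses Node's built-in zlib module; gzippedSize is a hypothetical helper name, and the sample string is made up for illustration rather than real compiler output.

```javascript
const zlib = require('zlib');

// Hypothetical helper: returns the number of bytes `text` occupies after gzip.
function gzippedSize(text) {
  return zlib.gzipSync(Buffer.from(text, 'utf8')).length;
}

// A repetitive string, used only as a stand-in for real generated code.
const sample = 'HEAP[$ptr + 4] = HEAP[$ptr] + 1;\n'.repeat(1000);

console.log('raw bytes:', sample.length);
console.log('gzipped bytes:', gzippedSize(sample));
```

    For real files, zlib.gzipSync also accepts the Buffer returned by fs.readFileSync, so the same comparison can be made between a generated .js file and a native binary.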

  • Q. What is the compiler written in?

    A. JavaScript. Paralleling the language we are generating code for has various benefits. For example, if we determine that some expression can be known at compile time, we can evaluate it immediately in the compiler; otherwise we can simply JSON.stringify() it for the generated code to solve at runtime. Also, (nice) JavaScript is cool.
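
    As an illustration of that idea (not code from the Emscripten source), here is a minimal sketch of a code generator that folds an addition when both operands are known at compile time and otherwise emits text for the runtime to evaluate. The emitAdd function and the { known, value, name } operand shape are assumptions made up for this example.

```javascript
// Hypothetical helper, not taken from the Emscripten compiler.
// Each operand is either { known: true, value: <number> } when its value is
// known at compile time, or { known: false, name: '<variable>' } otherwise.
// Returns the JavaScript source text to emit for `left + right`.
function emitAdd(left, right) {
  if (left.known && right.known) {
    // Both values are known: evaluate right now, inside the compiler,
    // and emit only the result.
    return JSON.stringify(left.value + right.value);
  }
  // Otherwise emit an expression for the generated code to solve at runtime.
  const text = (op) => (op.known ? JSON.stringify(op.value) : op.name);
  return '(' + text(left) + ' + ' + text(right) + ')';
}

console.log(emitAdd({ known: true, value: 2 }, { known: true, value: 3 }));
console.log(emitAdd({ known: true, value: 2 }, { known: false, name: '$x' }));
```

    Because the compiler and its output share a language, the same + operator does the work in both cases: once at compile time, once in the emitted text.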

  • Q. Isn't it better just to write JavaScript code? Why compile LLVM into JavaScript?

    A. By all means write new JavaScript code. Emscripten is just another option to have, and will hopefully be useful if you have a lot of C/C++ code that you don't want to rewrite from scratch. You can still write web applications normally, but Emscripten lets you integrate existing C/C++ code when useful.

  • Q. Where does Emscripten itself run?

    A. Emscripten is known to work on Windows, OS X and Linux. Note however that currently the automatic tests are run mainly on Linux. Help with supporting other platforms would be very welcome.

    That applies to the compiler. The generated code is valid JavaScript, so it will run anywhere JavaScript can run. By default typed arrays are used (both for code compatibility and speed). However, typed arrays are not universally supported yet, so Emscripten also lets you generate code without typed arrays, which will run practically everywhere.

  • Q. What APIs/libraries does Emscripten support?

    A. libc and C++ standard library support is very good. SDL support is sufficient to run quite a lot of code. OpenGL support is in progress.

  • Q. Is this really a compiler? Isn't it better described as a translator?

    A. Well, a compiler is usually defined as a program that transforms source code written in one programming language into another, which is what Emscripten does. A translator is a more specific term that is usually used for compilation between high-level languages, which isn't exactly applicable. On the other hand a decompiler is something that translates a low-level language to a higher-level one, so that might technically be a valid description, but it sounds odd since we aren't going back to the original language we compiled from (C/C++, most likely) but into something else (JavaScript).

  • Q. The name of the project sounds weird to me.

    A. I don't know why; it's a perfectly cromulent word!

Using Emscripten

  • Q. How do I compile code?

    A. See the Tutorial.

  • Q. I get lots of errors building the tests.

    A. Some common problems are:

    • Using older versions of Node or JS engines. Use the versions mentioned in the Tutorial.
    • Using older versions of LLVM. The recommended LLVM version is the 3.0 release. Using LLVM trunk might or might not work.
    • Typos in the paths in ~/.emscripten.
  • Q. Can I compile my project using Emscripten? Do I need a new build system?

    A. You can in most cases very easily use your project's current build system with Emscripten. See Building-Projects.

  • Q. My code compiles slowly.

    A. Emscripten makes some tradeoffs that make the generated code faster and smaller, at the cost of longer compilation times. For example, we build parts of the standard library along with your code which enables some additional optimizations, but takes a little longer to compile.

    Optimized builds in particular can take noticeably longer to compile than unoptimized ones: -O2 is slower to compile than -O1, which in turn is slower than -O0 (in return, though, the higher levels greatly improve the speed of the generated code). It might be useful to use -O0 (or not specify an optimization level) during quick development iterations and to do fully optimized builds less frequently.

    Currently builds with line-number debug info (where the source code was compiled with -g) are slow; see issue #216. Stripping the debug info leads to much faster compile times.

  • Q. When I compile code that should work, I get odd errors in Emscripten about various things. I get different errors (or it works) on another machine.

    A. Make sure you are using the Emscripten bundled system headers. Using emcc will do so by default, but if you compile into LLVM bitcode yourself, or you use your local system headers even with emcc, problems can happen.

  • Q. My code fails to compile, the error includes something about inline assembly (or {"text":"asm"}).

    A. Emscripten cannot compile inline assembly code, which is CPU specific, because Emscripten is not a CPU emulator.

    Many projects have build options that generate only platform-independent code, without inline assembly. Use those options when building for Emscripten. For example, the following might help (and are done automatically for you by emcc):

    #undef __i386__
    #undef __x86_64__
    

    When no CPU-specific #define exists, many projects will not generate CPU-specific code. In general, though, you will need to find where inline assembly is generated and how to disable it.

  • Q. How do I run an event loop?

    A. To run a C function repeatedly, use emscripten_set_main_loop (see system/include/emscripten.h). The other functions in that file are also useful.

    To respond to browser events and so forth, use the SDL API normally. See the SDL tests for examples (look for SDL in tests/runner.py).

  • Q. My SDL app doesn't work.

    A. See the SDL automatic tests for working examples: python tests/runner.py browser.

  • Q. My SDL app hangs.

    A. C++ SDL apps typically have a main loop that is an infinite loop, in which event handling, processing and rendering are done, followed by SDL_Delay. However, in JS there is no way for SDL_Delay to actually return control to the browser event loop. To do that, you must exit the currently running code.

    The proper way to do this is to make a C function that runs one iteration of the main loop, then call it from JS at the proper frequency. This is very simple to do manually (just call it from JS; all you need is an underscore at the beginning of the name), but you can also use emscripten_set_main_loop (see emscripten.h) for something a little more convenient.
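
    The manual approach can be sketched in a few lines of JavaScript. This is only an illustration: startMainLoop and frameIntervalMs are hypothetical helper names, and _one_iter in the usage comment assumes a C function called one_iter that runs a single loop iteration.

```javascript
// Converts a target frame rate into a timer interval in milliseconds.
function frameIntervalMs(fps) {
  return 1000 / fps;
}

// Schedules `iterate` to be called repeatedly and returns the timer handle.
// setInterval returns immediately, so control goes back to the event loop
// between iterations instead of being held by an infinite loop.
function startMainLoop(iterate, fps) {
  return setInterval(iterate, frameIntervalMs(fps));
}

// Usage in a page that has loaded the compiled code, assuming a C function
// `one_iter` (exposed to JS with a leading underscore):
//   var handle = startMainLoop(_one_iter, 60);
//   clearInterval(handle);  // stops the loop
```

    Nothing runs synchronously when the loop is started; the first iteration happens only after the current JS code has returned.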

  • Q. How can my compiled program access files?

    A. Emscripten uses a virtual file system that may be preloaded with data or linked to URLs for lazy loading. See the Filesystem Guide for more details.

  • Q. I get an error trying to access __tm_struct_layout (or another C structure used in libc).

    A. You may need to compile the source code with emcc -g. -g tells the compiler to include debug info, which includes metadata about structures which is used to access those structures from Emscripten's JS libc implementation. (Adding -g is a workaround until we have a proper fix for this.)

  • Q. Functions in my C/C++ source code vanish when I compile to JavaScript..?

    A. By default Emscripten does dead code elimination to minimize code size. However, it might end up removing functions you want to call yourself, that are not called from the compiled code (so the LLVM optimizer thinks they are unneeded). If there is no main() function, then by default unused functions are not removed since there is no way to tell which are which, but if there is a main, you may see some functions removed. If you must for some reason have a main() function, you can run emcc with -s LINKABLE=1 which will disable link-time optimizations. Alternatively, you can prevent specific functions from being eliminated by marking them in the C/C++ source code with __attribute__((used)) (that is a gcc extension that clang supports).

    Another issue is that the function may be renamed or removed by the Closure Compiler (which runs in -O2 and above by default). To avoid that, see the EXPORTED_FUNCTIONS option in src/settings.js (so running emcc with something like -s EXPORTED_FUNCTIONS="['_main', '_my_func']" will prevent my_func from being removed/renamed).
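
    If the list of functions to keep is long, the setting can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of Emscripten; it only adds the leading underscore to each C name and formats the list the way the example above does.

```javascript
// Hypothetical helper: builds the value passed after -s from plain C names.
function exportedFunctionsSetting(cNames) {
  // Compiled C functions are referred to with a leading underscore.
  const quoted = cNames.map((name) => "'_" + name + "'");
  return 'EXPORTED_FUNCTIONS=[' + quoted.join(', ') + ']';
}

// prints: EXPORTED_FUNCTIONS=['_main', '_my_func']
console.log(exportedFunctionsSetting(['main', 'my_func']));
```

    Remember to wrap the result in double quotes on the command line so the shell passes it to emcc as a single argument.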

    It can be useful to compile with EMCC_DEBUG=1 (EMCC_DEBUG=1 emcc ..). Then the compilation steps are split up and saved in /tmp/emscripten_temp. You can then see at what stage the code vanishes (you will need to do llvm-dis on the bitcode stages to read them, or llvm-nm, etc.).

    One possible cause of vanishing code is an LLVM LTO bug. If that happens, you will see the code vanish in the LTO stage when using EMCC_DEBUG=1. You can turn LTO off by passing --llvm-lto 0 to emcc, or by setting LINKABLE to 1 as mentioned before.

  • Q. The FS API is not available in -O2 and above, how can I use it?

    A. Closure Compiler will minify the FS API code in -O2 and above. Code that uses the FS API must therefore be optimized by Closure together with the FS API code, so that both end up with the same minified names. To do that, use emcc's --pre-js option; see emcc --help.

  • Q. My code breaks with -O2 and above, giving odd errors..?

    A. The likely problem is that Closure Compiler (which runs in -O2 and above by default) minifies variable names. Names like i,j,xa can be generated, and if other code has such variables in the global scope, bad things can happen.

    To check if this is the problem, compile with -O2 --closure 0. If that works, name minification might be the problem. If so, wrapping the generated code in a closure should fix it. Alternatively, wrap your other code in a closure, or stop it from using short variable names in the global scope. You might be using such variables by mistake: forgetting a var and assigning to a variable puts it in the global scope.

    To 'wrap' code in a closure, do something like this:

    var CompiledModule = (function() {
      .. GENERATED CODE ..
      return Module;
    })();
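
    The effect of the wrapper can be seen in a small experiment. In this sketch the generated string is a made-up stand-in for minified output, and new Function is used only to build the wrapper from text so the example is self-contained; in practice you would place the wrapper around the generated file itself.

```javascript
// Stand-in for minified output: short top-level names plus a Module object.
const generated =
  'var i = 40, j = 2;\n' +
  'var Module = { sum: function () { return i + j; } };\n';

// Equivalent to: (function() { <generated code>; return Module; })()
// The short names i and j become locals of the wrapper function.
const CompiledModule = new Function(generated + 'return Module;')();

console.log(CompiledModule.sum());
console.log(typeof globalThis.i, typeof globalThis.j);
```

    The module still works through its returned object, while i and j never reach the global scope, so they cannot clash with variables in other scripts.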
  • Q. I get an odd python error complaining about libcxx.bc or libcxxabi.bc..?

    A. Possibly building libcxx or libcxxabi failed. Go to system/lib/libcxx (or libcxxabi) and do emmake make to see the actual error.

  • Q. Running LLVM bitcode generated by emcc through lli breaks with errors about impure_ptr stuff..?

    A. First of all, lli is not maintained (sadly) and has odd errors and crashes. However there is tools/nativize_llvm.py which compiles bitcode to a native executable. It will also hit the impure_ptr error though.

    The issue is that newlib uses that impure pointer stuff, while glibc uses something else. So bitcode built with the Emscripten SDK (which emcc does) will not run locally, unless your machine uses newlib (which basically only embedded systems do). The impure_ptr issue is limited, however: it only applies to explicit use of stdout etc. So printf(..) will work, but fprintf(stdout, ..) will not, and it is often simple to modify your code to avoid this problem.
