LLVM Backend

The original emscripten compiler was written in JavaScript, which was very useful for quickly prototyping new ideas during development of the various new methods needed for effective compilation to JavaScript (the relooper, longjmp tricks, C++ exceptions in JS, etc.). It is also quite stable at this point and generates very good code. However, it has a few downsides:

Compiler speed. The generated code is fast, but generating the code is not so fast. Especially with full optimizations on, builds can be quite slow. This is not an issue for tens of thousands of lines of code, and is annoying but not horrible for hundreds of thousands, but it is a serious problem for millions.
LLVM backends integrate more closely with LLVM, and can leverage LLVM's internal code analysis and optimization. The original compiler just parses LLVM bitcode externally, so it cannot benefit from internal capabilities of LLVM.
An upstream LLVM backend is easier for people to use than a separate project. Compiling to JS should, as much as possible, be just another backend in a compiler.

The plan is to start work over Summer 2012.

Guidelines and issues:

We will use the C++ Relooper implementation https://github.com/kripken/Relooper
Focus on the C-style memory layout method. Other approaches (no typed arrays, unaliasing typed arrays) will only be done by the original compiler.
When possible, do native JS function calls f(x,y,z) and not reads/writes from the C stack. This is tricky with varargs, but perhaps possible even there with internal LLVM changes.
It is far better to emit x = (a+b)/z than t = a+b; x = t/z, but it is unclear how easy that is to do in an LLVM backend.
More advanced static analysis, written in C++, than the current compiler performs should allow removal of a lot of unnecessary address shifting.
See https://bugzilla.mozilla.org/show_bug.cgi?id=771106 for some optimizations we should implement. Also https://bugzilla.mozilla.org/show_bug.cgi?id=771285#c5
To get started we will not create an object format for JavaScript; we can continue to use the emcc wrapper, which uses clang with LLVM bitcode as the intermediate object format. So the initial goal is just to generate JS directly in the backend, that is, from LLVM IR in memory.
Some initial work by Ehsan on Emscripten support in LLVM and clang is in:
https://github.com/ehsan/llvm/commit/ad4c8c52f68a1694cbb66fe861f325928ca04d7c
https://github.com/ehsan/clang/commit/3a8eff2f5646605d949222032422a12967b34790
LLVM already has a target triple ArchType of le32, with the comment "generic little-endian 32-bit CPU (PNaCl / Emscripten)". We should presumably use that?
Of the existing backends, the simplest is CppBackend, but it might be too simple. Sparc seems to be the smallest "real" backend.
Should we call this backend plus Emscripten "Emscripten 2.0"?
Should we call the LLVM backend itself "JS" or "Emscripten" internally in LLVM?
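
The preference stated above for x = (a+b)/z over t = a+b ; x = t/z can be sketched outside of LLVM. The following is only an illustration in Python, not the backend's design: it assumes straight-line code in SSA form with side-effect-free binary operators, and the Instr type and fold_single_use function are hypothetical names invented for this example. A temporary that is used exactly once is inlined into its single user; anything used more than once stays a named variable.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """One SSA-style binary instruction: dest = lhs op rhs (hypothetical type)."""
    dest: str
    op: str
    lhs: str
    rhs: str


def fold_single_use(instrs):
    """Return JS-like statements with single-use temporaries inlined.

    Assumes every dest is defined once and all operators are pure, so moving
    an expression to its only use cannot change the result.
    """
    # Count how many times each name appears as an operand.
    uses = {}
    for ins in instrs:
        for operand in (ins.lhs, ins.rhs):
            uses[operand] = uses.get(operand, 0) + 1

    pending = {}     # single-use temporaries waiting to be inlined
    statements = []
    for ins in instrs:
        # Substitute a pending expression for an operand if there is one.
        lhs = pending.pop(ins.lhs, ins.lhs)
        rhs = pending.pop(ins.rhs, ins.rhs)
        text = f"{lhs}{ins.op}{rhs}"
        if uses.get(ins.dest, 0) == 1:
            # Used exactly once later: keep it as a parenthesized subexpression.
            pending[ins.dest] = f"({text})"
        else:
            # Unused or used several times: emit a real assignment.
            statements.append(f"{ins.dest} = {text}")
    return statements


example = [Instr("t", "+", "a", "b"), Instr("x", "/", "t", "z")]
folded = fold_single_use(example)
print(folded)
```

How easily the same idea maps onto an LLVM backend's instruction selection is exactly the open question raised in the guideline; this sketch only shows the intended output shape.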
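
The C-style memory layout and the "address shifting" mentioned above can be illustrated with a small model. This is a sketch under assumptions, not the compiler's output: it models a single heap buffer with aliasing byte and 32-bit views (named HEAP8 and HEAP32 here for illustration), assumes the platform's native int is 4 bytes, and reads "address shifting" as the ptr >> 2 needed to turn a byte address into a 32-bit index. The two summing functions are hypothetical; the second shows what an analysis that proves 4-byte alignment could emit.

```python
import sys

HEAP_SIZE = 64                    # hypothetical tiny heap, in bytes
buffer = bytearray(HEAP_SIZE)
HEAP8 = memoryview(buffer)        # byte view of the heap
HEAP32 = HEAP8.cast("i")          # aliasing 32-bit view (assumes 4-byte native int)


def store_i32(ptr, value):
    """Store a 32-bit value at byte address ptr (ptr must be 4-aligned)."""
    HEAP32[ptr >> 2] = value


def sum_i32_shift_each(ptr, count):
    """Naive form: shift the byte address on every access."""
    total = 0
    for i in range(count):
        total += HEAP32[(ptr + 4 * i) >> 2]
    return total


def sum_i32_preshifted(ptr, count):
    """Form an analysis could allow: shift once, then index directly."""
    index = ptr >> 2              # valid only because ptr is known 4-aligned
    total = 0
    for i in range(count):
        total += HEAP32[index + i]
    return total


for offset, value in enumerate([1, 2, 3]):
    store_i32(16 + 4 * offset, value)

naive_sum = sum_i32_shift_each(16, 3)
fast_sum = sum_i32_preshifted(16, 3)
first_bytes = bytes(HEAP8[16:20])  # the same memory seen through the byte view
print(naive_sum, fast_sum)
```

Both functions read the same memory; the difference is only how many shifts are executed, which is the kind of saving the guideline hopes better static analysis can provide.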

Setup

We track LLVM and Clang svn through their git mirror: http://llvm.org/docs/GettingStarted.html#git_mirror
Our repos are:
https://github.com/kripken/llvm-js
https://github.com/kripken/clang-js
Updating from svn: Pull from the git mirrors with --rebase (as also recommended on the LLVM link above), then push to our GitHub repos.
