Commit 392f09c

Add new blog post.
1 parent 7f5af7c commit 392f09c

1 file changed: 237 additions & 0 deletions
@@ -0,0 +1,237 @@
---
layout: post
title: "Surprises in GopherJS Performance"
date: 2015-09-27
author: Dmitri Shuralyov
---

The GopherJS project first caught my attention about 2 years ago, back when few parts of the Go spec were implemented. However, I noticed the incredible pace at which Richard was working, making multiple sophisticated commits per day, as well as fixing reported compiler issues within hours. A few months later, I decided to download it and give it a try on a relatively [large pure Go package](https://godoc.org/github.com/shurcooL/markdownfmt/markdown?import-graph&hide=1) for formatting Markdown, and I was quite shocked when it... [simply worked](https://github.com/shurcooL/atom-markdown-format/commit/6b5f21c4457309f8eba3a78b82e0c9a458ff13b4).

Since then, GopherJS has made significant progress, both in feature support (by now, full support for goroutines, channels, select statements, and the rest of the Go language spec) and in performance, with quite a few leaps. For example, in [issue 142](https://github.com/gopherjs/gopherjs/issues/142), I reported a case where GopherJS performance was pretty bad, taking nearly 30 seconds to do what native Go did in under 100 ms. Fast forward just a few days, and Richard came up with optimizations that led to a [10x improvement](https://github.com/gopherjs/gopherjs/issues/142#issuecomment-68664354) in performance!

One day, I was perusing the golang.org home page and decided to play with the [concurrent pi](https://play.golang.org/p/RdbPXQcZHi) sample. I wanted to see how much overhead the goroutines added (the sample uses them to demonstrate how lightweight they are compared to threads, but that's still suboptimal for performance), so I converted the program to a purely iterative one. It looked like this:

```Go
// Play with benchmarking a tight loop with many iterations and a func call, compare gc vs GopherJS performance.
package main

import (
	"fmt"
	"math"
	"time"
)

func term(k float64) float64 {
	return 4 * math.Pow(-1, k) / (2*k + 1)
}

// pi performs n iterations to compute an approximation of pi using math.Pow.
func pi(n int32) float64 {
	f := 0.0
	for k := int32(0); k <= n; k++ {
		f += term(float64(k))
	}
	return f
}

func main() {
	// Start measuring time from now.
	started := time.Now()

	const n = 50 * 1000 * 1000
	fmt.Printf("approximating pi with %v iterations.\n", n)
	fmt.Println(pi(n))

	fmt.Println("total time taken is:", time.Since(started))
}
```
48+
49+
I ran the program on my computer:
50+
51+
```bash
52+
$ go run main.go
53+
approximating pi with 50000000 iterations.
54+
3.1415926735902504
55+
total time taken is: 8.358498915s
56+
```
57+
58+
8.35 seconds to perform 50 million iterations, not bad. Then I got curious how long it would take if compiled to JavaScript via GopherJS.
59+
60+
I realized that this is a very tight loop, so any overhead incurred by the conversion of Go to JavaScript would be multiplied and be very visible. Still, I was curious, so fired up GohperJS and ran the same program by compiling it to JavaScript and running it with node:
61+
62+
```bash
63+
$ gopherjs run main.go
64+
approximating pi with 50000000 iterations.
65+
3.1415926735902504
66+
total time taken is: 2.317s
67+
```
68+
69+
23 seconds, that's actually... wait, WHAT!?
70+
71+
2.3 seconds! That's 4 times faster than the native Go version. For a few minutes, I looked at the two numbers in disbelief. Then I decided to investigate what's going on. Is the same code running in both cases? Is the program correct? Is node doing something weird?
72+
73+
I tried running it in the [GopherJS Playground](http://www.gopherjs.org/playground/#/K7r0-q_Jwc), which you can also do:
74+
75+
http://www.gopherjs.org/playground/#/K7r0-q_Jwc
76+
77+
And got the same time in Chrome browser (stable channel).
78+
79+
The calculated value of pi was the same, and after adding some debugging statements I was sure the calculation was indeed correct, and iterations were not being skipped.
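
One independent way to check a result like this is the alternating series bound: after summing terms 0 through n, the error is at most the first omitted term, 4/(2n+3). Here is a minimal sketch of that check. The names `leibniz` and `withinBound` are hypothetical helpers for illustration, not part of the original program or its debugging statements.

```Go
package main

import (
	"fmt"
	"math"
)

// leibniz sums terms 0..n of the series 4 * (-1)^k / (2k+1) and also
// returns how many loop iterations actually ran.
func leibniz(n int32) (sum float64, iterations int64) {
	for k := int32(0); k <= n; k++ {
		sign := 1.0
		if k%2 != 0 {
			sign = -1.0
		}
		sum += sign * 4 / (2*float64(k) + 1)
		iterations++
	}
	return sum, iterations
}

// withinBound reports whether approx is within the alternating series
// error bound 4/(2n+3) of math.Pi.
func withinBound(approx float64, n int32) bool {
	return math.Abs(approx-math.Pi) <= 4/(2*float64(n)+3)
}

func main() {
	const n = 1000 * 1000
	sum, iterations := leibniz(n)
	fmt.Println("sum:", sum)
	fmt.Println("iterations:", iterations, "expected:", int64(n)+1)
	fmt.Println("within error bound:", withinBound(sum, n))
}
```

If iterations were being skipped, the count would come up short and the sum would most likely fall outside the bound.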
80+
81+
But how could it be that taking this Go program and compiling it to JavaScript and executing that would be 4 times faster? I had to get to the bottom of it.
82+
83+
The first thing I needed to ensure, was the same code being run in both cases? The entire code is plain Go, with the exception of `math.Pow`. So I looked at how [Go implements it](http://gotools.org/math#Pow). Pretty straightforward Go code. Now I knew GopherJS uses some JavaScript native APIs to implement parts of the standard library, so I checked how [it implemented `math.Pow`](https://github.com/gopherjs/gopherjs/blob/master/compiler/natives/math/math.go#L157). Aha! It's not the same code after all. GohperJS implements it by using [JavaScript's `Math` object](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math), so it translates to:
84+
85+
```JavaScript
86+
Math.pow(x, y)
87+
```
88+
89+
That's when it hit me. In this code, which was taken from a snippet that optimized for brevity and demonstration purposes rather than performance, `math.Pow` was being used with th first argument of -1, and the second argument are values 0, 1, 2, 3, etc., in sequence. The output of that is an alternating sequence of 1, -1, 1, -1, 1, -1, etc. But using `math.Pow` for that is extremely inefficient, since it's meant to work with arbitrary inputs that are much harder to calculate. This can be trivially rewritten with an if statement.
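
As a quick sanity check on that equivalence, here is a small sketch comparing the two ways of producing the alternating sign. The names `signPow` and `signBranch` are hypothetical, chosen just for this illustration.

```Go
package main

import (
	"fmt"
	"math"
)

// signPow returns (-1)^k using math.Pow, the way the original term function does.
func signPow(k int) float64 {
	return math.Pow(-1, float64(k))
}

// signBranch returns the same alternating sign using a parity check.
func signBranch(k int) float64 {
	if k%2 == 0 {
		return 1
	}
	return -1
}

func main() {
	// Print both signs side by side for the first few values of k.
	for k := 0; k < 6; k++ {
		fmt.Println(k, signPow(k), signBranch(k))
	}
}
```

Both functions should agree for every non-negative k, which is what makes the if statement a drop-in replacement.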
90+
91+
So, in order to ensure the same code runs in both cases, I did that and tried this program:
92+
93+
```Go
94+
// Play with benchmarking a tight loop with many iterations and a func call, compare gc vs GopherJS performance.
95+
//
96+
// An alternative more close-to-metal implementation that doesn't use math.Pow.
97+
package main
98+
99+
import (
100+
"fmt"
101+
"time"
102+
)
103+
104+
func term(k int) float64 {
105+
if k%2 == 0 {
106+
return 4 / (2*float64(k) + 1)
107+
} else {
108+
return -4 / (2*float64(k) + 1)
109+
}
110+
}
111+
112+
// pi performs n iterations to compute an approximation of pi.
113+
func pi(n int) float64 {
114+
f := 0.0
115+
for k := int(0); k <= n; k++ {
116+
f += term(k)
117+
}
118+
return f
119+
}
120+
121+
func main() {
122+
// Start measuring time from now.
123+
started := time.Now()
124+
125+
const n = 1000 * 1000 * 1000
126+
fmt.Printf("approximating pi with %v iterations.\n", n)
127+
fmt.Println(pi(n))
128+
129+
fmt.Println("total time taken is:", time.Since(started))
130+
}
131+
```
132+
133+
Let's try that:
134+
135+
```bash
136+
$ go run main.go
137+
approximating pi with 1000000000 iterations.
138+
3.1415926545880506
139+
total time taken is: 10.916861037s
140+
141+
$ gopherjs run main.go
142+
approximating pi with 1000000000 iterations.
143+
3.1415926545880506
144+
total time taken is: 6.585s
145+
```
146+
147+
I had to bump up the number of iterations to 1 billion, because this code runs so much faster than the naive `math.Pow`-using version, in both cases. But GopherJS version is still faster.
148+
149+
Aha, then I realized that [GopherJS emulates a 32-bit architecture](https://github.com/gopherjs/gopherjs#architecture). But I'm running native Go on a 64-bit machine. So the size of `int` is 32-bit for GopherJS code but 64-bit for Go code. Let's make it use `int32` consistently and try again:
150+
151+
```bash
152+
$ gopherjs run main.go
153+
approximating pi with 1000000000 iterations.
154+
3.1415926545880506
155+
total time taken is: 6.658s
156+
157+
$ gopherjs run main.go
158+
approximating pi with 1000000000 iterations.
159+
3.1415926545880506
160+
total time taken is: 6.549s
161+
```
162+
163+
As expected, the GopherJS time did not change because it was a no-op, but the native Go performance has now caught up to the GopherJS version!
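
For reference, here is a sketch of what the `int32` variant of the two functions could look like, assuming only the integer types changed. The names `term32` and `pi32` are hypothetical, and the iteration count is reduced so it finishes quickly. It also prints `strconv.IntSize`, the width of `int` in bits for the current target, which should differ between a 64-bit native build and a GopherJS build.

```Go
package main

import (
	"fmt"
	"strconv"
)

// term32 is the branch-based term function with its parameter narrowed to int32.
func term32(k int32) float64 {
	if k%2 == 0 {
		return 4 / (2*float64(k) + 1)
	}
	return -4 / (2*float64(k) + 1)
}

// pi32 sums terms 0 through n using a 32-bit loop counter on every platform.
func pi32(n int32) float64 {
	f := 0.0
	for k := int32(0); k <= n; k++ {
		f += term32(k)
	}
	return f
}

func main() {
	// strconv.IntSize is the size in bits of an int value for this build target.
	fmt.Println("int size in bits:", strconv.IntSize)
	fmt.Println(pi32(1000 * 1000))
}
```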
164+
165+
Just to be sure, I wanted to see if 6.5 seconds was as fast as these 1 billion iterations could happen, even if you were to implement this in a low level language like C:
166+
167+
```C
168+
#include <stdio.h>
169+
#include <time.h>
170+
171+
double term(int k) {
172+
if (k%2 == 0) {
173+
return 4.0 / (2.0*(double)(k) + 1.0);
174+
} else {
175+
return -4.0 / (2.0*(double)(k) + 1.0);
176+
}
177+
}
178+
179+
// pi performs n iterations to compute an approximation of pi.
180+
double pi(int n) {
181+
double f = 0.0;
182+
for (int k = 0; k <= n; k++) {
183+
f += term(k);
184+
}
185+
return f;
186+
}
187+
188+
int main() {
189+
int n = 1000 * 1000 * 1000;
190+
printf("approximating pi with %d iterations.\n", n);
191+
printf("%.16f\n", pi(n));
192+
193+
return 0;
194+
}
195+
```
196+
197+
The [timing library](http://en.cppreference.com/w/c/chrono) of C isn't as friendly to use as the Go time package, so I gave up and just used `time` instead:
198+
199+
```bash
200+
$ gcc main.c
201+
$ time ./a.out
202+
approximating pi with 1000000000 iterations.
203+
3.1415926545880506
204+
205+
real 0m11.385s
206+
user 0m11.377s
207+
sys 0m0.006s
208+
```
209+
210+
11.3 seconds? Slower? Ah, of course, I was too used to `go` build tool that uses optimization by default, and forgot that C compilers don't do that.
211+
212+
```bash
213+
$ gcc -O3 main.c
214+
$ time ./a.out
215+
approximating pi with 1000000000 iterations.
216+
3.1415926545880506
217+
218+
real 0m6.434s
219+
user 0m6.427s
220+
sys 0m0.004s
221+
```
222+
223+
Nice, it's the same time as the Go and GopherJS versions. So that means a few things.
224+
225+
The V8 JavaScript Engine is incredible. It's able to take Go code that is compiled to JavaScript code, and just-in-time compile to it to machine instructions that are as efficient as the native Go compiler.
226+
227+
The JavaScript `Math.pow` implementation clearly has a fast-path for when value of x is -1 and values of y are integers. So for those inputs, it's faster. Using `Pow` with such inputs is silly and should not be done, as you can see by the 50 million to 1 billion iteration increase when rewriting it with an equivalent if statement. Go chooses not to have a fast path for such inputs, which makes it faster in the general case, and I'm glad it makes that choice.
228+
229+
You can try the final optimized version of GopherJS in your browser via the GopherJS Playground:
230+
231+
http://www.gopherjs.org/playground/#/sDEYM2TwC7
232+
233+
It's fascinating to think about what happens when you do that. The GopherJS compiler, written in pure Go, has compiled itself to JavaScript, which runs in your browser. That compiler takes your input Go program, compiles it to JavaScript and runs it. The V8 engine (or whatever JavaScript engine your browser uses) takes the generated JavaScript and JITs it to the equivalent machine code as produced by a low-level C implementation compiled with -O3, the max optimization setting.
234+
235+
There are still cases where the code GopherJS generates does not translate to something JS engines can optimize really well. For example, in [issue 276](https://github.com/gopherjs/gopherjs/issues/276), GopherJS version runs an unusually 1000x slower than native version. But I'm sure with some work, significant performance improvements can be made there, and in most other cases the performance is much better.
236+
237+
With the prospects of asm.js and the upcoming WebAssembly, I think there's a bright future for having Go language as a viable choice for the browser. I suggest you give it a try for your next little frontend project, or play with compiling any pure Go package to run in the browser. You may end up being pleasantly surprised, like I was.
