Refactor Mandelbrot benchmark #152

Merged
merged 3 commits into from
Sep 11, 2018
2 changes: 1 addition & 1 deletion examples/mandelbrot/Cargo.toml
@@ -7,8 +7,8 @@ build = "build.rs"
 [dependencies]
 packed_simd = { path = "../.." }
 rayon = "^1.0"
-time = "^0.1"
 ispc = { version = "^0.3.6", optional = true }
+structopt = "0.2.10"

 [build-dependencies]
 ispc = { version = "^0.3.6", optional = true }
7 changes: 3 additions & 4 deletions examples/mandelbrot/benchmark.sh
@@ -27,10 +27,9 @@ if [[ "${NORUN}" == "1" ]]; then
     exit 0
 fi

-hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} 0"
-hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} 1"
-hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} 2"
+hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} --algo scalar"
+hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} --algo simd"

 if echo "$FEATURES" | grep -q "ispc"; then
-    hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} 3"
+    hyperfine "target/release/mandelbrot ${WIDTH} ${HEIGHT} --algo ispc"
 fi
34 changes: 16 additions & 18 deletions examples/mandelbrot/readme.md
@@ -11,21 +11,22 @@ http://mathworld.wolfram.com/MandelbrotSet.html
 It takes four arguments in this order:

 * `width`: width of the image to render
-* (optional) `height`: height of the image to render - defaults to `width`
-* (optional) `algorithm`: algorithm to use - defaults to the fastest one.
-  * `0`: scalar algorithm
-  * `1`: SIMD algorithm
-* (optional) `output_format`: the output format to use - defaults to `PBM`
-  * `0`: PBM: Portable BitMap format (black & white output)
-  * `1`: PPM: Portable PixMap format (colored output)
+* `height`: height of the image to render
+* `algorithm`: algorithm to use:
+  * `scalar`: scalar algorithm
+  * `simd`: parallelized SIMD algorithm
+  * `ispc`: ISPC + tasks algorithm
+* `--color` (optional): enables colorized output, which also determines the image format.
+  * disabled (default): PBM: Portable BitMap format (black & white output)
+  * enabled: PPM: Portable PixMap format (colored output)

 The resulting image is piped to `stdout`.

-`cargo run --release -- 400 > output.ppm` outputs:
+`cargo run --release -- 400 400 --algo simd > output.ppm` outputs:

 ![run_400_png](https://user-images.githubusercontent.com/904614/43190942-72bdb834-8ffa-11e8-9dcf-a9a9632ae907.png)

-`cargo run --release -- 400 400 1 1 > output.ppm` outputs:
+`cargo run --release -- 400 400 --algo simd --color > output.ppm` outputs:

 ![run_400_400_1_1_png](https://user-images.githubusercontent.com/904614/43190948-759969a4-8ffa-11e8-81a9-35e5baef3e86.png)
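
The new command-line shape described in this readme hunk (`width`, `height`, `--algo <name>`, optional `--color`) could be parsed roughly as below. This is only a sketch using the standard library: the PR itself adds `structopt` to `Cargo.toml`, and the names `Algorithm`, `Options`, and `parse_args` are hypothetical, not taken from the PR. It follows the example commands, which pass the algorithm as `--algo`, and it assumes that `--algo` is required.

```rust
use std::str::FromStr;

/// Algorithm names as listed in the readme (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Algorithm {
    Scalar,
    Simd,
    Ispc,
}

impl FromStr for Algorithm {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "scalar" => Ok(Algorithm::Scalar),
            "simd" => Ok(Algorithm::Simd),
            "ispc" => Ok(Algorithm::Ispc),
            other => Err(format!("unknown algorithm: {}", other)),
        }
    }
}

/// Parsed command line (hypothetical struct).
#[derive(Debug, PartialEq)]
struct Options {
    width: usize,
    height: usize,
    algo: Algorithm,
    color: bool,
}

/// Parses `<width> <height> --algo <name> [--color]`.
/// Assumption: `--algo` is mandatory and flags may appear in any position.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut positional = Vec::new();
    let mut algo = None;
    let mut color = false;

    let mut it = args.into_iter();
    while let Some(arg) = it.next() {
        match arg.as_str() {
            "--algo" => {
                let name = it.next().ok_or("--algo needs a value")?;
                algo = Some(name.parse::<Algorithm>()?);
            }
            "--color" => color = true,
            _ => positional.push(arg),
        }
    }

    if positional.len() != 2 {
        return Err("expected exactly <width> and <height>".to_string());
    }
    let width = positional[0].parse::<usize>().map_err(|e| e.to_string())?;
    let height = positional[1].parse::<usize>().map_err(|e| e.to_string())?;

    Ok(Options {
        width,
        height,
        algo: algo.ok_or("missing --algo")?,
        color,
    })
}
```

Calling `parse_args(std::env::args().skip(1))` would feed it the process arguments; with `--color` absent, `color` stays `false`, which the readme maps to PBM output.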

@@ -40,32 +41,29 @@ On a dual core AVX1 i5 @1.8 GHz:
 | 800 x 800 | time [ms] <br> Rust | speedup vs `scalar` [-] |
 |------------|---------------------|-------------|
 | `scalar` | 86.6 | 1.0x |
-| `simd` | 46.2 | 1.9x |
-| `par_simd` | 21.0 | 4.1x |
+| `simd` | 21.0 | 4.1x |
 | `ispc` | 25.7 | 3.4x |

-`par_simd` algorithm is ~1.2x faster than `ispc`.
+`simd` algorithm is ~1.2x faster than `ispc`.

 On a 28 core Xeon CPU E5-2690 v4 @ 2.60GHz:

 | 800 x 800 | time [ms] <br> Rust | speedup vs `scalar` [-] |
 |------------|---------------------|-------------------------|
 | `scalar` | 50.8 | 1.0x |
-| `simd` | 34.8 | 1.5x |
-| `par_simd` | 25.1 | 2x |
+| `simd` | 25.1 | 2x |
 | `ispc` | 14.4 | 3.52x |

-`par_simd` algorithm is ~1.74x slower than `ispc`.
+`simd` algorithm is ~1.74x slower than `ispc`.

 On a 40 core Xeon Gold 6148 CPU @ 2.40GHz:

 | 800 x 800 | time [ms] <br> Rust | speedup vs `scalar` [-] |
 |------------|---------------------|-------------|
 | `scalar` | 59.9 | 1.0x |
-| `simd` | 46.6 | 1.3x |
-| `par_simd` | 29.9 | 2.0x |
+| `simd` | 29.9 | 2.0x |
 | `ispc` | 30.3 | 2.0x |

-`par_simd` algorithm is as fast as `ispc`.
+`simd` algorithm is as fast as `ispc`.

 [bg]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html#mandelbrot
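
The speedup columns in the tables above appear to be the `scalar` time divided by each algorithm's time, and the "faster than `ispc`" remarks the ratio of the two times being compared. A minimal sketch of that arithmetic, using timings copied from the tables (the helper name `speedup` is hypothetical):

```rust
/// Speedup of an algorithm relative to a baseline, both in milliseconds.
/// A value above 1.0 means the algorithm is faster than the baseline.
fn speedup(baseline_ms: f64, algo_ms: f64) -> f64 {
    baseline_ms / algo_ms
}

/// Formats a speedup with one decimal, in the style of the readme tables.
fn format_speedup(baseline_ms: f64, algo_ms: f64) -> String {
    format!("{:.1}x", speedup(baseline_ms, algo_ms))
}
```

For the dual-core i5 rows, `speedup(86.6, 21.0)` gives the `simd` entry relative to `scalar`, and `speedup(25.7, 21.0)` gives the `simd` versus `ispc` ratio quoted under the table.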
32 changes: 0 additions & 32 deletions examples/mandelbrot/src/ispc_.rs

This file was deleted.

30 changes: 30 additions & 0 deletions examples/mandelbrot/src/ispc_tasks.rs
@@ -0,0 +1,30 @@
+//! Includes the ISPC implementations.
+use *;
+
+ispc_module!(mandelbrot);
+
+pub fn generate(dims: Dimensions, xr: Range, yr: Range) -> Vec<u32> {
+    let (width, height) = dims;
+    let Range { start: left, end: right } = xr;
+    let Range { start: top, end: bottom } = yr;
+
+    let len = width * height;
+    let mut out = Vec::with_capacity(len);
+
+    unsafe {
+        mandelbrot::mandelbrot_ispc(
+            left,
+            bottom,
+            right,
+            top,
+            height as i32,
+            width as i32,
+            ITER_LIMIT as i32,
+            out.as_mut_ptr() as *mut i32,
+        );
+
+        out.set_len(len);
+    }
+
+    out
+}
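
`generate` reserves capacity without initializing it, lets the external kernel write every element through a raw pointer, and only then calls `set_len`. The same pattern can be sketched in plain Rust, with a hypothetical `fill_kernel` standing in for the ISPC function. The values it writes are arbitrary; the point is the order of operations and the safety contract, which holds only if the kernel really writes all `len` elements.

```rust
/// Hypothetical stand-in for an external kernel (not the ISPC function).
/// It writes `len` values through `out`.
///
/// Safety: `out` must point to at least `len` writable `i32` slots.
unsafe fn fill_kernel(out: *mut i32, len: usize) {
    for i in 0..len {
        // `write` stores a value without reading the uninitialized slot.
        unsafe { out.add(i).write(i as i32 * 2) };
    }
}

/// Allocates the buffer, lets the kernel fill it, then fixes up the length.
fn generate(width: usize, height: usize) -> Vec<u32> {
    let len = width * height;
    // Capacity is reserved, but the vector's length is still 0.
    let mut out: Vec<u32> = Vec::with_capacity(len);

    unsafe {
        // `u32` and `i32` have the same size and alignment, so the cast is
        // only a reinterpretation of the element type for the callee.
        fill_kernel(out.as_mut_ptr() as *mut i32, len);
        // Sound only because the kernel initialized all `len` elements.
        out.set_len(len);
    }

    out
}
```

Calling `set_len` before the kernel has run, or with a length the kernel did not fully write, would expose uninitialized memory, which is why the call comes last inside the `unsafe` block.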