[SYCL] Weird performance behaviour #813

Closed
@cagnulein

Hi, I'm playing around with SYCL and I'm seeing really strange performance behaviour in a small test program.

Let's start from this small program:

#include <CL/sycl.hpp>
#include <iostream>
#include <math.h>
#include <chrono>

#define IMAGE_WIDTH     (20000L)
#define IMAGE_HEIGHT    (40000L)
#define IMAGE_SIZE      (IMAGE_WIDTH*IMAGE_HEIGHT)

unsigned char* old_image;
namespace sycl = cl::sycl;

int main(int argc, char *argv[]) {
    old_image = new unsigned char[IMAGE_SIZE];
    {
        sycl::queue myQueue(sycl::gpu_selector{});
        sycl::buffer<unsigned char, 1> inputBuf(old_image, sycl::range<1>(IMAGE_WIDTH*IMAGE_HEIGHT));
        myQueue.submit([&](sycl::handler& cgh) {
            auto readImage = inputBuf.get_access<sycl::access::mode::read>(cgh);
            cgh.parallel_for<class simple_test>(sycl::range<1>(1821303172), [=](sycl::id<1> idx) {
            });
        });
    }
    return 0;
}

This program runs in about one second:
sysele@sysele-C08:~/work/sycl/rotate$ time ./rotate.gpu

real 0m0.978s
user 0m0.172s
sys 0m0.084s

If I decrease the number of iterations of the parallel_for, using 1761303172 instead of 1821303172, the run takes more than four times as long:

sysele@sysele-C08:~/work/sycl/rotate$ time ./rotate.gpu

real 0m4.541s
user 0m0.190s
sys 0m0.065s

Any explanations? The timings are deterministic: they reproduce on every run.
I'm using an Intel(R) Core(TM) i7-7820EQ CPU @ 3.00GHz with the latest SYCL from Git.
I've tried both oclcpuexp-2019.9.11.0.1106_rel.tar.gz and oclcpuexp-2019.8.8.0.0822_rel.tar.gz, and the issue occurs with both.
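
The figures above come from `time`, which covers the whole process: runtime startup, the 800,000,000-byte host allocation, and the kernel submission together. To narrow down where the extra seconds go, a minimal sketch like the one below could be used to time only a chosen region with std::chrono, which the program already includes but does not use. The helper name elapsed_ms and the stand-in workload are hypothetical and not part of SYCL; in the reproducer, the lambda would wrap the scope that creates the queue and buffer and submits the kernel.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <utility>

// Hypothetical helper (not part of SYCL): runs `work` once and returns the
// elapsed wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Work>
double elapsed_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Demo with a stand-in workload. In the reproducer, the lambda body would be
// the block that constructs the queue and buffer and submits the kernel, so
// that the buffer destructor (which waits for the kernel) is included.
void run_demo() {
    volatile unsigned long sink = 0;
    const double ms = elapsed_ms([&] {
        for (unsigned long i = 0; i < 1000000UL; ++i) {
            sink = sink + i;
        }
    });
    // steady_clock never goes backwards, so the result is non-negative.
    assert(ms >= 0.0);
    std::cout << "region took " << ms << " ms\n";
}
```

Calling run_demo() from main prints the elapsed time of the stand-in loop. Timing the SYCL scope this way for both range sizes would show whether the difference sits in the kernel submission and completion or elsewhere in the process.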

Labels: Stale, bug (Something isn't working), performance (Performance related issues)
