[Loops] Adds occa::forLoop #465

Merged
merged 13 commits into from
Jan 20, 2021

dmed256
Member

@dmed256 dmed256 commented Jan 19, 2021

Description

Similar to occa::array, occa::forLoop lets us build for-loop kernels inline. An occa::scope injects the kernel arguments and adds compile-time defines.

  occa::scope scope({
    {"output", output}
  }, {
    {"defines/length", length}
  });

  occa::forLoop()
    .outer(2)
    .inner(length)
    .run(OCCA_FUNCTION(scope, [=](const int outerIndex, const int innerIndex) -> void {
      const int globalIndex = outerIndex + (2 * innerIndex);
      output[globalIndex] = globalIndex;
    }));
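
For intuition, the example above behaves roughly like the serial C++ below. This is only a sketch, not the code OCCA generates: `fillOutput` is a hypothetical helper, the real loops run in parallel on the device, and it assumes `output` holds `2 * length` entries (the PR does not state its size). It shows that `outerIndex + (2 * innerIndex)` visits every index in `[0, 2 * length)` exactly once.

```cpp
#include <vector>

// Hypothetical serial stand-in for the occa::forLoop example:
//   .outer(2)      -> outerIndex in [0, 2)
//   .inner(length) -> innerIndex in [0, length)
// Assumption: output has 2 * length entries.
std::vector<int> fillOutput(const int length) {
  std::vector<int> output(2 * length, -1);
  for (int outerIndex = 0; outerIndex < 2; ++outerIndex) {
    for (int innerIndex = 0; innerIndex < length; ++innerIndex) {
      // Same index arithmetic as the lambda body above.
      const int globalIndex = outerIndex + (2 * innerIndex);
      output[globalIndex] = globalIndex;
    }
  }
  return output;
}
```

Calling `fillOutput(4)` yields a vector of 8 entries in which each entry equals its own index.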

Iterators

.outer() and .inner() each accept 1, 2, or 3 arguments, and each argument can be one of the following types:

  • int N which generates a loop between [0, N)
  • occa::range which generates a loop given the range start, end, and step definition
  • occa::array<int> which iterates through the indices of the array
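
The three iterator kinds can be pictured as three ways of producing a list of loop indices. The sketch below is plain C++ and does not use OCCA: `Range` and `indicesOf` are hypothetical names, the range is assumed to be half-open with a positive step, and the array case assumes the stored values are used as the loop indices.

```cpp
#include <vector>

// Hypothetical stand-in for occa::range.
// Assumption: half-open interval [start, end) with a positive step.
struct Range {
  int start;
  int end;
  int step;
};

// int N -> indices 0, 1, ..., N - 1
std::vector<int> indicesOf(const int N) {
  std::vector<int> indices;
  for (int i = 0; i < N; ++i) {
    indices.push_back(i);
  }
  return indices;
}

// Range -> start, start + step, ... while below end
std::vector<int> indicesOf(const Range &range) {
  std::vector<int> indices;
  for (int i = range.start; i < range.end; i += range.step) {
    indices.push_back(i);
  }
  return indices;
}

// Array of ints -> the stored values are visited in order
// (assumed interpretation of "iterates through the indices of the array").
std::vector<int> indicesOf(const std::vector<int> &values) {
  return values;
}
```

For example, `indicesOf(3)` gives `0, 1, 2`, and `indicesOf(Range{1, 8, 3})` gives `1, 4, 7`.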

For-loop body

Based on the .outer and .inner argument counts, the for-loop body expects a lambda whose parameter types match.

A few examples:

  • .outer(N) -> [=](const int outerIndex) -> void {}
  • .outer(N, N) -> [=](const int2 outerIndex) -> void {}
  • .outer(N, N).inner(N) -> [=](const int2 outerIndex, const int innerIndex) -> void {}
  • .outer(N).inner(N, N) -> [=](const int outerIndex, const int2 innerIndex) -> void {}
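
To make the arity rule concrete, here is a plain C++ emulation of the `.outer(N, N)` case. It is a sketch under stated assumptions: `Int2` stands in for `occa::int2`, `runOuter2` is a hypothetical helper, and the choice of `x` for the first iterator and `y` for the second is an assumption. Typing the body as `std::function<void(const Int2)>` means a lambda taking a plain `const int` is rejected at compile time, which mirrors the behavior described here.

```cpp
#include <functional>

// Hypothetical stand-in for occa::int2.
struct Int2 {
  int x;
  int y;
};

// Emulates .outer(n0, n1): two iterators, so the body receives a single Int2.
// Assumption: x follows the first iterator and y the second.
void runOuter2(const int n0, const int n1,
               const std::function<void(const Int2)> &body) {
  for (int y = 0; y < n1; ++y) {
    for (int x = 0; x < n0; ++x) {
      const Int2 outerIndex = {x, y};
      body(outerIndex);
    }
  }
}
```

A call such as `runOuter2(2, 3, [&](const Int2 outerIndex) { /* ... */ });` invokes the body six times, once per `(x, y)` pair.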

If the lambda's parameter types don't match the .outer().inner() definitions, the compiler will report an error.

@outer-only loops

The use of @shared memory can be crucial for some implementations. To support it, occa::forLoop can generate only the @outer loops and leave the @inner loop implementations to the user.

Note the unusual OKL("..."); calls. They inject source code verbatim to work around two compiler restrictions:

  1. The lambda is actually compiled, so it must be valid C++11. This means OKL attributes can't be placed inline.
  2. The source code is loaded by stringifying the lambda. Unfortunately, the preprocessor doesn't keep newlines, so OKL(<source-code>) can be used to bypass this issue. This is useful for setting up directives, such as #if / #endif.

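
The newline loss in point 2 is standard preprocessor behavior and can be reproduced without OCCA. The sketch below uses a generic stringification macro; `STRINGIFY_SOURCE` and `stringifiedSource` are illustrative names and not OCCA's actual macros. Each run of whitespace between tokens, including line breaks, collapses to a single space, so line-based constructs such as `#if` cannot survive the round trip.

```cpp
#include <string>

// Generic stringification macro (illustrative; not OCCA's actual macro).
#define STRINGIFY_SOURCE(...) #__VA_ARGS__

// Returns the two statements as a string. The line break between them
// is replaced by a single space during stringification.
std::string stringifiedSource() {
  return STRINGIFY_SOURCE(
    int a = 1;
    int b = 2;
  );
}
```

Here `stringifiedSource()` returns `"int a = 1; int b = 2;"` on one line, with no newline characters.
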
  occa::forLoop()
    .outer(length)
    .run(OCCA_FUNCTION(scope, [=](const int outerIndex) -> void {
      OKL("@shared"); int array[2];

      OKL("@inner");
      for (int i = 0; i < 2; ++i) {
        array[i] = i;
      }

      OKL("@inner");
      for (int i = 0; i < 2; ++i) {
        output[i] = array[1 - i];
      }
    }));
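
Read serially, the @outer-only example amounts to the plain C++ below. This is a rough emulation and not what OCCA generates: `runOuterOnly` is a hypothetical helper, the local `array` stands in for the `@shared` buffer, and the two `@inner` loops simply run one after the other. It assumes `output` has at least 2 entries.

```cpp
#include <vector>

// Hypothetical serial emulation of the @outer-only example.
// Assumption: output has at least 2 entries.
std::vector<int> runOuterOnly(const int length) {
  std::vector<int> output(2, -1);
  for (int outerIndex = 0; outerIndex < length; ++outerIndex) {
    // Stands in for the @shared array: one copy per outer iteration.
    int array[2];

    // First @inner loop: fill the shared buffer.
    for (int i = 0; i < 2; ++i) {
      array[i] = i;
    }

    // Second @inner loop: read the buffer in reverse order.
    for (int i = 0; i < 2; ++i) {
      output[i] = array[1 - i];
    }
  }
  return output;
}
```

Every outer iteration writes the same two values, so for any `length >= 1` the result is `output[0] == 1` and `output[1] == 0`.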

TODOs

  • Handle const on scope injected arguments properly

  • Handle occa::int# in OKL to avoid having to use

    using int2 = occa::int2;
    using int3 = occa::int3;

Resolves #462

@dmed256 dmed256 force-pushed the loops/adds-for-loop branch from fac6ac3 to fc0c80c Compare January 20, 2021 23:43
@codecov

codecov bot commented Jan 20, 2021

Codecov Report

Merging #465 (fc0c80c) into main (98b856c) will increase coverage by 0.11%.
The diff coverage is 79.56%.

@@            Coverage Diff             @@
##             main     #465      +/-   ##
==========================================
+ Coverage   75.25%   75.36%   +0.11%     
==========================================
  Files         254      260       +6     
  Lines       19396    19605     +209     
==========================================
+ Hits        14596    14775     +179     
- Misses       4800     4830      +30     

Impacted Files                              Coverage Δ
include/occa/functional/baseFunction.hpp    0.00% <0.00%> (ø)
include/occa/functional/function.hpp        69.23% <ø> (ø)
include/occa/functional/scope.hpp           90.90% <ø> (+18.18%) ⬆️
include/occa/functional/utils.hpp           46.42% <ø> (ø)
include/occa/types/json.hpp                 77.18% <ø> (ø)
src/functional/scope.cpp                    72.72% <33.33%> (-2.75%) ⬇️
include/occa/functional/array.hpp           88.04% <55.88%> (-1.96%) ⬇️
src/occa/internal/lang/specialMacros.cpp    60.26% <62.85%> (+0.78%) ⬆️
src/functional/range.cpp                    90.27% <66.66%> (-9.73%) ⬇️
src/loops/iteration.cpp                     79.16% <79.16%> (ø)
... and 25 more

@dmed256 dmed256 marked this pull request as ready for review January 20, 2021 23:52
@dmed256 dmed256 merged commit c8dd65a into main Jan 20, 2021
@dmed256 dmed256 deleted the loops/adds-for-loop branch January 20, 2021 23:53
Development

Successfully merging this pull request may close these issues.

Add occa::forLoop
1 participant