Add occa::forLoop #462

Closed
dmed256 opened this issue Jan 19, 2021 · 0 comments · Fixed by #465
dmed256 commented Jan 19, 2021

API

Basic Example

Here we generate a for-loop that iterates over [0, N), tiled by tileSize:

occa::forLoop()
  .outer(occa::range(0, N, tileSize))
  .inner(tileSize)
  .run(scope, OCCA_FUNCTION([=](const int outerIndex, const int innerIndex) -> void {
    // ...
  }));
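For comparison, here is a plain serial C++ sketch of the iteration space the chain above is meant to describe. It is not OCCA code. It assumes that outerIndex is the start of each tile and innerIndex is the offset within the tile, and that the last tile may be partial and needs a bounds guard; neither detail is spelled out in the proposal, and visitTiled is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Plain serial C++ equivalent of the tiled loop; not OCCA code.
// Assumption: outerIndex is the start of a tile and innerIndex is the
// offset inside it, so the element index is outerIndex + innerIndex.
std::vector<int> visitTiled(const int N, const int tileSize) {
  assert(tileSize > 0);
  std::vector<int> visited;
  // Corresponds to .outer(occa::range(0, N, tileSize))
  for (int outerIndex = 0; outerIndex < N; outerIndex += tileSize) {
    // Corresponds to .inner(tileSize)
    for (int innerIndex = 0; innerIndex < tileSize; ++innerIndex) {
      const int i = outerIndex + innerIndex;
      if (i < N) {  // guard the last, possibly partial, tile
        visited.push_back(i);
      }
    }
  }
  return visited;
}
```

Under these assumptions every index in [0, N) is visited exactly once, in increasing order, even when N is not a multiple of tileSize.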

Indices + Multiple Dimensions

Here an index array is passed instead of a simple occa::range.
The @inner loop also has 2 dimensions, so the OCCA_FUNCTION is expected to take an int2 for the inner indices:

occa::array<int> indices;
// ...
occa::forLoop()
  .outer(indices)
  .inner(X, Y)
  .run(scope, OCCA_FUNCTION([=](const int outerIndex, const int2 innerIndex) -> void {
    // ...
  }));
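A serial C++ sketch of what this second form could expand to, again not OCCA code. The int2 struct below is a hypothetical stand-in for the type named above, and two behaviours are assumptions rather than something the proposal states: outerIndex receives the values stored in the index array (not their positions), and the x component of the inner index varies fastest.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the int2 type mentioned in the proposal.
struct int2 { int x, y; };

// One record per call the loop body would receive.
struct Visit { int outerIndex; int2 innerIndex; };

// Assumptions: outerIndex takes the values stored in `indices`,
// and x is the fastest-varying inner dimension.
std::vector<Visit> visitIndexed(const std::vector<int> &indices,
                                const int X, const int Y) {
  assert(X > 0 && Y > 0);
  std::vector<Visit> visits;
  for (const int outerIndex : indices) {      // .outer(indices)
    for (int y = 0; y < Y; ++y) {             // .inner(X, Y), second dimension
      for (int x = 0; x < X; ++x) {           // .inner(X, Y), first dimension
        visits.push_back({outerIndex, {x, y}});
      }
    }
  }
  return visits;
}
```

The body is called indices.size() * X * Y times in total, once per (outer, inner) pair.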

Implementation

It would be nice to template forLoop<outerSize, innerSize> to avoid code repetition, but there are a few not-so-great UX issues:

  • Template errors would be hard to decipher
  • No easy way to set intTuple<outerSize> -> int / int2 / int3
  • No way that I'm aware of to enforce the only-outer or outer+inner loop combinations when calling run
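To make the second point concrete, this is roughly what a size-to-type mapping would look like with explicit specializations. It is only a sketch: int2 and int3 are hypothetical stand-ins for the OCCA types, and intTuple_t is an alias name invented here. It works, but every use site needs the typename/alias indirection, which feeds straight into the hard-to-read template errors from the first point.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the int2 / int3 types.
struct int2 { int x, y; };
struct int3 { int x, y, z; };

// Primary template is declared but not defined, so any size other
// than 1, 2, or 3 fails to compile.
template <int N> struct intTuple;
template <> struct intTuple<1> { using type = int; };
template <> struct intTuple<2> { using type = int2; };
template <> struct intTuple<3> { using type = int3; };

// Alias (name invented for this sketch) to hide the typename boilerplate.
template <int N>
using intTuple_t = typename intTuple<N>::type;

static_assert(std::is_same<intTuple_t<1>, int>::value,
              "size 1 maps to a plain int");
```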

We could also add tile methods to make it simple to generate the outer/inner loop combinations.

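// Pseudocode: <O> and <I> stand for the number of outer / inner
// dimensions (1, 2, or 3), each written out as its own class
class outerForLoop<O> : public outerForLoop {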
  occa::iteration outerIterations[<O>];
  
  class innerForLoop<I> : public innerForLoop {
    occa::iteration innerIterations[<I>];

    void run(occa::function<void(int<O>, int<I>)> fn);

    void run(occa::scope scope,
             occa::function<void(int<O>, int<I>)> fn);
  };
    
  innerForLoop1 inner(occa::iteration iteration1);
  
  innerForLoop2 inner(occa::iteration iteration1,
                      occa::iteration iteration2);
  
  innerForLoop3 inner(occa::iteration iteration1,
                      occa::iteration iteration2,
                      occa::iteration iteration3);

  void run(occa::function<void(int<O>)> fn);

  void run(occa::scope scope,
           occa::function<void(int<O>)> fn);
};  

class forLoop {
  forLoop();
  
  outerForLoop1 outer(occa::iteration iteration1);
  
  outerForLoop2 outer(occa::iteration iteration1,
                      occa::iteration iteration2);
  
  outerForLoop3 outer(occa::iteration iteration1,
                      occa::iteration iteration2,
                      occa::iteration iteration3);
};
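A minimal, compilable mock of the 1-outer / 1-inner slice of this design, to show how the return types drive the chain. It is a sketch and not the OCCA implementation: iteration here is a hypothetical stand-in for occa::iteration that only carries a trip count, std::function replaces occa::function, the loops run serially, and countCalls is a helper invented for the example.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for occa::iteration: only a trip count.
struct iteration { int size; };

class innerForLoop1 {
 public:
  innerForLoop1(iteration outer, iteration inner)
      : outer_(outer), inner_(inner) {}

  // Outer + inner: the body takes both indices.
  void run(const std::function<void(int, int)> &fn) const {
    for (int o = 0; o < outer_.size; ++o)
      for (int i = 0; i < inner_.size; ++i)
        fn(o, i);
  }

 private:
  iteration outer_, inner_;
};

class outerForLoop1 {
 public:
  explicit outerForLoop1(iteration outer) : outer_(outer) {}

  innerForLoop1 inner(iteration it) const {
    return innerForLoop1(outer_, it);
  }

  // Outer only: the body takes a single index.
  void run(const std::function<void(int)> &fn) const {
    for (int o = 0; o < outer_.size; ++o) fn(o);
  }

 private:
  iteration outer_;
};

class forLoop {
 public:
  // No run() on forLoop itself, so a loop without .outer(...) cannot be run.
  outerForLoop1 outer(iteration it) const { return outerForLoop1(it); }
};

// Helper for the example: counts how often the body is invoked.
int countCalls(const int outerSize, const int innerSize) {
  assert(outerSize >= 0 && innerSize >= 0);
  int calls = 0;
  forLoop()
    .outer({outerSize})
    .inner({innerSize})
    .run([&](int, int) { ++calls; });
  return calls;
}
```

Because each step returns a distinct class, the run signature available at the end of the chain matches the loops that were actually declared, which is the enforcement the templated version could not easily provide.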
@dmed256 dmed256 added the feature Use this label to request a new feature! label Jan 19, 2021
@dmed256 dmed256 added this to the v1.2.0 milestone Jan 19, 2021