[SYCL] Add reduction overloads accepting span #6019

Merged: 22 commits, May 2, 2022

Commits on Apr 18, 2022

  1. [SYCL] Add reduction overloads accepting span

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 18, 2022
    Commit c0de8c6
  2. [SYCL] Add specialization of reducer for spans

    Splits the functionality of a regular reducer across two classes:
    
    1) reducer contains the work-item's private data and exposes the
       subscript operator ([]) to access an individual element.
    
    2) reducer_element contains a pointer to one element from the
       reducer, and exposes the regular combine interface.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 18, 2022
    Commit 1642d2f
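The reducer / reducer_element split described in this commit can be illustrated with a small host-only sketch. It is not the code from the PR: `span_reducer_sketch`, `reducer_element_sketch`, `value()` and `histogram_bin()` are hypothetical names invented for this example, and a plain array stands in for a work-item's private data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the element handle: it points at one element of
// the reducer's private data and exposes combine().
template <typename T, typename BinaryOp> class reducer_element_sketch {
public:
  reducer_element_sketch(T *ptr, BinaryOp op) : ptr_(ptr), op_(op) {}

  // Fold a partial result into the referenced element.
  void combine(const T &partial) { *ptr_ = op_(*ptr_, partial); }

private:
  T *ptr_;
  BinaryOp op_;
};

// Hypothetical stand-in for the span reducer: it owns the private copy of
// every element and hands out element handles through operator[].
template <typename T, std::size_t Extent, typename BinaryOp>
class span_reducer_sketch {
public:
  span_reducer_sketch(const T &identity, BinaryOp op) : op_(op) {
    for (std::size_t i = 0; i < Extent; ++i)
      data_[i] = identity;
  }

  reducer_element_sketch<T, BinaryOp> operator[](std::size_t i) {
    assert(i < Extent && "index out of range");
    return {&data_[i], op_};
  }

  // Read back one element (for inspection in this sketch only).
  const T &value(std::size_t i) const { return data_[i]; }

private:
  T data_[Extent];
  BinaryOp op_;
};

// Example: accumulate a 4-bin histogram through the subscript operator.
inline int histogram_bin(std::size_t bin) {
  span_reducer_sketch<int, 4, std::plus<int>> r(0, std::plus<int>{});
  const int samples[] = {0, 1, 1, 3, 3, 3};
  for (int s : samples)
    r[static_cast<std::size_t>(s)].combine(1);
  return r.value(bin);
}
```

In this sketch the reducer never combines anything itself; every update goes through a handle obtained with `r[i]`, which mirrors the division of responsibilities in the commit message.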
  3. [SYCL] Add specialization of reduction_impl

    The specialization allows the type of the reduction (span<T, Extent>)
    to differ from the element type of the span (T). T is used to select
    the reduction algorithm and to create temporary storage.
    
    A new static member function "num_elements" is added to all reduction_impl
    specializations to distinguish between scalar reductions and array reductions.
    A scalar reduction (all existing implementations of reduction_impl) always
    has (num_elements == 1); an array reduction (this new specialization) has
    (num_elements == Extent), where Extent is the extent of the span.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 18, 2022
    Commit 86e9ef5
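A minimal sketch of how a static num_elements member can separate scalar from array reductions, assuming a toy `span_sketch` type in place of the real span. All names ending in `_sketch` are hypothetical, and the real reduction_impl takes more template parameters than shown here.

```cpp
#include <cstddef>
#include <type_traits>

// Toy fixed-extent span used only for this illustration.
template <typename T, std::size_t Extent> struct span_sketch {
  T *data;
};

// Primary template: a scalar reduction always has exactly one element.
template <typename T> struct reduction_impl_sketch {
  using element_type = T;
  static constexpr std::size_t num_elements() { return 1; }
};

// Specialization for spans: the reduction type is span_sketch<T, Extent>,
// but algorithm selection and temporary storage would be based on T.
template <typename T, std::size_t Extent>
struct reduction_impl_sketch<span_sketch<T, Extent>> {
  using element_type = T;
  static constexpr std::size_t num_elements() { return Extent; }
};

using scalar_reduction_sketch = reduction_impl_sketch<float>;
using array_reduction_sketch = reduction_impl_sketch<span_sketch<float, 8>>;

static_assert(
    std::is_same<array_reduction_sketch::element_type, float>::value,
    "the element type is T, not span_sketch<T, Extent>");
```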
  4. [SYCL] Enable span reductions

    Each of the existing reduction implementations (for a single reduction object)
    can be extended to support spans by looping over the number of elements in the
    reduction.
    
    If (num_elements == 1), the loop has a single iteration and degenerates to the
    behavior of the reduction implementation prior to this commit. If
    (num_elements > 1), the loop iterates over each reduction element in turn.
    
    Note that the getElement() function allows the scalar and array reduction
    implementations to share the same code without specializing for either case,
    while still allowing the storage to differ (a single T vs. an array of Ts).
    This is especially convenient because a scalar reduction is equivalent to an
    array reduction with a single element.
    
    If (num_elements > 1), the implementation currently reduces each element
    separately. This allows array reductions to use the same amount of work-group
    local memory as a scalar reduction using the same T, but at the expense of
    additional synchronization calls.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 18, 2022
    Commit c0e0cec
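The element loop described in this commit can be sketched on the host as follows. This is an illustration under stated assumptions, not the PR's implementation: `get_element`, `num_elements_sketch` and `reduce_work_group` are hypothetical names, a `std::vector` of per-work-item partials stands in for a work-group, and a sequential fold replaces the real local-memory reduction.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical accessor: scalar storage ignores the index, array storage
// uses it. This lets one loop body serve both kinds of reduction.
template <typename T> T &get_element(T &scalar, std::size_t) { return scalar; }
template <typename T, std::size_t N>
T &get_element(std::array<T, N> &arr, std::size_t i) {
  return arr[i];
}

// Number of reduction elements held by a storage type.
template <typename Storage>
struct num_elements_sketch : std::integral_constant<std::size_t, 1> {};
template <typename T, std::size_t N>
struct num_elements_sketch<std::array<T, N>>
    : std::integral_constant<std::size_t, N> {};

// partials[w] is the private storage of work-item w.
template <typename T, typename Storage, typename BinaryOp>
Storage reduce_work_group(std::vector<Storage> partials, T identity,
                          BinaryOp op) {
  Storage result{};
  // Scratch space sized for ONE element per work-item; it is reused for
  // every element, so an array reduction needs no more than a scalar one.
  std::vector<T> scratch(partials.size());
  for (std::size_t e = 0; e < num_elements_sketch<Storage>::value; ++e) {
    for (std::size_t w = 0; w < partials.size(); ++w)
      scratch[w] = get_element(partials[w], e);
    // On a device, a barrier would be needed here and again before the
    // scratch space is reused: the extra synchronization cost.
    T total = identity;
    for (const T &v : scratch)
      total = op(total, v);
    get_element(result, e) = total;
  }
  return result;
}

inline int demo_scalar() {
  std::vector<int> partials = {1, 2, 3, 4};
  return reduce_work_group(partials, 0, std::plus<int>{});
}

inline std::array<int, 2> demo_array() {
  std::vector<std::array<int, 2>> partials;
  partials.push_back(std::array<int, 2>{{1, 10}});
  partials.push_back(std::array<int, 2>{{2, 20}});
  partials.push_back(std::array<int, 2>{{3, 30}});
  return reduce_work_group(partials, 0, std::plus<int>{});
}
```

With scalar storage the outer loop runs once, which is the degenerate case the commit message refers to.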
  5. [SYCL] Disable span reductions before C++17

    Necessary because span.hpp is only available with C++17 or later.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 18, 2022
    Commit 1c97a16

Commits on Apr 21, 2022

  1. Remove unused argument

    Co-authored-by: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
    Pennycook and v-klochkov authored Apr 21, 2022
    Commit 807440c
  2. [SYCL][NFC] Refactor to avoid code duplication

    This commit refactors the reduction implementation to avoid code duplication,
    and additionally makes a number of changes to the class hierarchy for future
    extensibility.
    
    All functionality and member variables expected to be common to all reductions
    is moved to a common base class (reduction_impl_common). The existing
    reduction_impl_base is unsuitable for this purpose because it was deliberately
    designed not to be a template class.
    
    The reduction_impl is now templated on a reduction algorithm, with any
    functionality related to the current (default) algorithm encapsulated in the
    default_reduction_algorithm class. This template is carried from the reduction
    to any reducers it creates, enabling future specialization of both reduction
    and reducer for interesting combinations of type, extent and properties.
    
    The reducer class is simplified using CRTP to avoid duplicate definitions of
    combine() and atomic_combine() for scalar and array reductions.
    
    The notion of a reducer's "dimensionality" is now tied to the dimensionality of
    the reduction being performed (i.e. 0 for scalars, 1 for spans) and not to the
    dimensionality of the input accessor/buffer. This will simplify extending
    reductions to true multi-dimensional array reductions (e.g. with mdspan).
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 21, 2022
    Commit 111ea2f
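The CRTP idea mentioned in this commit, defining combine() once and sharing it between scalar and array reducers, can be sketched as below. The class names, `element()` and `op()` are hypothetical and chosen for this example; atomic_combine() is omitted.

```cpp
#include <functional>

// CRTP base: combine() is written once and asks the derived class where
// the element lives and which operator to apply.
template <typename Derived, typename T> class combiner_sketch {
public:
  void combine(const T &partial) {
    Derived &self = static_cast<Derived &>(*this);
    T &elem = self.element();
    elem = self.op()(elem, partial);
  }
};

// Scalar reducer: owns its single value.
template <typename T, typename BinaryOp>
class scalar_reducer_sketch
    : public combiner_sketch<scalar_reducer_sketch<T, BinaryOp>, T> {
public:
  scalar_reducer_sketch(T identity, BinaryOp op) : value_(identity), op_(op) {}
  T &element() { return value_; }
  BinaryOp &op() { return op_; }
  const T &value() const { return value_; }

private:
  T value_;
  BinaryOp op_;
};

// Element handle for array reductions: refers to storage owned elsewhere.
template <typename T, typename BinaryOp>
class element_reducer_sketch
    : public combiner_sketch<element_reducer_sketch<T, BinaryOp>, T> {
public:
  element_reducer_sketch(T *ptr, BinaryOp op) : ptr_(ptr), op_(op) {}
  T &element() { return *ptr_; }
  BinaryOp &op() { return op_; }

private:
  T *ptr_;
  BinaryOp op_;
};

inline int crtp_scalar_demo() {
  scalar_reducer_sketch<int, std::plus<int>> r(0, std::plus<int>{});
  r.combine(5);
  r.combine(7);
  return r.value();
}

inline int crtp_element_demo() {
  int data[3] = {0, 0, 0};
  element_reducer_sketch<int, std::plus<int>> e(&data[1], std::plus<int>{});
  e.combine(4);
  e.combine(4);
  return data[1];
}
```

Both derived classes inherit the same combine() body; only the location of the element differs.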

Commits on Apr 26, 2022

  1. [SYCL] Fix bug introduced during refactor

    The definition of atomic_combine() did not correctly compute the offsets for
    reductions with Extent > 1. This was hidden by a bug in the associated tests.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 26, 2022
    Commit f672248
  2. [SYCL] Add span support to packs of reductions

    Scalar and array (span) reductions require different algorithms,
    preventing the original scalar reduction code from being generalized to cover
    spans. Specifically, the original scalar reduction code assumes that each stage
    of a reduction (initializing reducers, combining reducers in local memory,
    writing reducers back to global memory) can be applied to all reductions in a
    pack at the same time. This is incompatible with the current implementation
    of array reductions, which iterates over all elements of a span in turn at each
    stage.
    
    The implementation introduced in this commit filters the scalar and array
    reductions from the reduction pack and handles them separately. All scalar
    reductions are handled in parallel (as before), while each array reduction
    is handled separately.
    
    In the future, the code handling reduction packs could be generalized to
    handle more reduction/reducer types while making fewer assumptions about
    the reduction algorithm.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 26, 2022
    Commit 5207769
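One way to filter a pack into scalar and array members at compile time is to compute the index sequence of the elements that satisfy a trait. The sketch below assumes C++17 and uses `std::array` as a stand-in for an array reduction; `filter_indices`, the `_sketch` traits and `pack_sketch` are hypothetical names, and the PR's own filtering code may be structured differently.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-in traits: std::array plays the role of an array (span) reduction.
template <typename T> struct is_array_reduction_sketch : std::false_type {};
template <typename T, std::size_t N>
struct is_array_reduction_sketch<std::array<T, N>> : std::true_type {};
template <typename T>
struct is_scalar_reduction_sketch
    : std::integral_constant<bool, !is_array_reduction_sketch<T>::value> {};

// Walk the tuple types and keep the indices whose type satisfies Pred.
template <template <typename> class Pred, typename Tuple, std::size_t I = 0,
          std::size_t... Kept>
constexpr auto filter_indices() {
  if constexpr (I == std::tuple_size_v<Tuple>) {
    return std::index_sequence<Kept...>{};
  } else if constexpr (Pred<std::tuple_element_t<I, Tuple>>::value) {
    return filter_indices<Pred, Tuple, I + 1, Kept..., I>();
  } else {
    return filter_indices<Pred, Tuple, I + 1, Kept...>();
  }
}

// Convert an index sequence to a vector so it can be inspected at run time.
template <std::size_t... Is>
std::vector<std::size_t> to_vector(std::index_sequence<Is...>) {
  return {Is...};
}

// A pack mixing two scalar and two array "reductions".
using pack_sketch =
    std::tuple<int, std::array<float, 4>, double, std::array<int, 2>>;

// Positions that would be handled together, in one pass.
inline std::vector<std::size_t> scalar_positions() {
  return to_vector(filter_indices<is_scalar_reduction_sketch, pack_sketch>());
}

// Positions that would each be handled separately, element by element.
inline std::vector<std::size_t> array_positions() {
  return to_vector(filter_indices<is_array_reduction_sketch, pack_sketch>());
}
```

The two index sequences can then drive separate code paths: one expansion over all scalar positions at once, and one loop per array position.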

Commits on Apr 27, 2022

  1. [SYCL] Generalize esimd_verify test for reducers

    Previously assumed specific reducer implementation and template arguments.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 27, 2022
    Commit cab1b2f

Commits on Apr 29, 2022

  1. Remove incorrect comment

    Co-authored-by: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
    Pennycook and v-klochkov authored Apr 29, 2022
    Commit a9d18cc
  2. Remove unused argument

    Co-authored-by: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
    Pennycook and v-klochkov authored Apr 29, 2022
    Commit 0236111
  3. Remove unused arguments

    Co-authored-by: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
    Pennycook and v-klochkov authored Apr 29, 2022
    Commit 7608d97
  4. Clarify comment describing View template parameter

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit ca998d7
  5. Fix comment

    Co-authored-by: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
    Pennycook and v-klochkov authored Apr 29, 2022
    Commit af22456
  6. Add comment to first reducer specialization

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit 657ff53
  7. Remove redundant SFINAE from getIdentity()

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit 9553b7f
  8. clang-format

    Pennycook committed Apr 29, 2022
    Commit 12056fc
  9. [SYCL] Revert to sycl::detail::tuple

    std::tuple is still used for reducers (as in the original code).
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit cb030f2
  10. Remove old TODO

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit c2663db
  11. Remove commented out code

    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit 90deb60
  12. Fix IsArrayReduction

    span<T, 1> is still an array reduction.
    
    Signed-off-by: John Pennycook <john.pennycook@intel.com>
    Pennycook committed Apr 29, 2022
    Commit 6843a67
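The point that span<T, 1> is still an array reduction can be illustrated by contrasting two ways of writing such a trait. The commit message does not say how the original check was written, so the count-based version below is only an assumption about how a misclassification of this kind can arise; all names are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Toy fixed-extent span used only for this illustration.
template <typename T, std::size_t Extent> struct span_sketch {
  T *data;
};

template <typename T>
struct num_elements_sketch : std::integral_constant<std::size_t, 1> {};
template <typename T, std::size_t Extent>
struct num_elements_sketch<span_sketch<T, Extent>>
    : std::integral_constant<std::size_t, Extent> {};

// Count-based check (assumed failure mode): a one-element span looks
// exactly like a scalar, so it is misclassified.
template <typename T>
struct is_array_by_count_sketch
    : std::integral_constant<bool, (num_elements_sketch<T>::value > 1)> {};

// Type-based check: any span is an array reduction, whatever its extent.
template <typename T> struct is_array_by_type_sketch : std::false_type {};
template <typename T, std::size_t Extent>
struct is_array_by_type_sketch<span_sketch<T, Extent>> : std::true_type {};

using one_element_span_sketch = span_sketch<int, 1>;
```

The distinction matters because an array reducer is accessed through the subscript operator even when it holds a single element.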