test-backend-ops : add performance eval mode + improve CUDA repeat and binary broadcast ops performance #636

Merged on Dec 7, 2023 (18 commits)

Commits on Dec 5, 2023

  1. 4dd5370
  2. 0866d84
  3. 2699dd7
  4. improve formatting

    slaren committed Dec 5, 2023
    1f6c60d

Commits on Dec 6, 2023

  1. add sd test cases

    slaren committed Dec 6, 2023
    2ffe1a2
  2. fix test case

    slaren committed Dec 6, 2023
    5c74195
  3. a5d7a1d
  4. 69d844c
  5. cleanup

    slaren committed Dec 6, 2023
    8eb145a
  6. typo

    slaren committed Dec 6, 2023
    171a091
  7. cca2854
  8. metal : mul mat-vec support dim3 broadcasts

    ggml-ci
    ggerganov committed Dec 6, 2023
    3ebcec1
  9. ggml-cuda : fix bin_bcast for ne0=1

    ggml-ci
    slaren committed Dec 6, 2023
    45fd1a2
  10. d5f2fc9
  11. da6ca50

Commits on Dec 7, 2023

  1. ebdc505
  2. ggml-cuda : bin_bcast: collapse dimensions when possible, add fallback kernel for large tensors
    
    ggml-ci
    slaren committed Dec 7, 2023
    fcdaaa2
  3. 8ad1e85