Skip to content

Commit

Permalink
cuda : get_row_rounding F32 (ggerganov#4095)
Browse files Browse the repository at this point in the history
* Fix ggerganov#4017

* Update ggml-cuda.cu

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

* Update ggml-cuda.cu

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

---------

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
  • Loading branch information
2 people authored and olexiyb committed Nov 23, 2023
1 parent 2abce2c commit 11649d4
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions ggml-cuda.cu
Original file line number Diff line number Diff line change
Expand Up @@ -6356,6 +6356,7 @@ static int64_t get_row_rounding(ggml_type type) {
case GGML_TYPE_Q8_0:
return max_compute_capability >= CC_RDNA2 ? 128 : 64;
case GGML_TYPE_F16:
case GGML_TYPE_F32:
return 1;
case GGML_TYPE_Q2_K:
return max_compute_capability >= CC_RDNA2 ? 128 : 32;
Expand All @@ -6378,6 +6379,7 @@ static int64_t get_row_rounding(ggml_type type) {
case GGML_TYPE_Q8_0:
return 64;
case GGML_TYPE_F16:
case GGML_TYPE_F32:
return 1;
case GGML_TYPE_Q2_K:
case GGML_TYPE_Q3_K:
Expand Down

0 comments on commit 11649d4

Please sign in to comment.