[Inference] Qwen2 support fp8 inference (#8954)
* qwen2 fp8

* fp8 check

* fp8 cutlass

* int8 cachekv

* a8w8c8_fp8
ckl117 authored Sep 2, 2024
1 parent a275ab7 commit 84469d6
Showing 2 changed files with 451 additions and 55 deletions.