@copybara-service

Enable 8x8c4 neondot qc2w gemm microkernel

  • Now: 8x8 tile with a 90-instruction main loop, 64 of which are sdot.
  • Was: 4x16 tile with a 117-instruction main loop, 64 of which are sdot.

Generate all sizes from 1x8 to 8x16 to evaluate best gemm performance.
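
The instruction counts above can be compared with a little arithmetic. The following is only an illustrative sketch, not part of the change itself: the `KERNELS` table and the `summarize` helper are hypothetical names, and the accumulator-register estimate assumes one 32-bit accumulator per output element, packed four to a 128-bit NEON register.

```python
# Instruction counts quoted in the description above.
KERNELS = {
    "8x8c4": {"mr": 8, "nr": 8, "total": 90, "sdot": 64},
    "4x16c4": {"mr": 4, "nr": 16, "total": 117, "sdot": 64},
}


def summarize(kernel):
    """Derive overhead and sdot density for one main loop."""
    overhead = kernel["total"] - kernel["sdot"]
    return {
        # Instructions in the main loop that are not sdot (loads, etc.).
        "overhead": overhead,
        # Share of the main loop spent on sdot.
        "sdot_fraction": kernel["sdot"] / kernel["total"],
        # Assumption: one int32 accumulator per output element,
        # four accumulators per 128-bit vector register.
        "acc_registers": kernel["mr"] * kernel["nr"] // 4,
    }


for name, kernel in KERNELS.items():
    s = summarize(kernel)
    print(
        f"{name}: {s['overhead']} non-sdot instructions, "
        f"{s['sdot_fraction']:.1%} sdot, "
        f"{s['acc_registers']} accumulator registers"
    )
```

With these numbers, both loops issue the same 64 sdot instructions, but the 8x8 loop carries 26 other instructions where the 4x16 loop carries 53.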

PiperOrigin-RevId: 851489942