Conversation

@carlushuang
Collaborator

[WIP]

carlushuang and others added 14 commits March 11, 2021 18:44
* add raw code to pta case

* fix bug for non-pta case

* add some tunables

* update all configs

* remove redundant print

* refactor driver

Co-authored-by: shaojiewang <wsjmessi@163.com>

* 256x8

* 512x8

* optimize double global prefetch

* 512x8x8

* 1024x8

* 512x4

* 256x4

* 128x4, 384x4

* 256x8x8

* int8x4

* 512x8 int8

* update 256x8x16, 512x16x8

* 1024x16x8

* 256x4

2 participants