int8 output for seq embeddings #2316

Closed
wants to merge 1 commit into from

@YazhiGao YazhiGao commented Feb 6, 2024

Summary:

  • int8 output dtype is a gap for recent FBGEMM use cases. Set up a reasonable memcpy-based reference implementation first.
  • For sequence embeddings, we first unblock dispatch via a simple memcpy. It is a pure bandwidth op (no dequantization), so memcpy should be reasonably OK. Further optimizations, such as ILP via unrolling, AVX non-temporal instructions, and rep instructions, are left for future iterations.

Differential Revision: D53449813
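
For readers without the diff at hand, here is a minimal sketch of what a memcpy-based int8 sequence lookup could look like. It is not the code from this PR: `copy_int8_rows` and its parameters are hypothetical names, and the table is assumed to be a flat, row-major byte buffer in which every row occupies `row_bytes` bytes. Each index produces exactly one output row and the stored bytes are copied unchanged, so no dequantization happens.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not FBGEMM's API): for each index, copy one stored
// row from `table` into `out` without touching the quantized bytes.
// Assumes `table` holds num_rows * row_bytes bytes in row-major order and
// `out` has room for num_indices * row_bytes bytes.
// Returns false as soon as an index falls outside [0, num_rows).
bool copy_int8_rows(
    const std::uint8_t* table,
    std::size_t num_rows,
    std::size_t row_bytes,
    const std::int64_t* indices,
    std::size_t num_indices,
    std::uint8_t* out) {
  assert(table != nullptr && indices != nullptr && out != nullptr);
  for (std::size_t i = 0; i < num_indices; ++i) {
    const std::int64_t idx = indices[i];
    if (idx < 0 || static_cast<std::size_t>(idx) >= num_rows) {
      return false;
    }
    // Sequence (non-pooled) lookup: output row i is a byte-for-byte copy
    // of table row idx.
    std::memcpy(
        out + i * row_bytes,
        table + static_cast<std::size_t>(idx) * row_bytes,
        row_bytes);
  }
  return true;
}
```

Because the loop only moves bytes, its cost is dominated by memory traffic, which is why the summary treats unrolling and non-temporal stores as later tuning steps rather than prerequisites.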

netlify bot commented Feb 6, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 816faa1
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/65c2ac284d9c7c000829a7c7
😎 Deploy Preview https://deploy-preview-2316--pytorch-fbgemm-docs.netlify.app

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D53449813

YazhiGao pushed a commit to YazhiGao/FBGEMM that referenced this pull request Feb 6, 2024

@facebook-github-bot
Contributor

This pull request has been merged in af41af1.
