[SYCL] get MAX_MEM_ALLOC from device property #5270

Merged
merged 4 commits into from
Feb 2, 2024

Conversation

airMeng
Collaborator

@airMeng airMeng commented Feb 2, 2024

Thanks to @slaren and @0cc4m for the reminders. Fixes #5250.

Note:
Limiting the maximum memory allocation size causes a slight performance regression. If you are using an Intel Data Center GPU such as the Intel GPU Max series (codename "Ponte Vecchio"), I suggest following "Allocations greater than 4GB" to remove the limit.
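
For readers unfamiliar with the idea in the PR title, here is a minimal sketch of respecting a device-reported allocation limit. It is not the code from this PR: `split_allocation` and `fits_single_allocation` are hypothetical helpers, and the limit is passed in as a plain number so the example builds without a SYCL toolchain. In SYCL 2020 the value would typically be read with `dev.get_info<sycl::info::device::max_mem_alloc_size>()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): true if a request can be served
// by one allocation under the device's reported limit.
bool fits_single_allocation(std::uint64_t bytes, std::uint64_t max_alloc_bytes) {
    return bytes <= max_alloc_bytes;
}

// Hypothetical helper (not from the PR): split a request of `total_bytes`
// into chunk sizes that each stay within `max_alloc_bytes`.
// Assumption: `max_alloc_bytes` is the value the device reports, e.g. via
// sycl::info::device::max_mem_alloc_size in a real SYCL program.
std::vector<std::uint64_t> split_allocation(std::uint64_t total_bytes,
                                            std::uint64_t max_alloc_bytes) {
    assert(max_alloc_bytes > 0 && "device must report a non-zero limit");
    std::vector<std::uint64_t> chunks;
    while (total_bytes > 0) {
        // Take as much as the limit allows, then continue with the remainder.
        const std::uint64_t n = std::min(total_bytes, max_alloc_bytes);
        chunks.push_back(n);
        total_bytes -= n;
    }
    return chunks;
}
```

With a 4 GiB limit, a 10 GiB request would be split into chunks of 4 GiB, 4 GiB, and 2 GiB. Working in smaller pieces like this is one plausible source of the slight performance regression mentioned in the note.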

@Jacoby1218

can confirm, this fixed #5250

@airMeng
Collaborator Author

airMeng commented Feb 2, 2024

@ggerganov @slaren It seems the macOS build failures are not related. Could you give a review?

@NeoZhangJianyu
Collaborator

NeoZhangJianyu commented Feb 2, 2024

> @ggerganov @slaren It seems the macOS build failures are not related. Could you give a review?

The failing CI job has nothing to do with the changed code.
Is it possible to ignore this CI failure in the PR review?

Contributor

@luoyu-intel luoyu-intel left a comment

LGTM

@airMeng
Collaborator Author

airMeng commented Feb 2, 2024

@ggerganov could you add a label called "Intel GPU", like the ones AMD and NVIDIA have, so that we can respond quickly to related issues and PRs?

@ggerganov
Owner

@NeoZhangJianyu @airMeng As long as PRs make changes only to the SYCL code, you can merge at your discretion

@NeoZhangJianyu
Collaborator

> @NeoZhangJianyu @airMeng As long as PRs make changes only to the SYCL code, you can merge at your discretion

@ggerganov We don't have merge access. Could you grant it to us?

@ggerganov
Owner

You need to accept the collaborator invite

@airMeng airMeng merged commit e805f0f into ggerganov:master Feb 2, 2024
53 checks passed
@airMeng airMeng deleted the sycl_fix_max_alloc_size branch February 2, 2024 07:54
@NeoZhangJianyu
Collaborator

> You need to accept the collaborator invite

Yes, we see it. Thank you! :)

@characharm

Unfortunately, with this fix, the token generation speed became the same as Vulkan.

@airMeng
Collaborator Author

airMeng commented Feb 2, 2024

> Unfortunately, with this fix, the token generation speed became the same as Vulkan.

@characharm could you post the details in #5277 (your hardware, software, OS, models, and performance numbers) so we can see whether the performance is reasonable?

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
* get max alloc size from device prop

* fix macro typo
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* get max alloc size from device prop

* fix macro typo
Development

Successfully merging this pull request may close these issues.

SYCL incoherent output on >4GB allocations of GPU memory
6 participants