
Running AI inference of Phi-3 and other LLMs from C# using NPU + GPU in coming processors? #7162

Open

Description

Intel, AMD, Qualcomm, etc. are shipping powerful NPUs (40+ TOPS) for inference.

Is there any plan to include functionality in ML.NET to run inference on these models easily from C#, offloading to the NPU, the GPU, or both? Upcoming Intel processors will have a 40 TOPS NPU and roughly 60 TOPS across the CPU/GPU.

How can we, from C#, easily make the most of all these TOPS and run inference across the NPU + GPU?
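For context, the closest building block I know of today is choosing an ONNX Runtime execution provider from C#. A minimal sketch (each provider requires the matching ONNX Runtime package/build, and the QNN `backend_path` option shown is an assumption that varies per release):

```csharp
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntime;

var options = new SessionOptions();

// GPU via DirectML (Windows; works across Intel/AMD/NVIDIA GPUs).
options.AppendExecutionProvider_DML(0);

// Or target the Intel NPU through the OpenVINO execution provider:
// options.AppendExecutionProvider_OpenVINO("NPU");

// Or a Qualcomm Hexagon NPU through the QNN execution provider:
// options.AppendExecutionProvider("QNN",
//     new Dictionary<string, string> { ["backend_path"] = "QnnHtp.dll" });

// Nodes the chosen provider cannot run fall back to the CPU provider.
using var session = new InferenceSession("phi3-mini.onnx", options);
```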

Most of the samples I see about this require Python; it would be great to have all of this available directly in .NET/C#.
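The pre-release Microsoft.ML.OnnxRuntimeGenAI package hints at what this could look like in pure C#. A minimal sketch for Phi-3 (the model path and prompt template are illustrative, and the API surface is pre-1.0 and may change):

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Assumes a Phi-3 model folder already exported to ONNX (path is illustrative).
using var model = new Model(@"models\phi3-mini-4k-instruct");
using var tokenizer = new Tokenizer(model);

var prompt = "<|user|>\nWhat is an NPU?<|end|>\n<|assistant|>";
using var inputTokens = tokenizer.Encode(prompt);

using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 256);
generatorParams.SetInputSequences(inputTokens);

// Token-by-token generation loop.
using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
}

Console.WriteLine(tokenizer.Decode(generator.GetSequence(0)));
```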

Maybe by including a C# wrapper around https://github.com/intel/intel-npu-acceleration-library, but what about AMD and Qualcomm?
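Wrapping a vendor library by hand would end up as a thin P/Invoke layer per vendor, which is exactly what I'd hope ML.NET could abstract away. A purely illustrative sketch (the `intel_npu_backend` library name and `npu_*` exports below are made up, not that library's real C API; AMD's Ryzen AI and Qualcomm's QNN stacks would each need their own equivalent):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical bindings; real entry points depend on whatever C ABI
// the vendor's native library actually exports.
internal static class NpuNative
{
    [DllImport("intel_npu_backend")]
    internal static extern IntPtr npu_compile_model(string onnxPath);

    [DllImport("intel_npu_backend")]
    internal static extern int npu_run(IntPtr model,
        float[] input, int inputLength, float[] output, int outputLength);

    [DllImport("intel_npu_backend")]
    internal static extern void npu_release(IntPtr model);
}
```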


Metadata

Assignees

No one assigned

    Labels

    Deep Learning
    Hardware Support (issues and requests related to CPU / GPU / NPU)
    NLP (issues / questions around text processing)
    enhancement (new feature or request)
    untriaged (new issue has not been triaged)
