
Feature Request: Installable package via winget #8188

Open · ngxson opened this issue Jun 28, 2024 · 8 comments
Labels: enhancement, help wanted

Comments

ngxson (Collaborator) commented Jun 28, 2024

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

On macOS/Linux, users can easily install a pre-built version of llama.cpp via brew.

It would be nice to have an equivalent on Windows, via winget.
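
For a concrete picture, the two flows side by side (the winget package identifier is hypothetical until something is actually published):

  brew install llama.cpp       # existing Homebrew formula
  winget install llama.cpp     # proposed; exact package ID to be decided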

Motivation

The pre-built binary is already available via releases: https://github.com/ggerganov/llama.cpp/releases

It would be nice to somehow push them to https://winget.run/

However, I'm not familiar with working on Windows, so I'm creating this issue for further discussion and to look for help from the community.

Possible Implementation

No response

ngxson added the enhancement and help wanted labels on Jun 28, 2024
slaren (Collaborator) commented Jun 28, 2024

The macOS situation is a bit easier to handle because there is only the Metal backend, and we know that it is available on every device. For Windows and Linux we would need different builds for CPU AVX, CPU AVX2, CPU AVX512, CUDA, HIP, SYCL, Vulkan, ... which is not very user-friendly, to say the least.

gabrielgrant commented Sep 26, 2024

@slaren there are already builds for all of those different compute-acceleration options. I agree that choosing which backend to install is rather confusing, but I don't understand why making the existing builds available via winget would make things any more confusing.

max-krasnyansky (Collaborator) commented
Ah, I didn't see this issue earlier. I had already started looking into this as well.
I wanted to publish winget packages for Windows on ARM64 (Snapdragon X-Elite) and a decently optimized x86-64 build.
For ARM64 we can just enable the CPU backend (ARMv8.7 NEON with MatMul INT8) for now. For x86-64 I was thinking of publishing a CPU build with up to AVX-512 for now.
Basically, at least let folks easily install a usable version; if they are looking for the best performance, they can install builds from our releases directly.
I don't know how well winget handles package flavors yet; perhaps there is a way to publish multiple packages for the same arch and have it pick one based on the machine details.
If winget itself is not great at that, we can add some wrapper scripts that fetch better versions.

slaren (Collaborator) commented Sep 26, 2024

I don't think that uploading a dozen different packages to winget is going to be a good experience for users. It's barely tolerable as it is on the GitHub releases, and there we assume that users are technically competent enough to choose the right version. I would expect a winget package to be easier to use than what we currently offer.

gabrielgrant commented
@slaren gotcha

@max-krasnyansky having a script to algorithmically determine which build is most appropriate would be great (whether for winget, or even just for determining which of the builds on GH will run on a given machine).
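
For illustration, a minimal sketch of such a script in Python (Windows-only; not official tooling). It assumes the PF_* processor-feature IDs from winnt.h, which require a recent Windows 10/11, and asset names modeled on the GitHub release naming, which may drift; both should be verified against the actual releases page.

# Sketch: pick the llama.cpp Windows release asset this machine can run.
# Windows-only; asset names are assumptions modeled on the release naming.
import ctypes
import platform

# Processor-feature IDs from winnt.h (recent Windows 10/11).
PF_AVX2_INSTRUCTIONS_AVAILABLE = 40
PF_AVX512F_INSTRUCTIONS_AVAILABLE = 41

def pick_asset(tag: str) -> str:
    """Return the release asset name that should run on this machine."""
    if platform.machine().lower() in ("arm64", "aarch64"):
        return f"llama-{tag}-bin-win-arm64.zip"  # hypothetical asset name
    have = ctypes.windll.kernel32.IsProcessorFeaturePresent
    if have(PF_AVX512F_INSTRUCTIONS_AVAILABLE):
        return f"llama-{tag}-bin-win-avx512-x64.zip"
    if have(PF_AVX2_INSTRUCTIONS_AVAILABLE):
        return f"llama-{tag}-bin-win-avx2-x64.zip"
    return f"llama-{tag}-bin-win-avx-x64.zip"  # conservative fallback

if __name__ == "__main__":
    # Asset would live under https://github.com/ggerganov/llama.cpp/releases/download/<tag>/
    print(pick_asset("b3266"))  # example release tag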

max-krasnyansky (Collaborator) commented
Sorry if I wasn't clear. My plan was to publish decent CPU versions to start, so that a simple winget install llama.cpp works.
Users get a usable version with basically zero effort.

Then we can look into having winget "auto-select" a better-optimized version based on the user's machine,
i.e. I'm definitely not suggesting users do something like winget install llama.cpp-cuda-v3.1-avx512 :)

If winget itself can't do that (I need to look a bit deeper into their package metadata format and options), then we can figure out something else. Maybe our own script, etc.

AndreasKunar (Contributor) commented
@max-krasnyansky:
Ollama v0.3.12 supports winget install, and it now also works great / natively on my Snapdragon X Elite Surface Laptop 7 on Windows (for ARM). I did not look into the details, but it might be a good starting point (since it builds on top of the llama.cpp base).

max-krasnyansky (Collaborator) commented
> Ollama v0.3.12 supports winget install, and it now also works great / natively on my Snapdragon X Elite Surface Laptop 7 on Windows (for ARM). I did not look into the details, but it might be a good starting point (since it builds on top of the llama.cpp base).

Yep. I saw that they have the winget package. I thought it was still x86-64 build though. Will take a look at the latest.
