
docs: update use llama-server instead #34

Merged · 2 commits merged into premAI-io:main on May 6, 2024
Conversation

swarnimarun (Contributor) commented:

I created a llama-server image over the weekend; it's very small, simple, and entirely static.

The CPU image is 68MB compressed and 250MB uncompressed.
The GPU (CUDA) image is 1.5GB compressed and 4GB uncompressed (depending on platform).

There is also a test Intel image, which is very large but supports additional optimizations for both Intel GPUs and CPUs.

No Python or other interpreted languages here. It also uses the .cache directory, so as long as that directory is volume-mounted you can easily cache models (see the sketch below).
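For illustration only, a rough sketch of what the .cache volume mount could look like when running such an image; the registry path, model path, and port below are placeholders, not values taken from this PR:

```sh
# Minimal sketch, assuming a hypothetical image tag and model path.
# Mounting the host's .cache directory into the container lets downloaded
# models persist across container restarts.
docker run --rm \
  -p 8080:8080 \
  -v "$HOME/.cache:/root/.cache" \
  example.registry/llama-server:cpu \
  -m /root/.cache/llama.cpp/model.gguf \
  --host 0.0.0.0 --port 8080
```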

swarnimarun self-assigned this on Apr 29, 2024
Review comments on docs/guides/langchain.md were marked outdated and resolved.
swarnimarun merged commit ee76147 into premAI-io:main on May 6, 2024
5 checks passed
swarnimarun deleted the docs-fix branch on May 6, 2024 at 05:22