
Update build.yml, enable rpc for windows cuda builds #9184

Merged: 1 commit into ggerganov:master on Sep 5, 2024

Conversation

awatuna (Contributor) commented Aug 26, 2024

Enable RPC for Windows CUDA builds.

The current Windows CUDA build's llama-server ignores the --rpc argument, and the build does not include a CUDA-enabled rpc-server.

I have tested several models with one llama-server and one remote rpc-server. Most models work; only one Mixtral Q4_K_M quant crashes with:

ggml/src/ggml-rpc.cpp:893: GGML_ASSERT(tensor->data + tensor_size >= tensor->data) failed

I think it should be fine to build an RPC-enabled CUDA llama-server and a CUDA-enabled rpc-server by default.
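For context, a minimal sketch of the build-and-run flow this enables. The GGML_RPC CMake option, port, host address, and model path below are assumptions/placeholders for illustration, not the exact contents of the build.yml change:

```sh
# Configure and build llama.cpp with both CUDA and the RPC backend enabled
# (the build.yml change presumably adds the RPC flag to the Windows CUDA configure step).
cmake -B build -DGGML_CUDA=ON -DGGML_RPC=ON
cmake --build build --config Release

# On the remote machine: start the CUDA-enabled rpc-server (50052 is a placeholder port).
./build/bin/rpc-server -p 50052

# On the local machine: point llama-server at the remote rpc-server so that offloaded
# layers run on the remote CUDA device (192.168.1.10:50052 and model.gguf are placeholders).
./build/bin/llama-server -m model.gguf -ngl 99 --rpc 192.168.1.10:50052
```

The --rpc argument also accepts a comma-separated list of host:port endpoints when more than one remote rpc-server is used.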

build rpc-server for windows cuda
@github-actions bot added the devops label Aug 26, 2024
@mofosyne added the Review Complexity : Low label Aug 30, 2024
rgerganov (Collaborator) commented:

Looks good to me. Also feel free to file an issue with the model which didn't work with the RPC backend.

@slaren merged commit 32b2ec8 into ggerganov:master on Sep 5, 2024
48 checks passed
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
build rpc-server for windows cuda