
Conversation

@firecoperana
Collaborator

I tested this using the example in vllm-project/vllm#11368 and it looks OK.

@saood06
Collaborator

saood06 commented Jun 10, 2025

This already looks much better than #504, just from how much closer it is to the reference implementation.

Testing that one took time because it looked like it had a lot of edge cases that could lead to issues, or at least some incorrect behavior.

add_executable(${TARGET} rpc-server.cpp)
target_link_libraries(${TARGET} PRIVATE ggml)
target_compile_features(${TARGET} PRIVATE cxx_std_17)
Owner

Why do we need this?

Collaborator Author

It's in the mainline file.

SERVER_VERBOSE=$<BOOL:${LLAMA_SERVER_VERBOSE}>
)

if (MSVC)
Owner

Why is this needed?

Collaborator Author

It's for the stack size code: the add_tensor function in ggml-rpc.cpp uses recursion to serialize the graph, and Windows has a very small default stack size, so a complex graph can easily cause a stack overflow. This is not needed for the DRY sampler; it's a bug fix for RPC.
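For reference, a minimal sketch of what such a fix can look like in CMake. The 8 MB value below is an assumption for illustration, not necessarily the value this PR uses:

if (MSVC)
    # Hypothetical sketch: raise the default ~1 MB Windows stack so the
    # recursive graph serialization in ggml-rpc.cpp has more headroom.
    target_link_options(${TARGET} PRIVATE /STACK:8388608)
endif()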

src/llama.cpp Outdated
}

struct llama_sampler_dry * llama_sampler_init_dry(const struct llama_model* model, float dry_multiplier, float dry_base, int32_t dry_allowed_length, int32_t dry_penalty_last_n, const char** seq_breakers, size_t num_breakers) {
return llama_sampler_init_dry_impl(model->vocab, llama_n_ctx_train(model), dry_multiplier, dry_base, dry_allowed_length, dry_penalty_last_n, seq_breakers, num_breakers);
Owner

The DRY sampler only depends on the vocabulary, not the entire model. Wouldn't it have been better to define the interface that way (taking a pointer to vocabulary instead of model)?
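Something along these lines, for example (a sketch assuming the same parameter list as above, with the training context length passed in explicitly since the model is no longer available):

// Hypothetical vocabulary-based variant of the constructor above.
struct llama_sampler_dry * llama_sampler_init_dry(
        const struct llama_vocab * vocab,
        int32_t       n_ctx_train,
        float         dry_multiplier,
        float         dry_base,
        int32_t       dry_allowed_length,
        int32_t       dry_penalty_last_n,
        const char ** seq_breakers,
        size_t        num_breakers);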

Collaborator Author

I can change it.

@ikawrakow
Owner

@saood06 Any other comments?

@saood06
Collaborator

saood06 commented Jun 11, 2025

Tried to build this to test and got this:

/ik_llama.cpp/src/../include/llama.h:1240:54: error: unknown type name ‘llama_sampler_dry’
 1240 |     void llama_sample_dry(struct llama_context* ctx, llama_sampler_dry* smpl, llama_token_data_array* candidates_p);
      |                                                      ^~~~~~~~~~~~~~~~~

@firecoperana
Collaborator Author

Tried to build this to test and got this:

/ik_llama.cpp/src/../include/llama.h:1240:54: error: unknown type name ‘llama_sampler_dry’
 1240 |     void llama_sample_dry(struct llama_context* ctx, llama_sampler_dry* smpl, llama_token_data_array* candidates_p);
      |                                                      ^~~~~~~~~~~~~~~~~

Can you clean the build folder and try again? It compiles fine for me.
The build command I use:
cmake -B build -DGGML_CUDA=ON -DGGML_BLAS=OFF -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_CURL=OFF -DBUILD_SHARED_LIBS=ON -DGGML_SCHED_MAX_COPIES=1

@saood06
Collaborator

saood06 commented Jun 12, 2025

Can you clean the build folder and try again?

This was with a clean build folder.

It compiles fine for me. The build command I use: cmake -B build -DGGML_CUDA=ON -DGGML_BLAS=OFF -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_CURL=OFF -DBUILD_SHARED_LIBS=ON -DGGML_SCHED_MAX_COPIES=1

Maybe it's because you set -DLLAMA_BUILD_TESTS=OFF that you don't see it. Sorry, I should have given you more of the compile error log:

In file included from /home/saood06/ik_main/ik_llama.cpp/tests/test-c.c:1:
/home/saood06/ik_main/ik_llama.cpp/src/../include/llama.h:1240:54: error: unknown type name ‘llama_sampler_dry’
 1240 |     void llama_sample_dry(struct llama_context* ctx, llama_sampler_dry * smpl, llama_token_data_array* candidates_p);
      |                                                      ^~~~~~~~~~~~~~~~~
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:79: tests/CMakeFiles/test-c.dir/test-c.c.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2688: tests/CMakeFiles/test-c.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....

@firecoperana
Collaborator Author

Maybe it's because you set -DLLAMA_BUILD_TESTS=OFF that you don't see it. Sorry, I should have given you more of the compile error log.

Should be good this time.
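For context, the error came from tests/test-c.c, which compiles llama.h as plain C; unlike C++, C does not treat a struct tag as a type name on its own, so the parameter type needs the struct keyword. A sketch of the C-compatible declaration, assuming that is what the fix amounts to:

// Hypothetical C-compatible form: llama_sampler_dry has no typedef in
// llama.h, so plain C requires the explicit struct keyword here.
void llama_sample_dry(struct llama_context * ctx, struct llama_sampler_dry * smpl, llama_token_data_array * candidates_p);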

ikawrakow merged commit 3f111ad into ikawrakow:main on Jun 19, 2025
sayap added a commit to sayap/ik_llama.cpp that referenced this pull request Sep 22, 2025