Relationship to llama.cpp #10

Closed
@dokterbob

Description

First of all: CONGRATS ON YOUR AMAZING RESEARCH WORK.

Considering that this uses GGML and seems based directly on llama.cpp:

Why is this a separate project from llama.cpp, given that llama.cpp already supports BitNet ternary quants? (ggml-org/llama.cpp#8151)

Are these simply more optimised kernels?
If so, how do they compare to llama.cpp's implementation?
Can/should they be contributed back to llama.cpp?
