Key-value-only GGUFs to help with unit testing models #4720

Closed
@postmasters

Feature Description

GGUF files that contain all the key-value pairs and tensor descriptions, but no tensor data, to help unit test model loading and other tasks that do not involve real tensor data.

When real tensor data is required, perhaps some kind of deterministically auto-generated tensors could be used so that end-to-end inference results can be checked too.

Motivation

llama.cpp supports many more models nowadays. It is simply not possible for contributors to have access to all of them, nor the resources (disk space, RAM, etc.) to try them all out. But to add more confidence that any change made to llama.cpp is still backward compatible with existing models, perhaps all pull requests could at least pass tests involving just the metadata of a model (such as the key-value portion of a GGUF and the tensor metadata, but not the actual tensor data).

This is based on a backward compat issue in PR #4657.

Possible Implementation

Some tasks, such as loading hparams, creating tensor maps, and building computation graphs, do not require actual tensor data. With some effort, they may be refactored to be tested with key-value-only GGUFs.
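
To make the idea concrete, here is a rough sketch of what a key-value-only GGUF could look like on disk and how a test might read it back. It uses only the Python standard library and follows my understanding of the GGUF v3 layout (magic, version, tensor count, KV count, KV pairs, tensor infos). The helper names `write_metadata_only` and `read_metadata` are made up for illustration, only the `uint32` and `string` value types are handled, and the tensor offsets are placeholders, so I am not claiming the current llama.cpp loader would accept such a file as-is.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3
T_UINT32, T_STRING = 4, 8  # value-type ids as I understand the GGUF spec
GGML_TYPE_F32 = 0


def _w_str(buf, s):
    # GGUF strings: uint64 length followed by UTF-8 bytes (no terminator).
    data = s.encode("utf-8")
    buf.write(struct.pack("<Q", len(data)))
    buf.write(data)


def _r_str(buf):
    (n,) = struct.unpack("<Q", buf.read(8))
    return buf.read(n).decode("utf-8")


def write_metadata_only(kv, tensors):
    """Hypothetical helper. kv: dict of str -> int|str; tensors: list of (name, dims)."""
    buf = io.BytesIO()
    buf.write(GGUF_MAGIC)
    buf.write(struct.pack("<IQQ", GGUF_VERSION, len(tensors), len(kv)))
    for key, value in kv.items():
        _w_str(buf, key)
        if isinstance(value, str):
            buf.write(struct.pack("<I", T_STRING))
            _w_str(buf, value)
        else:
            buf.write(struct.pack("<II", T_UINT32, value))
    for name, dims in tensors:
        _w_str(buf, name)
        buf.write(struct.pack("<I", len(dims)))
        buf.write(struct.pack(f"<{len(dims)}Q", *dims))
        # Tensor type, then a placeholder offset: there is no data section.
        buf.write(struct.pack("<IQ", GGML_TYPE_F32, 0))
    return buf.getvalue()


def read_metadata(blob):
    """Hypothetical helper. Parses header, KV pairs and tensor infos only."""
    buf = io.BytesIO(blob)
    if buf.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack("<IQQ", buf.read(20))
    kv = {}
    for _ in range(n_kv):
        key = _r_str(buf)
        (vtype,) = struct.unpack("<I", buf.read(4))
        if vtype == T_STRING:
            kv[key] = _r_str(buf)
        elif vtype == T_UINT32:
            (kv[key],) = struct.unpack("<I", buf.read(4))
        else:
            raise NotImplementedError(f"value type {vtype} not handled in this sketch")
    tensors = {}
    for _ in range(n_tensors):
        name = _r_str(buf)
        (n_dims,) = struct.unpack("<I", buf.read(4))
        dims = struct.unpack(f"<{n_dims}Q", buf.read(8 * n_dims))
        _ggml_type, _offset = struct.unpack("<IQ", buf.read(12))
        tensors[name] = dims
    return version, kv, tensors


blob = write_metadata_only(
    {"general.architecture": "llama", "llama.block_count": 2},
    [("token_embd.weight", (8, 32))],
)
version, kv, tensors = read_metadata(blob)
```

A file like this is tiny, so one per supported architecture could plausibly live in the repository and be loaded in CI to exercise the hparams and tensor-map code paths.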

For other tasks where real tensor data is required, perhaps the tensors can be filled in deterministically (random with a pre-determined seed?) and broadcast across all dimensions, so that the unit tests don't require actual tensor data and don't take much memory, but can still complete end-to-end inference, albeit with poor output. Then unit tests can do "with this prompt, assert the output is ..."
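
A minimal sketch of the seeded-fill idea, again in plain Python and not tied to any llama.cpp API: each tensor gets a seed derived from its name, one row is generated from that seed, and the same row is reused for every other row so that memory stays small. The function names, the `(rows, cols)` shape convention, and the value range are all assumptions made for illustration.

```python
import hashlib
import random


def seed_for(name, base_seed=0):
    # Derive a stable per-tensor seed from the tensor name, so the values do
    # not depend on the order in which tensors are generated.
    digest = hashlib.sha256(f"{base_seed}:{name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def synthetic_tensor(name, shape, base_seed=0):
    """Hypothetical helper: a rows x cols tensor where every row is the same."""
    rows, cols = shape
    rng = random.Random(seed_for(name, base_seed))
    row = tuple(rng.uniform(-0.1, 0.1) for _ in range(cols))
    # All rows alias one tuple, so only `cols` floats are actually stored.
    return [row] * rows


a = synthetic_tensor("blk.0.attn_q.weight", (4, 8))
b = synthetic_tensor("blk.0.attn_q.weight", (4, 8))
c = synthetic_tensor("blk.0.attn_k.weight", (4, 8))
```

Here `a` and `b` are identical on every run, while `c` differs because the name feeds the seed. A real implementation would need to do the equivalent inside the loader (and for quantized types), which this sketch does not attempt.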
