A sampling function that returns top token probabilities #1784

Closed
@WangHaoranRobin

Description

I was playing around with the server example and wanted to expose the probabilities of the generated tokens to the server client, in order to implement custom stopping sequences and criteria (similar to OpenAI's API here).

All it should take is a variant of "llama_sample_token" and "llama_sample_token_greedy" that returns an object containing the top X tokens and their probabilities.
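As a rough illustration of the idea (not llama.cpp's actual API — the function and types below are hypothetical), the sampling step would softmax the raw logits and return the top-X token ids with their probabilities instead of a single sampled token:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: softmax over raw logits, then a partial sort to
// extract the k most probable (token id, probability) pairs.
std::vector<std::pair<int, float>> top_k_probs(const std::vector<float>& logits,
                                               std::size_t k) {
    // Subtract the max logit before exponentiating, for numerical stability.
    float max_logit = *std::max_element(logits.begin(), logits.end());

    std::vector<std::pair<int, float>> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        float e = std::exp(logits[i] - max_logit);
        probs[i] = {static_cast<int>(i), e};
        sum += e;
    }
    for (auto& p : probs) p.second /= sum;  // normalize to probabilities

    // Only the first k entries need to be fully ordered.
    k = std::min(k, probs.size());
    std::partial_sort(probs.begin(), probs.begin() + k, probs.end(),
                      [](const auto& a, const auto& b) { return a.second > b.second; });
    probs.resize(k);
    return probs;
}
```

A greedy variant would then just be `top_k_probs(logits, 1)`; the server could serialize the returned pairs into the response so clients can implement their own stopping criteria.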

The only related issue/PR/discussion I was able to find is this PR about logging probabilities. Please give me pointers if similar requests have been discussed elsewhere.

Since I'm relatively new to the repo, what is the protocol here? Should I just make a PR?
