
Add logits processor #176

Closed · rlouf opened this issue Jan 11, 2024 · 14 comments · Fixed by #224
Labels: enhancement (New feature or request)

rlouf commented Jan 11, 2024

Feature request

Add the possibility for users to specify a function that processes the logits before sampling. This function would be called right before next_token_chooser.

Motivation

I am a maintainer of Outlines, a library for guided generation (regex, JSON, grammars, etc.). A user recently asked if we could integrate with LoRAX, and I think it would be beneficial for both libraries.

We can then discuss a potential deeper integration (e.g., allowing function calling via JSON-guided generation), but this feature request is minimally intrusive and low maintenance.

Your contribution

I could help with submitting a PR.

tgaddair added the enhancement label Jan 11, 2024
tgaddair (Contributor) commented:

Hey @rlouf, thanks for reaching out and offering to help with the PR! Outlines is an awesome project, and one that's definitely been on my radar, so excited for this.

Can you help me understand the interaction between Outlines and the LLM server (like LoRAX)? One way I could imagine this working is:

  1. The user provides a JSON schema to LoRAX as a request parameter (similar to the vLLM example here):

     curl http://127.0.0.1:8000 \
         -d '{
             "inputs": "What is the capital of France?",
             "parameters": {
                 "schema": {"type": "string"}
             }
         }'

  2. In the backend, LoRAX executes some custom Outlines code that warps the logits just prior to the next_token_chooser call here.

Is that the right way to think about this v1 integration, or is there a different way to approach this?

We can certainly add a generic logit processor interface, but I wanted to first make sure I understand how it will be used so we can get the interface right.
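
For concreteness, here is a minimal sketch of what such an interface might look like (all names here are hypothetical, not LoRAX's actual API; the real design would be settled in the PR):

from typing import Protocol

import torch


class LogitsProcessor(Protocol):
    """Hypothetical interface: called once per decoding step, right before
    next_token_chooser, to transform the raw logits."""

    def __call__(self, input_ids: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        ...


class TemperatureProcessor:
    """Toy implementation of the interface: temperature scaling."""

    def __init__(self, temperature: float = 1.0):
        self.temperature = temperature

    def __call__(self, input_ids: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        return logits / self.temperature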

tgaddair mentioned this issue Jan 12, 2024
tgaddair (Contributor) commented:

Looks like we could implement something similar to: https://github.com/outlines-dev/outlines/blob/main/outlines/serve/vllm.py

I could see two routes:

  1. We hardcode this implementation into LoRAX, so there is native support for Outlines (just requires including outlines as a dep, I imagine).
  2. We provide a generic interface to process logits and allow the user to provide a path to a file, loaded on initialization, that contains the implementation. This would work, but could get tricky with dependencies and create security vulnerabilities through remote code execution.
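
For reference, the gist of that vllm.py integration, paraphrased as a sketch (the fsm object here is an assumption standing in for Outlines' FSM; the actual class and method names vary between Outlines versions):

import math

import torch


class GuidedLogitsProcessor:
    """Sketch of the masking approach from Outlines' vLLM integration: track
    an FSM state per sequence and add -inf to every token the FSM does not
    allow next, so sampling can only pick valid continuations."""

    def __init__(self, fsm):
        self.fsm = fsm  # assumed to expose next_state() and allowed_token_ids()
        self.state = 0  # FSM start state

    def __call__(self, input_ids: list, logits: torch.Tensor) -> torch.Tensor:
        if input_ids:  # advance the FSM with the last sampled token
            self.state = self.fsm.next_state(self.state, input_ids[-1])
        allowed = self.fsm.allowed_token_ids(self.state)
        mask = torch.full_like(logits, -math.inf)  # logits: (vocab_size,)
        mask[allowed] = 0.0
        return logits + mask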


brucethemoose commented Jan 15, 2024

+1, I am extremely interested in this.

For reference, vllm applies logit processing here: https://github.com/vllm-project/vllm/blob/2a18da257ccd0d5beafcebe93246e4e220c88a12/vllm/model_executor/layers/sampler.py#L155

As for 1 vs. 2... why not both? How about making Outlines work out of the box, with zero extra configuration, as a "default" logit processing function? Or structure it as an optional dependency for those who don't want the processing, if that's desirable.

Then allow users to supply their own logit processing function as well, perhaps hidden behind a launch flag with a warning and a very minimal DIY interface.

Also, there is a performance concern here. I'm not familiar with LoRAX's code at all, but heavy grammar processing can become a single-core performance bottleneck in some implementations.

rlouf (Author) commented Jan 18, 2024

> We can certainly add a generic logit processor interface, but I wanted to first make sure I understand how it will be used so we can get the interface right.

This looks like a reasonable approach to me.

AdithyanI commented:

@tgaddair not to rush or push, just wanted to check whether this is on the roadmap :)
Asking so I can align my internal team's roadmap accordingly.
Also happy to pitch in with any help if required.

tgaddair (Contributor) commented:

Hey @AdithyanI, definitely still on the roadmap! We should have capacity on our side to take this on in about two weeks, but we're also open to contributions if someone wants to take it on sooner.

From the discussion, it sounds like the main items needed are:

  • A custom logit processor interface, plus a list of logit processors as an attribute of the Model class
  • The ability to load logit processors from a file during LoRAX initialization (on startup); see the sketch after this list
  • The ability to specify generation constraints in the request that trigger the logit processors (for example, a --schema param that triggers JSON enforcement through an Outlines logit processor)
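
To illustrate the second item, a minimal sketch of loading a processor from a user-supplied file at startup (load_logits_processor is a hypothetical helper, and the remote-code-execution caveat raised earlier applies):

import importlib.util
from pathlib import Path


def load_logits_processor(path: str, class_name: str = "LogitsProcessor"):
    """Hypothetical loader: import a Python file at startup and return the
    processor class it defines. This executes arbitrary code, so it should
    only ever be used with trusted files."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)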

tgaddair (Contributor) commented Feb 5, 2024

Draft PR is up in #224. @jeffreyftang will be picking up the work from here to test it out and resolve any issues (of which there are likely to be many, haha). But my hope is we can land this some time this week.

tgaddair (Contributor) commented:

@rlouf @AdithyanI @brucethemoose this has landed, please feel free to try it out. We're going to follow up with some official docs, but the general usage is:

REST:

schema: "<string containing valid JSON schema>"

Python:

schema=<dict containing json schema>
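
For example (hedged: the host, port, and /generate path here are assumptions extrapolated from the earlier curl example; only the schema parameter itself is confirmed above):

import json

import requests
from lorax import Client

schema = {"type": "object", "properties": {"answer": {"type": "string"}}}

# REST: schema is a string containing a valid JSON schema.
resp = requests.post(
    "http://127.0.0.1:8080/generate",  # hypothetical host/port
    json={
        "inputs": "What is the capital of France?",
        "parameters": {"schema": json.dumps(schema)},
    },
)
print(resp.json())

# Python client: schema is a dict containing the JSON schema.
client = Client("http://127.0.0.1:8080")
print(client.generate("What is the capital of France?", schema=schema).generated_text)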

tgaddair (Contributor) commented:

The prebuilt Docker image now comes with Outlines preinstalled as well.

rlouf (Author) commented Feb 13, 2024

This is awesome!

tgaddair (Contributor) commented:

Documentation up here: https://predibase.github.io/lorax/guides/guided_generation/

cc @AdithyanI @brucethemoose

Next up will be support for the OpenAI API and Pydantic schemas in the Python client.

cc @jeffreyftang
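
In the meantime, Pydantic models can already emit JSON schemas, so a stopgap might look like this (a sketch assuming Pydantic v2's model_json_schema() and the schema= parameter shown above):

from lorax import Client
from pydantic import BaseModel


class Answer(BaseModel):
    capital: str


client = Client("http://127.0.0.1:8080")  # hypothetical endpoint

# model_json_schema() returns the JSON schema as a dict, which matches
# the schema=<dict> parameter described above.
response = client.generate(
    "What is the capital of France?",
    schema=Answer.model_json_schema(),
)
print(response.generated_text)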

jeffreyftang (Contributor) commented:

@rlouf I was taking a stab at enabling CFG generation with Outlines (using grammars.json), but consistently got garbage output (usually something like 1. followed by a bunch of newlines or similar).

I also tried running the CFG example from the Outlines readme directly, and it just seemed to hang.

Looks like there are quite a few open grammar-related issues in the Outlines repo; just curious whether you have any insight into what might be going wrong here. Thanks!


brucethemoose commented Feb 16, 2024

Lark grammars are somewhat "dangerous": certain configs can make Outlines (and other implementations) hang. The same goes for llama.cpp's grammars.

The grammar also kind of has to "agree" with the prompt and model so that there are valid logits to choose from.

@jeffreyftang Do you have an example of the exact grammar file/string you used?

And yeah, Outlines may be having some CFG issues on top of that, but I'm not up to speed.

jeffreyftang (Contributor) commented:

@brucethemoose the one where it hung was actually the arithmetic example from the Outlines readme.
