
Track and format chat history #9

Draft · wants to merge 2 commits into master from add-chat-history
Conversation

@rlouf (Member) commented Jul 31, 2024

Given the multiplicity of formats, formatting the prompt for chat workflows with open models can be a real hassle and is error-prone. In this PR we introduce a Chat class that allows users to track the conversation and easily print the corresponding prompt.

Closes #4

We can initialize a chat session with an optional system prompt:

from prompts import Chat

chat = Chat("You are a nice assistant")

This system prompt can always be updated later on:

chat = Chat("You are a nice assistant")
chat.system = "You are an evil assistant"

We can add user and assistant messages to the session:

from prompts import Message

chat += Message("user", "Hi!")
chat += Message("assistant", "Hi!")

Adding two successive user or assistant messages will raise an exception. To prevent user errors we will create two types, User and Assistant, to replace Message("user", ...) and Message("assistant", ...):

chat += User("hi!")
chat += Assistant("hi!")
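A minimal sketch of what these aliases could look like (the User, Assistant, and Message names come from this PR; the implementation below is only an assumption, not the actual code):

from dataclasses import dataclass

@dataclass
class Message:
    role: str
    content: str

# Hypothetical aliases; the PR may implement these differently.
class User(Message):
    def __init__(self, content: str):
        super().__init__("user", content)

class Assistant(Message):
    def __init__(self, content: str):
        super().__init__("assistant", content)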

We can also easily "fork" discussions:

chat = Chat()
...
new_chat = chat + Message("user", "Mew")
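The fork behaviour suggests that + returns a new Chat while += extends the current one. A rough sketch of how that could be implemented (an assumption on my part, not code from this PR):

import copy

class Chat:
    def __init__(self, system=None):
        self.system = system
        self.history = []  # list of Message objects

    def __iadd__(self, message):
        # In-place append: chat += msg extends the current conversation.
        self.history.append(message)
        return self

    def __add__(self, message):
        # Fork: chat + msg copies the conversation and appends to the
        # copy, leaving the original chat untouched.
        new_chat = copy.deepcopy(self)
        new_chat.history.append(message)
        return new_chat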

And finally, we can render the conversation using the model's chat template with:

chat.render("model")

Note that before rendering, the render method calls an internal filter method that can trim the discussion (a pass-through by default). To implement a memory-less agent you would subclass Chat this way:

class MarkovianChat(Chat):

    def filter(self):
        return self.history[-1]
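As a rough mental model of how render and filter could interact, here is a toy sketch under the assumption that filter returns the list of messages to render (this is not the PR's implementation, and the rendering is a plain join rather than a real chat template):

class SketchChat:
    def __init__(self, system=None):
        self.system = system
        self.history = []  # list of (role, content) tuples

    def filter(self):
        # Pass-through by default: render the whole history.
        return self.history

    def render(self, model_name):
        # A real implementation would look up and apply the chat
        # template for model_name; here we simply join the messages
        # that survive filtering.
        messages = self.filter()
        lines = [f"system: {self.system}"] if self.system else []
        lines += [f"{role}: {content}" for role, content in messages]
        return "\n".join(lines)

class MarkovianSketchChat(SketchChat):
    def filter(self):
        # Memory-less agent: keep only the most recent message.
        return self.history[-1:]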

For vision models we can re-use the Vision objects we defined in the new version of Outlines:

chat += user(Vision("prompt", image))

TODO

  • Tests
  • Add user and assistant aliases
  • Templated discussion

@rlouf rlouf added the enhancement New feature or request label Jul 31, 2024
@rlouf rlouf self-assigned this Jul 31, 2024
@rlouf rlouf force-pushed the add-chat-history branch 2 times, most recently from 148c274 to b65dc33 on September 27, 2024 at 17:01
We should download the relevant files from HF. I don't think we can
avoid implementing the Jinja2 templates for each model family though.
Would need to use regular expressions instead of full names (might be slow).
@rlouf rlouf force-pushed the add-chat-history branch 2 times, most recently from 307f662 to 65546fe on September 27, 2024 at 17:07
@cpfiffer

Jan does this over in the Julia world with PromptingTools.jl, and it's extremely ergonomic. PromptingTools has one of the best interfaces I've seen, so big fan of this.

https://github.com/svilupp/PromptingTools.jl

@willkurt

It's useful to compare this to AutoTokenizer.apply_chat_template, which currently just takes a list of dicts but accomplishes the same thing without needing to learn a new class.
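For reference, here is the list-of-dicts approach being compared against (standard transformers usage; the model name is just an example):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

messages = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "Tell me a joke."},
]

# Returns the prompt string formatted with the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)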

The biggest issue I probably have with apply_chat_template is that each model has different rules. So, for example, your point that "Adding two successive user or assistant messages will raise an exception" only applies to Mistral as far as I know (or at least it doesn't apply to every instruct setup). Phi-3 seems to ignore system prompts (but I can only tell this by printing the prompt out), etc.

I do see some advantages in this proposed approach, namely:

  • I can never remember the roles, so having them be properties rather than strings is nice.
    • Especially good if some roles aren't supported (or in the possible case of additional roles)
  • Certainly is less verbose than manually typing dicts

Additionally, I do like the template example you have in chat. However, it seems to me that templating and constructing prompts themselves should be independent from managing the chat/instruct interface?

Now, if the Chat class were initialized with the model (rather than it being passed to render), we could enforce these rules when messages are appended (as proposed in that example), which is much nicer than only getting that feedback when you pass in all the messages. Especially in a notebook environment it would be very helpful to have something like:

chat = Chat('phi-3...')
chat.append(roles.system())
>>> Sorry, Phi-3 does not allow system prompts, this will be ignored.

chat = Chat('mistral-7b...')
chat.append(roles.user())
chat.append(roles.user())
>>> Sorry, Mistral requires the roles to alternate between 'assistant' and 'user'
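A rough sketch of what such model-aware validation could look like (the rules and names below are illustrative assumptions, not an actual implementation):

# Illustrative only: hard-coded rules standing in for per-model behaviour.
MODEL_RULES = {
    "phi-3": {"allows_system": False, "alternating_roles": False},
    "mistral-7b": {"allows_system": True, "alternating_roles": True},
}

class ValidatingChat:
    def __init__(self, model_name):
        self.rules = MODEL_RULES[model_name]
        self.messages = []

    def append(self, role, content):
        if role == "system" and not self.rules["allows_system"]:
            print("Sorry, this model does not allow system prompts; ignoring.")
            return
        last_role = self.messages[-1]["role"] if self.messages else None
        if self.rules["alternating_roles"] and last_role == role:
            print("Sorry, this model requires roles to alternate.")
            return
        self.messages.append({"role": role, "content": content})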

Of course this diminishes the generality of the Chat as well. Ideally I'm assuming we would like an instance of Chat to be rendered however we like.

Basically, the tl;dr here is that we should make sure the value add is strong enough to justify the user switching from a very well-known and well-supported data structure (list of dicts) to our custom class.

@rlouf (Member, Author) commented Sep 29, 2024

Just to give some context, the interface was designed so rendering can be done automatically in outlines:

from prompts import Chat, user
from outlines import models

openai = models.openai("gpt-4o")

chat = Chat()
chat.system = "You are an AI"
chat += user("Hi!")

result = openai(chat)

where chat.render("openai") is called by the openai function.
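In other words, the model callable dispatches on the input type, roughly like this (a sketch of the intended integration, not outlines' actual code; the helper name is made up):

from prompts import Chat

def call_openai_model(prompt_or_chat):
    # A Chat instance gets rendered with the provider's chat format;
    # anything else is assumed to already be a plain prompt string.
    if isinstance(prompt_or_chat, Chat):
        rendered = prompt_or_chat.render("openai")
    else:
        rendered = prompt_or_chat
    # A real implementation would send `rendered` to the OpenAI API;
    # we just return it here for illustration.
    return rendered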
