How to use a prompt for text analysis? #145
Hi, okay, so it seems there was a misunderstanding about how the prompt works. The chat template you are using is supposed to work like this:
To better understand it, see it like this:
I hope this answers your doubts!
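The single-turn shape being described can be sketched in plain Python. The helper name is mine, not part of any library; the `[INST]` markers and the `<s>` BOS token follow the Mistral Instruct format discussed in this thread:

```python
# Minimal sketch of a single-turn Mistral Instruct prompt: the user's
# instruction wrapped in [INST] ... [/INST], preceded by the BOS token.
# `build_prompt` is an illustrative helper, not a library function.
def build_prompt(instruction: str) -> str:
    return f"<s>[INST] {instruction} [/INST]"

print(build_prompt("Summarize this document."))
# <s>[INST] Summarize this document. [/INST]
```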
@pandora-s-git Thanks for your answer! But I still do not fully understand the usage, so let me explain what I do so far. My goal is to analyze and extract content from business documents, and Mistral-7B-Instruct did a really good job so far! My first prompt template looks like this:
And
Would you say that the placement is correct? I then generate much more complex prompts from the first prompt's result, using much more additional information and few-shot learning. But this is hard to construct when you do not understand how Mistral-7B treats the different kinds of prompt tokens. For example: is it allowed to use
This is close; by string it's like a full completion. Let me try to re-explain what everything means. As you may know, LLMs are text-completion machines, so they only complete text. We fine-tuned them to complete dialogs so we can chat with them; these are the Instruct versions. Basically we need to train them on something similar to a chat like:
But to make the keywords more unique, we renamed them with specific strings: here [INST] marks the start of the user's instruction, and [/INST] marks its end, and thus the start of the assistant's response. However, the model needs a token that lets us know when it's finished, because otherwise it will just keep trying to complete the text; that's the EOS token (End Of String), and we also have one for the beginning (BOS). I think it's better for your use case to have something like this:
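The full training-time shape of one exchange, as explained above, can be sketched like this. The token spellings `<s>`/`</s>` and the exact spacing are assumptions about the format, and `format_exchange` is an illustrative name:

```python
# One complete exchange as the model would see it during fine-tuning:
# BOS, the instruction markers, the assistant's answer, then EOS.
BOS, EOS = "<s>", "</s>"

def format_exchange(user_msg: str, assistant_msg: str) -> str:
    return f"{BOS}[INST] {user_msg} [/INST] {assistant_msg}{EOS}"

print(format_exchange("Hi", "Hello"))
# <s>[INST] Hi [/INST] Hello</s>
```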
The model will then complete and finish with an EOS (if it doesn't, maybe your code removes it by default). Basically, if you have a dialog with an LLM it should look like this:
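Extending that to a multi-turn dialog, one hedged sketch: each completed turn is closed with EOS, and the final user turn is left open so the model generates the next answer. The function and the pair-based input format are my own illustration:

```python
# Sketch of a multi-turn dialog prompt. `turns` is a list of
# (user, assistant) pairs; pass None as the assistant of the last
# pair to leave the prompt open for the model to complete.
def format_dialog(turns):
    out = "<s>"
    for user, assistant in turns:
        out += f"[INST] {user} [/INST]"
        if assistant is not None:
            # Completed turns end with the EOS token.
            out += f" {assistant}</s>"
    return out

print(format_dialog([("Hi", "Hello"), ("How are you?", None)]))
# <s>[INST] Hi [/INST] Hello</s>[INST] How are you? [/INST]
```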
There are chat templates you can use directly; it's better to use the Jinja template that is available in the tokenizer, for example. (Check the repo 'mistral-common'; it should help you.)
Ok, thanks a lot! This really helps me understand Mistral-7B-Instruct much better, and now I'm beginning to make progress :-)
I'm glad you liked it. Here is a chat template that may be of use for you:
If you don't know how to read this, check how to use a Jinja template and you should find some clues.
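To see what such a template does, you can render one with the `jinja2` library directly. The template string below is a simplified stand-in for the real one shipped in the tokenizer, not an exact copy:

```python
# Render a simplified Mistral-style chat template with Jinja2.
# TEMPLATE is an illustrative approximation of the tokenizer's template.
from jinja2 import Template

TEMPLATE = (
    "{{ bos_token }}"
    "{% for m in messages %}"
    "{% if m['role'] == 'user' %}[INST] {{ m['content'] }} [/INST]"
    "{% else %} {{ m['content'] }}{{ eos_token }}{% endif %}"
    "{% endfor %}"
)

rendered = Template(TEMPLATE).render(
    bos_token="<s>",
    eos_token="</s>",
    messages=[
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi there"},
    ],
)
print(rendered)
# <s>[INST] Hello [/INST] Hi there</s>
```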
I just want to share the final prompt templates that I use now to summarize the content of complex business documents.
Categorize Document: First I use a template that just categorizes the content by company name and language.
Summarize Document: Next I use the following template to summarize the data of the document.
The results are very good now. Of course, every word in the instruction counts!
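The two-step pipeline described here (categorize, then feed the result into a summarize prompt) can be sketched as follows. The template wording and function names are my own guesses at the shape, since the actual templates were not preserved in this thread:

```python
# Step 1: extract the company name and language of a document.
# Step 2: use those results to build a richer summarization prompt.
# Both prompts follow the single-turn [INST] format; wording is illustrative.
def categorize_prompt(document: str) -> str:
    return (
        "<s>[INST] Extract the company name and the language of the "
        f"following document.\n\n{document} [/INST]"
    )

def summarize_prompt(document: str, company: str, language: str) -> str:
    return (
        f"<s>[INST] The document below is from {company} and written in "
        f"{language}. Summarize its content.\n\n{document} [/INST]"
    )
```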
I am still confused about how to use the prompt with Mistral-7b-instruct if I want to analyze the content of a text, such as summarizing or categorizing it. So in my prompt I have a text and an instruction, and I want to extract information from the given text. It is not the often-discussed chat use case.
I am currently using the following Prompt Template for my approach:
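The template referenced here was lost in this scrape. One common shape for such an analysis prompt, with the context and the instruction together inside a single [INST] block, might look like this; it is an assumption, not the author's exact template:

```python
# Hypothetical analysis prompt: context first, instruction last,
# both inside one [INST] block.
def analysis_prompt(context: str, instruction: str) -> str:
    return f"<s>[INST] {context}\n\n{instruction} [/INST]"

print(analysis_prompt("CTX", "Summarize."))
# <s>[INST] CTX
#
# Summarize. [/INST]
```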
Is this a correct way to build a prompt for text analysis with Mistral-7b-instruct? Or should I separate the context and the instruction in some other way? Note that I also do not use the EOS here; is this a problem? I ask because the results vary depending on whether I put the context before or after the instruction, and I did not find a guideline for such a scenario.
I am also referring to this official documentation page, which is hard to understand:
What does this sentence mean? What is the difference between a string and a regular string?
Thanks for any tips.