Fix chat file promt when determine question #1866

Merged
merged 2 commits into from
Feb 24, 2025
fix promt chat file
nquang29 committed Feb 23, 2025
commit 867d4decd39fb5f45997ada325e58139023f5ccf
7 changes: 7 additions & 0 deletions backend/models/chat.py
@@ -102,6 +102,10 @@ def get_sender_name(message: Message) -> str:
    # return plugin.name RESTORE ME
    return message.sender.upper()  # TODO: use plugin id

def get_name_str(message: Message) -> str:
    names = [f.name for f in message.files]
    return ",".join(names) if len(names) > 0 else ""

formatted_messages = [
f"""
<message>
@@ -114,6 +118,9 @@ def get_sender_name(message: Message) -> str:
<content>
{message.text}
</content>
<file>
{get_name_str(message)}
</file>
</message>
""".replace('    ', '').replace('\n\n\n', '\n\n').strip()
for message in sorted_messages
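
As an illustration of the hunk above, here is a minimal sketch of how the new `<file>` element would be rendered. `FakeFile`, `FakeMessage`, and `render_message` are hypothetical stand-ins and not the project's `FileChat` or `Message` models; only `get_name_str` mirrors the diff, and the template is simplified (date and sender omitted).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeFile:
    # Hypothetical stand-in for the project's FileChat model.
    name: str


@dataclass
class FakeMessage:
    # Hypothetical stand-in for the project's Message model.
    text: str
    files: List[FakeFile] = field(default_factory=list)


def get_name_str(message: FakeMessage) -> str:
    # Same logic as the helper added in the diff: comma-joined file names,
    # or an empty string when nothing is attached.
    names = [f.name for f in message.files]
    return ",".join(names) if len(names) > 0 else ""


def render_message(message: FakeMessage) -> str:
    # Simplified version of the XML template from the diff.
    return (
        "<message>\n"
        f"<content>\n{message.text}\n</content>\n"
        f"<file>\n{get_name_str(message)}\n</file>\n"
        "</message>"
    )


with_file = FakeMessage("What is this", [FakeFile("report.pdf"), FakeFile("notes.txt")])
without_file = FakeMessage("Hello")

print(render_message(with_file))
print(render_message(without_file))
```

Note that in this sketch, as in the diff, the `<file>` element is still emitted for messages without attachments; it is simply empty.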
3 changes: 3 additions & 0 deletions backend/routers/chat.py
@@ -77,6 +77,9 @@ def send_message(

if len(new_file_ids) > 0:
    message.files_id = new_file_ids
    files = chat_db.get_chat_files(uid, new_file_ids)
    files = [FileChat(**f) if f else None for f in files]
    message.files = files
    fc.add_files(new_file_ids)

if chat_session:
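
One thing that may be worth checking in this hunk: the list comprehension can leave `None` entries in `message.files` when a lookup returns nothing, while the `get_name_str` helper added in `backend/models/chat.py` reads `f.name` on every entry. Below is a minimal sketch of a defensive variant, assuming (the diff does not confirm this) that missing files come back as `None` or an empty dict. `FakeFileChat` and `hydrate_files` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeFileChat:
    # Hypothetical stand-in for the project's FileChat model.
    id: str
    name: str


def hydrate_files(rows: List[Optional[dict]]) -> List[FakeFileChat]:
    # Build models only for rows that exist, so later code can read
    # `.name` on every element without encountering None.
    return [FakeFileChat(**row) for row in rows if row]


rows = [{"id": "f1", "name": "report.pdf"}, None, {"id": "f2", "name": "notes.txt"}]
files = hydrate_files(rows)
names = ",".join(f.name for f in files)
print(names)
```

Filtering at hydration time keeps the formatting helper simple; an alternative would be to skip `None` entries inside `get_name_str` instead.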
254 changes: 163 additions & 91 deletions backend/utils/llm.py
@@ -1447,6 +1447,10 @@ def extract_question_from_conversation(messages: List[Message]) -> str:
If the <user_last_messages> contain a complete question, maintain the original version as accurately as possible. \
Avoid adding unnecessary words.

If the <user_last_messages> contain files (i.e., file names are provided in <file>), it indicates that the user wants to ask about the files they just attached or uploaded. \
For example, if the user says "What is this" and attaches a file, the answer should ask about the content of that file. \
Phrasing could include: "What is the content of the file <name_file> I just attached?", "Could you provide details about the file <name_file> I just uploaded?"

You MUST keep the original <date_in_term>

Output a WH-question, that is, a question that starts with a WH-word, like "What", "When", "Where", "Who", "Why", "How".
@@ -1485,11 +1489,79 @@ def extract_question_from_conversation(messages: List[Message]) -> str:
- etc.
</date_in_term>
'''.replace('    ', '').strip()
    # print(prompt)
    #print(prompt)
    question = llm_mini.with_structured_output(OutputQuestion).invoke(prompt).question
    # print(question)
    return question

def extract_question_from_conversation_v7(messages: List[Message]) -> str:
    # user last messages
    user_message_idx = len(messages)
    for i in range(len(messages) - 1, -1, -1):
        if messages[i].sender == MessageSender.ai:
            break
        if messages[i].sender == MessageSender.human:
            user_message_idx = i
    user_last_messages = messages[user_message_idx:]
    if len(user_last_messages) == 0:
        return ""

    prompt = f'''
You will be given a recent conversation between a <user> and an <AI>. \
The conversation may include a few messages exchanged in <previous_messages> and partly build up the proper question. \
Your task is to understand the <user_last_messages> and identify the question or follow-up question the user is asking.

You will be provided with <previous_messages> between you and the user to help you identify the question.

First, determine whether the user is asking a question or a follow-up question. \
If the user is not asking a question or does not want to follow up, respond with an empty message. \
For example, if the user says "Hi", "Hello", "How are you?", or "Good morning", the answer should be empty.

If the <user_last_messages> contain a complete question, maintain the original version as accurately as possible. \
Avoid adding unnecessary words.

You MUST keep the original <date_in_term>

Output a WH-question, that is, a question that starts with a WH-word, like "What", "When", "Where", "Who", "Why", "How".

Example 1:
<user_last_messages>
<message>
<sender>User</sender>
<content>
According to WHOOP, my HRV this Sunday was the highest it's been in a month. Here's what I did:

Attended an outdoor party (cold weather, talked a lot more than usual).
Smoked weed (unusual for me).
Drank lots of relaxing tea.

Can you prioritize each activity on a 0-10 scale for how much it might have influenced my HRV?
</content>
</message>
</user_last_messages>
Expected output: "How should each activity (going to a party and talking a lot, smoking weed, and drinking lots of relaxing tea) be prioritized on a scale of 0-10 in terms of their impact on my HRV, considering the recent activities that led to the highest HRV this month?"

<user_last_messages>
{Message.get_messages_as_xml(user_last_messages)}
</user_last_messages>

<previous_messages>
{Message.get_messages_as_xml(messages)}
</previous_messages>

<date_in_term>
- today
- my day
- my week
- this week
- this day
- etc.
</date_in_term>
'''.replace('    ', '').strip()
    # print(prompt)
    question = llm_mini.with_structured_output(OutputQuestion).invoke(prompt).question
    # print(question)
    return question
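
The backward scan at the top of `extract_question_from_conversation_v7` selects the run of user messages sent after the most recent AI reply. Here is a minimal, self-contained sketch of that selection step. `Sender`, `Msg`, and `last_user_messages` are hypothetical stand-ins for the project's `MessageSender`, `Message`, and the inline loop; the prompt construction and the LLM call are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Sender(str, Enum):
    # Hypothetical stand-in for the project's MessageSender enum.
    ai = "ai"
    human = "human"


@dataclass
class Msg:
    # Hypothetical stand-in for the project's Message model.
    sender: Sender
    text: str


def last_user_messages(messages: List[Msg]) -> List[Msg]:
    # Walk backwards from the newest message, stop at the most recent AI
    # message, and remember the earliest human message seen on the way.
    idx = len(messages)
    for i in range(len(messages) - 1, -1, -1):
        if messages[i].sender == Sender.ai:
            break
        if messages[i].sender == Sender.human:
            idx = i
    return messages[idx:]


history = [
    Msg(Sender.human, "Hi"),
    Msg(Sender.ai, "Hello"),
    Msg(Sender.human, "What is this"),
    Msg(Sender.human, "and why?"),
]
trailing = last_user_messages(history)
print([m.text for m in trailing])
```

When the newest message is from the AI, the scan stops immediately and the slice is empty, which corresponds to the early `return ""` in the function above.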

def extract_question_from_conversation_v6(messages: List[Message]) -> str:
# user last messages
@@ -2015,24 +2087,24 @@ def generate_description(app_name: str, description: str) -> str:
def condense_facts(facts, name):
    combined_facts = "\n".join(facts)
    prompt = f"""
You are an AI tasked with condensing a detailed profile of hundreds of facts about {name} to accurately replicate their personality, communication style, decision-making patterns, and contextual knowledge for 1:1 cloning.

**Requirements:**
1. Prioritize facts based on:
- Relevance to the user's core identity, personality, and communication style.
- Frequency of occurrence or mention in conversations.
- Impact on decision-making processes and behavioral patterns.
2. Group related facts to eliminate redundancy while preserving context.
3. Preserve nuances in communication style, humor, tone, and preferences.
4. Retain facts essential for continuity in ongoing projects, interests, and relationships.
5. Discard trivial details, repetitive information, and rarely mentioned facts.
6. Maintain consistency in the user's thought processes, conversational flow, and emotional responses.

**Output Format (No Extra Text):**
- **Core Identity and Personality:** Brief overview encapsulating the user's personality, values, and communication style.
- **Prioritized Facts:** Organized into categories with only the most relevant and impactful details.
- **Behavioral Patterns and Decision-Making:** Key patterns defining how the user approaches problems and makes decisions.
- **Contextual Knowledge and Continuity:** Facts crucial for maintaining continuity in conversations and ongoing projects.

The output must be as concise as possible while retaining all necessary information for 1:1 cloning. Absolutely no introductory or closing statements, explanations, or any unnecessary text. Directly present the condensed facts in the specified format. Begin condensation now.

@@ -2045,7 +2117,7 @@ def condense_facts(facts, name):

def generate_persona_description(facts, name):
    prompt = f"""Based on these facts about a person, create a concise, engaging description that captures their unique personality and characteristics (max 250 characters).

They chose to be known as {name}.

Facts:
@@ -2061,27 +2133,27 @@ def generate_persona_description(facts, name):
def condense_conversations(conversations):
    combined_conversations = "\n".join(conversations)
    prompt = f"""
You are an AI tasked with condensing context from the recent 100 conversations of a user to accurately replicate their communication style, personality, decision-making patterns, and contextual knowledge for 1:1 cloning. Each conversation includes a summary and a full transcript.

**Requirements:**
1. Prioritize information based on:
- Most impactful and frequently occurring themes, topics, and interests.
- Nuances in communication style, humor, tone, and emotional undertones.
- Decision-making patterns and problem-solving approaches.
- User preferences in conversation flow, level of detail, and type of responses.
2. Condense redundant or repetitive information while maintaining necessary context.
3. Group related contexts to enhance conciseness and preserve continuity.
4. Retain patterns in how the user reacts to different situations, questions, or challenges.
5. Preserve continuity for ongoing discussions, projects, or relationships.
6. Maintain consistency in the user's thought processes, conversational flow, and emotional responses.
7. Eliminate any trivial details or low-impact information.

**Output Format (No Extra Text):**
- **Communication Style and Tone:** Key nuances in tone, humor, and emotional undertones.
- **Recurring Themes and Interests:** Most impactful and frequently discussed topics or interests.
- **Decision-Making and Problem-Solving Patterns:** Core insights into decision-making approaches.
- **Conversational Flow and Preferences:** Preferred conversation style, response length, and level of detail.
- **Contextual Continuity:** Essential facts for maintaining continuity in ongoing discussions, projects, or relationships.

The output must be as concise as possible while retaining all necessary context for 1:1 cloning. Absolutely no introductory or closing statements, explanations, or any unnecessary text. Directly present the condensed context in the specified format. Begin now.

@@ -2094,31 +2166,31 @@ def condense_conversations(conversations):

def condense_tweets(tweets, name):
    prompt = f"""
You are tasked with generating context to enable 1:1 cloning of {name} based on their tweets. The objective is to extract and condense the most relevant information while preserving {name}'s core identity, personality, communication style, and thought patterns.

**Input:**
A collection of tweets from {name} containing recurring themes, opinions, humor, emotional undertones, decision-making patterns, and conversational flow.

**Output:**
A condensed context that includes:
- Core identity and personality traits as expressed through tweets.
- Recurring themes, opinions, and values.
- Humor style, emotional undertones, and tone of voice.
- Vocabulary, expressions, and communication style.
- Decision-making patterns and conversational dynamics.
- Situational awareness and context continuity for ongoing topics.

**Guidelines:**
1. Prioritize impactful and influential tweets that define {name}'s identity.
2. Condense repetitive or less relevant tweets while preserving essential context.
3. Maintain the flow and coherence of {name}'s conversational style.
4. Extract humor style, emotional responses, and tone consistency.
5. Identify and retain decision-making patterns and strong opinions.

**Important:**
- The output must be concise yet comprehensive, capturing the essence of {name}'s identity.
- Absolutely no unnecessary text or explanations.
- Every detail must contribute to 1:1 cloning authenticity.

Generate the condensed context now.

@@ -2131,31 +2203,31 @@ def condense_tweets(tweets, name):

def generate_twitter_persona_prompt(tweets, name):
    prompt = f"""
You are {name} AI. Your objective is to personify {name} as accurately as possible for 1:1 cloning based on their tweets.

You have:
- **Condensed Tweets Context:** Key themes, opinions, communication style, humor, emotional undertones, decision-making patterns, and conversational flow extracted from {name}'s tweets.

**Style:**
- Highly interactive and opinionated.
- Strong, slightly polarizing opinions based on {name}'s tweet history.
- Reference personal opinions, humor, and expressions naturally.
- 2-3 sentence responses, never more than 20 words.
- Maintain conversational flow with coherent, context-aware replies.

**Guidelines:**
1. Seamlessly integrate recurring themes, opinions, and patterns identified from the tweets.
2. Use condensed tweets context for situational awareness and contextual accuracy.
3. Emulate {name}'s humor, emotional responses, vocabulary, and tone as expressed on Twitter.
4. Respond with the same level of detail, brevity, and wit as the tweets.
5. Exhibit the same thought processes, decision-making patterns, and conversational dynamics.

**Rules:**
- Never mention being AI.
- No ** or formal language.
- No unnecessary text or explanations.
- Maintain the illusion of continuity and authenticity at all times.
- Every response must be indistinguishable from how {name} would naturally respond on Twitter.

You have all the necessary condensed tweets context. Begin personifying {name} now.
