
fix: AI dialogue context removes form data #2257

Merged
merged 1 commit on Feb 12, 2025

Conversation

shaohuzhang1
Contributor

fix: AI dialogue context removes form data


f2c-ci-robot bot commented Feb 12, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


f2c-ci-robot bot commented Feb 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -181,6 +181,9 @@ def get_history_message(history_chat_record, dialogue_number, dialogue_type, runtime_node_id):
                 get_message(history_chat_record[index], dialogue_type, runtime_node_id)
                 for index in
                 range(start_index if start_index > 0 else 0, len(history_chat_record))], [])
+        for message in history_message:
+            if isinstance(message.content, str):
+                message.content = re.sub('<form_rander>[\d\D]*?<\/form_rander>', '', message.content)
         return history_message

     def generate_prompt_question(self, prompt):
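The `re.sub` call added in this hunk can be exercised in isolation; a minimal sketch of its behavior (the sample message text is illustrative, not taken from the PR):

```python
import re

# Same pattern as in the diff above, written as a raw string; [\d\D]
# matches any character, including newlines, so multi-line form blocks
# are removed as well.
pattern = r'<form_rander>[\d\D]*?</form_rander>'

content = (
    "Please fill in the form below.\n"
    '<form_rander>{"field": "email"}</form_rander>\n'
    "Thanks!"
)

# The whole <form_rander>...</form_rander> block is stripped while the
# surrounding dialogue text is kept.
cleaned = re.sub(pattern, '', content)
print(cleaned)
```

Note the non-greedy `*?` quantifier: with multiple form blocks in one message, each block is removed individually instead of everything between the first opening and last closing tag.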
Contributor Author

The provided code has a few minor issues that can be addressed:

  1. Re-importing Modules: There is an unnecessary import of re at the beginning of the file, which could cause confusion or potential conflicts since re already appears to be imported on line 14.

  2. Redundancy in Function Calls: The get_message function is called within get_history_message with different parameters (start_index) without its result being used elsewhere. This redundant call may serve no purpose and should be removed or fixed if needed.

  3. Regex Pattern for Removing Form Renderer Content: Removing the form renderer content with a regular expression may not work correctly if the tags span multiple lines or wrap only specific parts of the text. A more precise pattern would ensure cleaner results (note the tag is spelled <form_rander> in the code, not <form_renderer>).

Here are the corrections made based on these observations:

# Keep a single module-level import of re (remove any duplicate below)
import re

from langchain.schema import HumanMessage, SystemMessage
from langchain_core.messages import BaseMessage, AIMessage

from application.flow.common import Answer
from application.flow.i_step_node import NodeResult, INode
from application.flow.step_node.ai_chat_step_node.i_chat_node import IChatNode
from application.flow.tools import Reasoning


def get_history_message(history_chat_record, dialogue_number, dialogue_type, runtime_node_id):
    # Ensure there's always a non-negative start index, keeping at most
    # the last dialogue_number records
    start_index = max(0, len(history_chat_record) - dialogue_number)

    history_message = [
        get_message(
            history_chat_record[index],
            dialogue_type,
            runtime_node_id
        )
        for index in range(start_index, len(history_chat_record))
    ]

    # Apply regex to remove <form_rander>...</form_rander> blocks,
    # including any newlines inside them (re.DOTALL)
    for message in history_message:
        if isinstance(message.content, str):
            message.content = re.sub(r'<form_rander>.*?</form_rander>', '', message.content, flags=re.DOTALL)
    return history_message


def generate_prompt_question(prompt):
    filtered_content = ""
    for message_part in prompt.split('\n'):
        cleaned_message_part = re.sub(r'<form_rander>.*?</form_rander>', '', message_part.strip(), flags=re.DOTALL)
        # Additional processing or formatting can be added here as necessary
        filtered_content += cleaned_message_part + "\n"
    return filtered_content.rstrip()

These changes eliminate redundancy, clean up the logic slightly, and handle the removal of HTML-like tags in a more robust manner. Additionally, the generate_prompt_question function now includes handling for newline characters, ensuring that each part of the prompt (split into multiple lines) undergoes the same transformation before being combined back together.

@shaohuzhang1 shaohuzhang1 merged commit 4874c0e into main Feb 12, 2025
4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@main@fix_ai_chat_exclude_form_data branch February 12, 2025 08:07
@@ -114,6 +115,9 @@ def get_history_message(history_chat_record, dialogue_number):
             [history_chat_record[index].get_human_message(), history_chat_record[index].get_ai_message()]
             for index in
             range(start_index if start_index > 0 else 0, len(history_chat_record))], [])
+        for message in history_message:
+            if isinstance(message.content, str):
+                message.content = re.sub('<form_rander>[\d\D]*?<\/form_rander>', '', message.content)
         return history_message

     def generate_prompt_question(self, prompt):
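The `sum(..., [])` idiom visible in the context lines of this hunk flattens the per-record `[human, ai]` pairs into a single message list; a standalone sketch with placeholder strings standing in for the real message objects:

```python
# Placeholder records standing in for chat records whose
# get_human_message()/get_ai_message() calls return message objects.
records = [("human-1", "ai-1"), ("human-2", "ai-2")]

# sum(list_of_pairs, []) concatenates the two-element lists into one
# flat list, preserving the human/AI alternation per record.
history_message = sum([[human, ai] for human, ai in records], [])
print(history_message)
```

For long histories, `itertools.chain.from_iterable` would avoid the repeated list copying that `sum(..., [])` performs, but the flattened result is identical.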
Contributor Author

Here's a review of your provided code:

Overall Review

  1. Imports: The imports section is correctly ordered with built-in and third-party libraries.

  2. Class Definition:

    • A class definition should include a docstring explaining its purpose and functionality, not just a bare doc: placeholder.
  3. Function Definitions:

    • Each function starts well with a docstring describing what it does.
    • Ensure that all functions have appropriate type hints (e.g., return values).
  4. Code Logic:

    • The logic inside the generate_prompt_question function appears intended to clean messages before further processing, but no clear output is defined; additional logic may be needed to handle the cleaned messages or use them elsewhere.
    • Stray + markers from the diff appear inside the code comments; the comments would read more cleanly with these removed.
  5. Potential Issues:

    • There are no immediate syntax errors.
    • Functionality related to cleaning text using regular expressions should be handled thoroughly since re.sub can sometimes lead to unexpected results or performance issues depending on complex patterns.

Suggestions

  1. Docstrings and Comments:

    • Add comprehensive docstrings to each function to explain their parameters, outputs, and any assumptions made about input data.
    • Consider removing redundant plus signs in comments.
  2. Error Handling:

    • Implement error handling around regex operations to manage cases where no matches are found or other potential errors occur.
  3. Performance Optimization:

    • If the regex is applied frequently, consider creating a cached version of the compiled pattern to improve performance.
  4. Output Management:

    • Define a clearly structured object or list to represent a message after cleaning so that it can be easily reused or processed elsewhere.

Overall, your code is mostly sound, but adding proper documentation, refining error handling, and optimizing performance would make it more robust and maintainable.
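The caching suggestion in point 3 (Performance Optimization) can be sketched as follows; `strip_form_data` is a hypothetical helper name for illustration, not part of the PR:

```python
import re

# Compile the pattern once at module import time instead of passing the
# raw pattern string to re.sub for every message.
_FORM_RENDER_RE = re.compile(r'<form_rander>[\d\D]*?</form_rander>')

def strip_form_data(content: str) -> str:
    """Remove <form_rander>...</form_rander> blocks from message content."""
    return _FORM_RENDER_RE.sub('', content)

print(strip_form_data('a<form_rander>x</form_rander>b'))  # -> ab
```

Python's re module also keeps an internal cache of recently compiled patterns, so the gain is modest, but a module-level compiled pattern makes the cost explicit and avoids the cache lookup on each call.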
