-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: AI dialogue context removes form data #2257
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -181,6 +181,9 @@ def get_history_message(history_chat_record, dialogue_number, dialogue_type, run | |||
get_message(history_chat_record[index], dialogue_type, runtime_node_id) | |||
for index in | |||
range(start_index if start_index > 0 else 0, len(history_chat_record))], []) | |||
for message in history_message: | |||
if isinstance(message.content, str): | |||
message.content = re.sub('<form_rander>[\d\D]*?<\/form_rander>', '', message.content) | |||
return history_message | |||
|
|||
def generate_prompt_question(self, prompt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided code has a few minor issues that can be addressed:
-
Re-importing Modules: There is an unnecessary import of
re
at the beginning of the file, which could cause confusion or potential conflicts since it appears to already be imported in line 14. -
Redundancy in Function Calls: The
get_message
function is called twice withinget_history_messages
, but with different parameters (start_index
) and without using its result elsewhere. This redundant call might not serve any purpose and should potentially be removed or fixed if needed. -
Regex Pattern for Removing Form Renderer Content: In the last two lines of
generate_prompt_question
, removing the form renderer content using a regular expression may not work correctly if<form_renderer>
tags only appear around specific parts of the text. Using a more precise pattern would ensure cleaner results.
Here are the corrections made based on these observations:
# Remove the duplicate import statement for re
from langchain.schema import HumanMessage, SystemMessage
from langchain_core.messages import BaseMessage, AIMessage
from application.flow.common import Answer
from application.flow.i_step_node import NodeResult, INode
from application.flow.step_node.ai_chat_step_node.i_chat_node import IChatNode
from application.flow.tools import Reasoning
def get_history_message(history_chat_record, dialogue_number, dialogue_type, runtime_node_id):
# Ensure there's always a non-negative start index
start_index = max(0, len(history_chat_record) - 1)
history_message = [
get_message(
history_chat_record[index],
dialogue_type,
runtime_node_id
)
for index
in range(start_index if start_index > 0 else 0, len(history_chat_record))
]
# Apply regex to remove <form_rander>...</form_rander>
for message in history_message:
if isinstance(message.content, str):
message.content = re.sub(r'<form_rander>(.*?)</form_rander>', r'\1', message.content)
return history_message
def generate_prompt_question(prompt):
filtered_content = ""
for message_part in prompt.split('\n'):
cleaned_message_part = re.sub(r'<form_rander>(.*?)</form_rander>', r'\1', message_part.strip())
# Additional processing or formatting can be added here as necessary
filtered_content += cleaned_message_part + "\n"
return filtered_content.rstrip()
These changes eliminate redundancy, clean up the logic slightly, and handle the removal of HTML-like tags in a more robust manner. Additionally, the generate_prompt_question
function now includes handling for newline characters, ensuring that each part of the prompt (split into multiple lines) undergoes the same transformation before being combined back together.
@@ -114,6 +115,9 @@ def get_history_message(history_chat_record, dialogue_number): | |||
[history_chat_record[index].get_human_message(), history_chat_record[index].get_ai_message()] | |||
for index in | |||
range(start_index if start_index > 0 else 0, len(history_chat_record))], []) | |||
for message in history_message: | |||
if isinstance(message.content, str): | |||
message.content = re.sub('<form_rander>[\d\D]*?<\/form_rander>', '', message.content) | |||
return history_message | |||
|
|||
def generate_prompt_question(self, prompt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a review of your provided code:
Overall Review
-
Imports: The imports section is correctly ordered with built-in and third-party libraries.
-
Class Definition:
- A class definition should include a docstring explaining its purpose and functionality, not just
doc:
.
- A class definition should include a docstring explaining its purpose and functionality, not just
-
Function Definitions:
- Each function starts well with a docstring describing what it does.
- Ensure that all functions have appropriate type hints (e.g., return values).
-
Code Logic:
- The logic inside the
generate_prompt_question
function seems to be intended to clean messages before further processing, but there is no clear output defined; additional logic might be needed to handle cleaned messages or use them elsewhere. - There are comments within the code (
+
) which could be more readable if integrated directly into the strings where applicable.
- The logic inside the
-
Potential Issues:
- There are no immediate syntax errors.
- Functionality related to cleaning text using regular expressions should be handled thoroughly since re.sub can sometimes lead to unexpected results or performance issues depending on complex patterns.
Suggestions
-
Docstrings and Comments:
- Add comprehensive docstrings to each function to explain their parameters, outputs, and any assumptions made about input data.
- Consider removing redundant plus signs in comments.
-
Error Handling:
- Implement error handling around regex operations to manage cases where no matches are found or other potential errors occur.
-
Performance Optimization:
- If the regex is applied frequently, consider creating a cached version of the compiled pattern to improve performance.
-
Output Management:
- Define a clearly structured object or list to represent a message after cleaning so that it can be easily reused or processed elsewhere.
Overall, your code is mostly sound, but adding proper documentation, refining error handling, and optimizing performance would make it more robust and maintainable.
fix: AI dialogue context removes form data