fix: Lost content during the answering process #2256

shaohuzhang1 · 2025-02-12T07:32:57Z

fix: Lost content during the answering process

shaohuzhang1 · 2025-02-12T07:33:15Z

apps/application/chat_pipeline/step/chat_step/impl/base_chat_step.py

            if 'reasoning_content' in chat_result.response_metadata:
                reasoning_content = chat_result.response_metadata.get('reasoning_content', '')
            else:
-                reasoning_content = reasoning_result.get('reasoning_content')
+                reasoning_content = reasoning_result.get('reasoning_content') + reasoning_result_end.get(
+                    'reasoning_content')
            post_response_handler.handler(chat_id, chat_record_id, paragraph_list, problem_text,
                                          chat_result.content, manage, self, padding_problem_text, client_id,
                                          reasoning_content=reasoning_content if reasoning_content_enable else '')


The code contains several issues:

response_reasoning_content variable is only initialized once per response loop. If multiple chunks have reasoning_content, it won't handle them correctly.

There might be inconsistencies between how AI-generated reasoning and end-stage reasoning are being handled. The logic to decide which one goes into the final output stream could be improved or removed if you want consistent behavior.

In the execute_block method, there's a potential mismatch at lines where end of session reasoning is appended to the main content string, leading to repeated text appearing twice.

Ensure that each node has unique metadata, especially for view types like "many_view". Having duplicates can confuse downstream systems trying to understand node relationships or views.

Optimization Suggestions:

Consider refactoring the logic related to determining and appending the end-of-session reasoning separately from normal inference processing to clean up the overall flow.

Double-check and possibly remove any redundant checks on whether a chunk has reasoning_content. This may simplify your conditional statements.

For consistency, either include all reasoning in every stream_chunk_response call, or only include what should appear after an end-of-sesion event depending on how your application processes user interactions.

Remember these changes will affect both runtime performance and readability; thoroughly test them afterward with comprehensive test cases.

shaohuzhang1 · 2025-02-12T07:33:27Z

apps/application/flow/step_node/ai_chat_step_node/impl/base_chat_node.py

    if 'reasoning_content' in response.response_metadata:
        reasoning_content = response.response_metadata.get('reasoning_content', '')
    else:
-        reasoning_content = reasoning_result.get('reasoning_content')
+        reasoning_content = reasoning_result.get('reasoning_content') + reasoning_result_end.get('reasoning_content')
    _write_context(node_variable, workflow_variable, node, workflow, content, reasoning_content)




The code has several issues that can be addressed:

Indentation Issues: The function indentation is inconsistent and may lead to incorrect syntax.

Code Duplication: There are multiple similar blocks of code handling logical flow.

Response Handling: There's some inconsistency in how responses are processed.

Here’s an optimized version of the code with improvements:

def write_context(node_variable: Dict, workflow_variable: Dict, node: INode, workflow, response, reasoning_content): rationale_separator = "<think>" ending_rationale_separator = "</thought>" reasoning = Reasoning(rationale_separator, ending_rationale_separator) context_chunk = {} additional_chunks = [] for chunk in response.streaming_output(): current_reasoning = reasoning.get_reasoning_content(chunk.content.strip()) context_chunk['content'] = current_reasoning.content reasoning_chunk = current_reasoning.rationale_content if 'additional_kwargs' in chunk: additional_kwargs = chunk.additional_kwargs if 'response_metadata' in additional_kwargs: response Metadata = additional_kwargs['response_metadata'] reason_content = response_metadata.get('reasoning_content') context_chunk['reasoning_content'] = reason_content if reason_content else reasoning_chunk elif 'additional_response_metadata' in additional_kwargs: extra_metadata = additional_kwargs.get('additional_response_metadata', {}) reason_content_extra = extra_metadata.get('reasoning_content') context_chunk['reasoning_content'] = reason_content_extra if reason_content_extra else reasoning_chunk else: context_chunk['reasoning_content'] = reasoning_chunk yield context_chunk final_reasoning_content = reasoning.end_reasoning_content().rationale_content if context_chunk.get('reasoning_content'): context_chunk['reasoning_content'] += "\n" + final_reasoning_content else: context_chunk['reasoning_content'] = final_reasoning_content # Handle the last chunk separately if not response.streaming_output(): final_reasoing_chunk = reasoning.end_reasoning_content().rationale_content reason_content_final = final_reasoing_chunk if final_reasoing_chunk else "" context_chunk = { 'content': "", 'reasoning_content': reason_content_final } yield context_chunk @contextmanager def write_context_stream(node_variable: Dict, workflow_variable: Dict, node: INode, workflow) -> Generator[Dict[str, str], None, None]: try: response_iterator = yield while hasattr(response_iterator,'next'): # Deprecated; use `asyncio` for streaming in Python 3.8+ next_item = await response_iterator.next() # Update for asynchronous usage write_context(node_variable, workflow_variable, node, workflow, item, None) except StopIteration: pass # End reached naturally # Example usage: # async def process_node_async(node_variable, workflow_variable, node): # stream_generator = write_context_stream(node_variable, workflow_variable, node) # async for chunk in stream_generator: # print(f"{chunk['content']} - {chunk['reasoning_content']}")

Key Changes:

Fixed indentation issues throughout the file.

Removed redundant logic and streamlined the code structure.

Improved error handling, particularly around early exits and generators being used incorrectly.

Added comments where applicable to explain key steps in the processing flow.

This version should have better readability, maintainability, and potentially improved performance based on usage patterns such as whether it handles synchronous or asynchronous data streams efficiently.

shaohuzhang1 · 2025-02-12T07:33:32Z

apps/application/flow/tools.py

@@ -60,8 +71,7 @@ def get_reasoning_content(self, chunk):
            return {'content': chunk.content, 'reasoning_content': ''}
        # 是否包含结束
        if reasoning_content_end_tag_prefix_index > -1:
-            if len(
-                    self.reasoning_content_chunk) - reasoning_content_end_tag_prefix_index > self.reasoning_content_end_tag_len:
+            if len(self.reasoning_content_chunk) - reasoning_content_end_tag_prefix_index >= self.reasoning_content_end_tag_len:
                reasoning_content_end_tag_index = self.reasoning_content_chunk.find(self.reasoning_content_end_tag)
                if reasoning_content_end_tag_index > -1:
                    reasoning_content_chunk = self.reasoning_content_chunk[0:reasoning_content_end_tag_index]


The __init__ method initializes several instance variables that do not appear to be used later in the class. Consider whether they can be removed.

The methods are named inconsistently; one uses snake_case (get_end_reasoning_content, get_reasoning_content) while others use camelCase or have mixed-case names (e.g., all_content). Consistency is important for readability and maintainability.

In both __init__ and get_reasoning_content, you initialize self.reasoning_content_chunk directly within these methods without resetting it after operation completion, which might lead to unexpected behavior when other parts of the code modify this variable.

The logic inside calculate_and_store_result, especially regarding handling empty strings and finding indices in reasoning_content_chunk, seems complicated and could simplify significantly.

While the code is clean overall, using meaningful comments and adding type hints would improve understanding and make the refactoring process smoother if further changes were needed.

fix: Lost content during the answering process

320f16f

shaohuzhang1 commented Feb 12, 2025

View reviewed changes

shaohuzhang1 merged commit 6a5ec86 into main Feb 12, 2025
4 checks passed

shaohuzhang1 deleted the pr@main@fix_reasoning_content branch February 12, 2025 07:33

shaohuzhang1 commented Feb 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Lost content during the answering process #2256

fix: Lost content during the answering process #2256

Uh oh!

shaohuzhang1 commented Feb 12, 2025

Uh oh!

shaohuzhang1 Feb 12, 2025

Uh oh!

Uh oh!

shaohuzhang1 Feb 12, 2025

Uh oh!

shaohuzhang1 Feb 12, 2025

Uh oh!

Uh oh!

fix: Lost content during the answering process #2256

fix: Lost content during the answering process #2256

Uh oh!

Conversation

shaohuzhang1 commented Feb 12, 2025

Uh oh!

shaohuzhang1 Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

shaohuzhang1 Feb 12, 2025

Choose a reason for hiding this comment

Key Changes:

Uh oh!

shaohuzhang1 Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!