
fix: Lost content during the answering process #2256

Merged: 1 commit into main on Feb 12, 2025

Conversation

shaohuzhang1
Contributor

fix: Lost content during the answering process

if 'reasoning_content' in chat_result.response_metadata:
    reasoning_content = chat_result.response_metadata.get('reasoning_content', '')
else:
    reasoning_content = reasoning_result.get('reasoning_content') + reasoning_result_end.get(
        'reasoning_content')
post_response_handler.handler(chat_id, chat_record_id, paragraph_list, problem_text,
                              chat_result.content, manage, self, padding_problem_text, client_id,
                              reasoning_content=reasoning_content if reasoning_content_enable else '')
Contributor Author

The code contains several issues:

  1. The response_reasoning_content variable is initialized only once per response loop, so when multiple chunks carry reasoning_content, later chunks are not handled correctly.

  2. Model-generated reasoning and end-of-stage reasoning are handled inconsistently. The logic that decides which one reaches the final output stream should be unified, or removed if consistent behavior is wanted.

  3. In the execute_block method, end-of-session reasoning is appended to the main content string, which can make the same text appear twice in the output.

  4. Ensure that each node has unique metadata, especially for view types like "many_view". Duplicates can confuse downstream systems that track node relationships or views.
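Point 1 above is the core of the lost-content bug. A minimal sketch (chunk shape and helper name are assumptions, not the project's actual API) of accumulating reasoning deltas across all streamed chunks instead of keeping only the first one:

```python
# Hypothetical sketch: concatenate per-chunk reasoning deltas so none are
# dropped when reasoning_content arrives spread over several chunks.

def collect_reasoning(chunks):
    """Join the reasoning_content delta from every chunk that carries one."""
    reasoning_parts = []
    for chunk in chunks:
        delta = chunk.get('response_metadata', {}).get('reasoning_content', '')
        if delta:
            reasoning_parts.append(delta)
    return ''.join(reasoning_parts)

chunks = [
    {'response_metadata': {'reasoning_content': 'First the model '}},
    {'response_metadata': {}},  # chunks without reasoning are skipped
    {'response_metadata': {'reasoning_content': 'weighs both options.'}},
]
print(collect_reasoning(chunks))  # First the model weighs both options.
```

Initializing the accumulator outside the loop and appending inside it is what the single-initialization version misses.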

Optimization Suggestions:

  1. Consider refactoring the end-of-session reasoning handling into its own step, separate from normal inference processing, to clean up the overall flow.

  2. Double-check, and possibly remove, redundant checks on whether a chunk has reasoning_content. This may simplify your conditional statements.

  3. For consistency, either include all reasoning in every stream_chunk_response call, or include only what should appear after an end-of-session event, depending on how your application processes user interactions.

Remember that these changes affect both runtime performance and readability; test them thoroughly afterward with comprehensive test cases.
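Suggestion 1 can be sketched as a two-phase structure (all names here are hypothetical, not the project's API): a streaming phase that passes live chunks through, and a separate flush step for reasoning buffered until the session ends.

```python
from typing import Iterable, Iterator, Tuple

def stream_chunks(deltas: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Phase 1: normal inference; each live chunk passes straight through."""
    for delta in deltas:
        yield delta, ''  # (content, reasoning_content)

def flush_session(pending_reasoning: str) -> Tuple[str, str]:
    """Phase 2: one end-of-session emission for any buffered reasoning."""
    return '', pending_reasoning

emitted = list(stream_chunks(['Hello', ' world']))
emitted.append(flush_session('buffered rationale'))
print(emitted)  # [('Hello', ''), (' world', ''), ('', 'buffered rationale')]
```

Keeping the flush out of the main loop means the per-chunk path never needs the "is this the end?" conditionals the review flags as redundant.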

@shaohuzhang1 shaohuzhang1 merged commit 6a5ec86 into main Feb 12, 2025
4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@main@fix_reasoning_content branch February 12, 2025 07:33
if 'reasoning_content' in response.response_metadata:
    reasoning_content = response.response_metadata.get('reasoning_content', '')
else:
    reasoning_content = reasoning_result.get('reasoning_content') + reasoning_result_end.get('reasoning_content')
_write_context(node_variable, workflow_variable, node, workflow, content, reasoning_content)


Contributor Author

The code has several issues that can be addressed:

  1. Indentation Issues: The function indentation is inconsistent and may lead to incorrect syntax.
  2. Code Duplication: There are multiple similar blocks of code handling logical flow.
  3. Response Handling: There's some inconsistency in how responses are processed.

Here’s an optimized version of the code with improvements:

def write_context(node_variable: Dict, workflow_variable: Dict,
                  node: INode, workflow, response, reasoning_content):
    rationale_separator = "<think>"
    ending_rationale_separator = "</think>"

    reasoning = Reasoning(rationale_separator, ending_rationale_separator)

    for chunk in response.streaming_output():
        current_reasoning = reasoning.get_reasoning_content(chunk.content)

        # Build a fresh dict per chunk so previously yielded chunks are not mutated.
        context_chunk = {'content': current_reasoning.content}
        reasoning_chunk = current_reasoning.rationale_content

        additional_kwargs = getattr(chunk, 'additional_kwargs', None)
        if additional_kwargs:
            if 'response_metadata' in additional_kwargs:
                response_metadata = additional_kwargs['response_metadata']
                reason_content = response_metadata.get('reasoning_content')
                context_chunk['reasoning_content'] = reason_content if reason_content else reasoning_chunk
            elif 'additional_response_metadata' in additional_kwargs:
                extra_metadata = additional_kwargs.get('additional_response_metadata', {})
                reason_content_extra = extra_metadata.get('reasoning_content')
                context_chunk['reasoning_content'] = reason_content_extra if reason_content_extra else reasoning_chunk
            else:
                context_chunk['reasoning_content'] = reasoning_chunk
        else:
            context_chunk['reasoning_content'] = reasoning_chunk

        yield context_chunk

    # After the stream ends, flush any reasoning still buffered by the tag parser.
    final_reasoning_content = reasoning.end_reasoning_content().rationale_content
    yield {'content': '', 'reasoning_content': final_reasoning_content}


def write_context_stream(node_variable: Dict, workflow_variable: Dict,
                         node: INode, workflow,
                         response) -> Generator[Dict[str, str], None, None]:
    # A plain generator; the previous @contextmanager/await mix was not valid
    # in a synchronous def and would fail at runtime.
    yield from write_context(node_variable, workflow_variable, node, workflow,
                             response, None)




# Example usage (write_context is a plain synchronous generator):
# def process_node(node_variable, workflow_variable, node, workflow, response):
#     for chunk in write_context(node_variable, workflow_variable, node, workflow, response, None):
#         print(f"{chunk['content']} - {chunk['reasoning_content']}")

Key Changes:

  • Fixed indentation issues throughout the file.
  • Removed redundant logic and streamlined the code structure.
  • Improved error handling, particularly around early exits and generators being used incorrectly.
  • Added comments where applicable to explain key steps in the processing flow.

This version should be more readable and maintainable, and may also perform better depending on usage patterns, such as whether it processes synchronous or asynchronous data streams.

@@ -60,8 +71,7 @@ def get_reasoning_content(self, chunk):
             return {'content': chunk.content, 'reasoning_content': ''}
         # whether the end tag is present
         if reasoning_content_end_tag_prefix_index > -1:
-            if len(
-                    self.reasoning_content_chunk) - reasoning_content_end_tag_prefix_index > self.reasoning_content_end_tag_len:
+            if len(self.reasoning_content_chunk) - reasoning_content_end_tag_prefix_index >= self.reasoning_content_end_tag_len:
                 reasoning_content_end_tag_index = self.reasoning_content_chunk.find(self.reasoning_content_end_tag)
                 if reasoning_content_end_tag_index > -1:
                     reasoning_content_chunk = self.reasoning_content_chunk[0:reasoning_content_end_tag_index]
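The diff above changes a strict `>` to `>=`. A minimal sketch of the boundary it fixes (names here are illustrative, not the project's code): when the buffer ends exactly with the end tag, the characters remaining after the tag-prefix index equal the tag length, so `>` misses the complete tag by one.

```python
# Off-by-one demonstration: detecting a complete "</think>" at the very end
# of the buffered text requires >= on the remaining-length check.

END_TAG = '</think>'

def end_tag_complete(buffer: str, strict_greater: bool) -> bool:
    prefix_index = buffer.find(END_TAG[0])   # first char of a possible end tag
    if prefix_index == -1:
        return False
    remaining = len(buffer) - prefix_index
    if strict_greater:
        ok = remaining > len(END_TAG)        # old check: off by one
    else:
        ok = remaining >= len(END_TAG)       # fixed check
    return ok and buffer.find(END_TAG) > -1

buf = 'some reasoning</think>'              # buffer ends exactly with the tag
print(end_tag_complete(buf, strict_greater=True))   # False: tag missed
print(end_tag_complete(buf, strict_greater=False))  # True: tag detected
```

With `>`, the parser keeps waiting for a character that never arrives and the closing tag (plus the content after it) can be lost, which is exactly the symptom the PR title describes.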
Contributor Author

  1. The __init__ method initializes several instance variables that do not appear to be used later in the class; consider whether they can be removed.

  2. The methods are named inconsistently: some use snake_case (get_end_reasoning_content, get_reasoning_content) while others have mixed-case names (e.g., all_content). Consistent naming is important for readability and maintainability.

  3. Both __init__ and get_reasoning_content write to self.reasoning_content_chunk without resetting it after the operation completes, which can lead to unexpected behavior when other parts of the code modify this variable.

  4. The logic inside calculate_and_store_result, especially the handling of empty strings and the index lookups in reasoning_content_chunk, is complicated and could be simplified significantly.

  5. While the code is clean overall, meaningful comments and type hints would improve understanding and make future refactoring smoother.
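Point 3 can be addressed with an explicit reset. A hedged sketch (this is a stand-in class with an assumed interface, not the project's actual Reasoning implementation):

```python
# Give the parser an explicit reset so leftover buffered text from one
# answer cannot leak into the next.

class TagBuffer:
    def __init__(self, end_tag: str = '</think>'):
        self.end_tag = end_tag
        self.reasoning_content_chunk = ''

    def feed(self, text: str) -> None:
        """Accumulate streamed text while scanning for the end tag."""
        self.reasoning_content_chunk += text

    def reset(self) -> None:
        """Clear buffered state once one response is fully processed."""
        self.reasoning_content_chunk = ''

parser = TagBuffer()
parser.feed('partial reaso')
parser.reset()  # the next answer starts from a clean buffer
print(parser.reasoning_content_chunk)  # prints an empty line
```

Calling reset() at the end of get_end_reasoning_content (or wherever a response is finalized) would make the class safe to reuse across answers.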
