Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions src/praisonai-agents/praisonaiagents/agent/agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -1300,19 +1300,27 @@ def chat(self, prompt, temperature=0.2, tools=None, output_json=None, output_pyd
# Apply guardrail validation for custom LLM response
try:
validated_response = self._apply_guardrail_with_retry(response_text, prompt, temperature, tools, task_name, task_description, task_id)
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return validated_response
Comment on lines +1303 to 1305
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While adding _cleanup_telemetry() here fixes the hang for this specific path, this approach has two issues:

  1. Incompleteness: The fix for chat() is not comprehensive. There are still return paths in the self_reflect block that are missing the cleanup call (e.g., lines 1494, 1499, 1512, 1517), which means the agent can still hang in those scenarios.

  2. Maintainability: Repeating _cleanup_telemetry() before every return is error-prone. It's easy for a developer to add a new return path in the future and forget this call, re-introducing the bug.

A more robust solution is to ensure cleanup happens regardless of how the method exits. You can achieve this using a try...finally block around the entire method's body, or even more cleanly with a decorator.

Recommended Solution (using a decorator):

# Add this decorator at the top of the file or in a utils module
from functools import wraps

def with_telemetry_cleanup(func):
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            return func(self, *args, **kwargs)
        finally:
            self._cleanup_telemetry()
    return wrapper

# Then, apply it to the method and remove all manual _cleanup_telemetry() calls
@with_telemetry_cleanup
def chat(self, prompt, ...):
    # ... existing logic ...

This would fix all missed paths and make the code much cleaner and safer.

except Exception as e:
logging.error(f"Agent {self.name}: Guardrail validation failed for custom LLM: {e}")
# Rollback chat history on guardrail failure
self.chat_history = self.chat_history[:chat_history_length]
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return None
except Exception as e:
# Rollback chat history if LLM call fails
self.chat_history = self.chat_history[:chat_history_length]
display_error(f"Error in LLM chat: {e}")
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return None
except Exception as e:
display_error(f"Error in LLM chat: {e}")
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return None
else:
# Use the new _build_messages helper method
Expand Down Expand Up @@ -1396,11 +1404,15 @@ def chat(self, prompt, temperature=0.2, tools=None, output_json=None, output_pyd
validated_reasoning = self._apply_guardrail_with_retry(response.choices[0].message.reasoning_content, original_prompt, temperature, tools, task_name, task_description, task_id)
# Execute callback after validation
self._execute_callback_and_display(original_prompt, validated_reasoning, time.time() - start_time, task_name, task_description, task_id)
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return validated_reasoning
except Exception as e:
logging.error(f"Agent {self.name}: Guardrail validation failed for reasoning content: {e}")
# Rollback chat history on guardrail failure
self.chat_history = self.chat_history[:chat_history_length]
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return None
# Apply guardrail to regular response
try:
Expand Down Expand Up @@ -1849,6 +1861,8 @@ async def achat(self, prompt: str, temperature=0.2, tools=None, output_json=None
if logging.getLogger().getEffectiveLevel() == logging.DEBUG:
total_time = time.time() - start_time
logging.debug(f"Agent.achat failed in {total_time:.2f} seconds: {str(e)}")
# Ensure proper cleanup of telemetry system to prevent hanging
self._cleanup_telemetry()
return None
Comment on lines +1864 to 1866
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This change adds the cleanup call to the final exception handler, but the fix for achat is incomplete and misses several return paths. This means the agent hanging issue, which this PR aims to solve, will still occur.

Examples of missed paths that don't call _cleanup_telemetry():

  • return validated_response on line 1649
  • return None on line 1654
  • return result on line 1729
  • return response_text on line 1743

Similar to my comment on the chat method, the best way to fix this robustly is to use a pattern that guarantees cleanup on all exit paths. An async decorator would be the ideal solution here.

Recommended Solution (using an async decorator):

# Add this decorator at the top of the file or in a utils module
from functools import wraps

def async_with_telemetry_cleanup(func):
    @wraps(func)
    async def wrapper(self, *args, **kwargs):
        try:
            return await func(self, *args, **kwargs)
        finally:
            self._cleanup_telemetry()
    return wrapper

# Then, apply it to the method and remove all manual _cleanup_telemetry() calls
@async_with_telemetry_cleanup
async def achat(self, ...):
    # ... existing logic ...

This will make the fix comprehensive and prevent future regressions.


async def _achat_completion(self, response, tools, reasoning_steps=False):
Expand Down
Loading