Enhance: Improve error handling in Discord agent #102

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

striver-24 wants to merge 1 commit into jentic:main from striver-24:enhance/discord-error-handling

striver-24 commented Sep 29, 2025 •

edited

Loading

Closes #89

Added a 30-second rate limit per user and better checks for API keys and user input (blocking bad/harmful requests).
Implemented retry logic and exponential backoff to handle Discord's errors, plus a 5-minute timeout to prevent the bot from hanging.
Users get clearer error messages, and now have detailed logs for troubleshooting.


          enhance: Improve error handling and robustness in Discord agent

b3ec896

- Add comprehensive rate limiting for users (30-second cooldown)
- Implement robust Discord API error handling with retry logic
- Enhance API key validation with length and character checks
- Add input validation for user requests (length, content, harmful keywords)
- Improve user-facing error messages with emojis and clear descriptions
- Add timeout handling for agent processing (5-minute limit)
- Implement graceful error recovery and rollback mechanisms
- Add detailed structured logging for debugging and monitoring
- Handle Discord connection/disconnection events with user feedback
- Add exponential backoff for server errors and rate limit compliance

This significantly improves the bot's production readiness and user experience.

striver-24 requested a review from a team as a code owner

September 29, 2025 20:18

striver-24 mentioned this pull request

[Good First Issue] [Enhancement] Improve Discord bot error handling #89

Open

5 tasks

striver-24 changed the title ~~Enhance: Improve error handling and robustness in Discord agent (Issue No. #89)~~ Enhance: Improve error handling in Discord agent

Collaborator

rishikesh-jentic commented Oct 3, 2025

Thanks so much for this PR and for taking the time to contribute to Standard Agent — welcome aboard 🙌
We’ll review and share feedback soon! 🚀

Author

striver-24 commented Oct 3, 2025

Thank you for the warm welcome! I’m thrilled to contribute to Standard Agent and eagerly await your feedback. 😊

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
                              if attempt < max_retries - 1:

                                  await asyncio.sleep(retry_after)

                                  continue

                              elif exc.status == 403: # Forbidden

Collaborator

rishikesh-jentic Oct 8, 2025

The elif exc.status == 403 and elif exc.status >= 500 are paired with the wrong if statement - they're currently at the same level as if attempt < max_retries - 1, should they be at the same level as if exc.status == 429 (one indent back) ?

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
                          if attempt == max_retries - 1:

                              return False

                          await asyncio.sleep(1)

                      return False

Collaborator

rishikesh-jentic Oct 8, 2025

The return False at the end is inside the for loop - this means it will exit after the first iteration that doesn't explicitly return or continue, instead of trying the remaining retries. Is that intentional ?

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
              logger = get_logger(__name__)

              #Rate Limit Tracking

              user_last_request = {}

Collaborator

rishikesh-jentic Oct 8, 2025

Love the addition of rate limiting! 🎯

One thing to consider: the user_last_request dictionary will grow indefinitely as new users interact with the bot. In a long-running bot, this could accumulate thousands of entries over time and cause memory issues.

You might want to add some cleanup logic - would a Deque be worth considering here ?

Just something to think about for production environments where the bot might run for weeks/months!

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
                                  return

                              # Check for potentially harmful content

                              if any(word in goal.lower() for word in ['hack', 'exploit', 'malware', 'virus']):

Collaborator

rishikesh-jentic Oct 8, 2025

I appreciate the thought behind adding content filtering! However, simple keyword-based filtering tends to have some challenges:

False positives: Legitimate questions like "How do I protect against hackers?" would be blocked
Easy to bypass: Simple character substitutions (h4ck, expl0it) make it ineffective
Blocks valid use cases: Security research, educational questions, etc.

I'd suggest either removing this check or considering a more robust solution.

I would like to know your inputs here.

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
                          if reasoning_strategy not in valid:

                              await interaction.response.send_message("Usage: /standard_agent reasoner reasoning_strategy", ephemeral=True)

                          if len(key) < 10:

Collaborator

rishikesh-jentic Oct 8, 2025

Quick question about the API key validation checks (length, character restrictions, etc.) - I'm wondering if these might cause issues down the line?

These checks assume specific format requirements that we can't really guarantee:

API key formats can change over time
Different environments might use different formats
Valid keys might be rejected, or invalid ones might pass these checks

The only way to truly validate an API key is to try using it. I'd suggest either removing these format checks or replacing them with an actual API test call during configuration.

What do you think? Is there a specific key format you're trying to enforce here?

rishikesh-jentic reviewed

View reviewed changes

examples/discord/discord_agent.py

    
                                  )

                                  logger.info("reasoner_changed_no_reload", old=old_profile, new=runtime.chosen_profile)

                          except discord.InteractionResponded:

Collaborator

rishikesh-jentic Oct 8, 2025

I notice there's a recurring pattern for handling discord.InteractionResponded exceptions throughout the code (in configure, kill, reasoner, etc.). A couple of thoughts:

InteractionResponded usually indicates a logic bug - if this exception is being raised, it means we tried to respond to an interaction twice, which shouldn't happen in normal flow. Rather than catching it everywhere, it might be worth investigating why it could occur.
The pattern is repeated ~10 times - this makes the code harder to maintain. If we need to change the error handling logic, we'd have to update it in many places.

Consider either:

Simplifying to just handle the expected exceptions (the InteractionResponded catches might be masking bugs)
Or creating a decorator/wrapper if this error handling is truly necessary

This would make the code cleaner and easier to reason about. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet