Fix google calendar and notion errors #768
@manojag115 is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel. A member of the Team first needs to authorize it.
Review by RecurseML
🔍 Review performed on 6c94ffe..48e6466
| Severity | Location | Issue |
|---|---|---|
| | surfsense_backend/app/tasks/connector_indexers/base.py:171 | Unhandled datetime parsing error |
| | surfsense_backend/app/tasks/connector_indexers/google_calendar_indexer.py:229 | Unhandled datetime parsing error |
✅ Files analyzed, no issues (3)
• surfsense_backend/app/tasks/celery_tasks/connector_tasks.py
• surfsense_backend/app/tasks/connector_indexers/notion_indexer.py
• surfsense_backend/app/tasks/connector_indexers/webcrawler_indexer.py
    "adjusting end date to next day to ensure valid date range"
    )
    # Parse end_date and add 1 day
    end_dt = datetime.strptime(end_date_str, "%Y-%m-%d")
Critical runtime error: datetime.strptime() will crash with ValueError if end_date_str is not in 'YYYY-MM-DD' format. This occurs when both start_date and end_date are provided by the user in a non-standard format (lines 122-123 return them as-is without validation). If these equal dates are in ISO format (e.g., '2024-01-21T00:00:00+00:00') or any other format, the code will crash when trying to parse them at line 171.
The flow: User provides both dates → Line 123 returns them without validation → Lines 165-171 detect they're equal and try to parse → ValueError crash if format is not 'YYYY-MM-DD'.
Fix: Add try-except around datetime.strptime() or validate/normalize date format before this check.
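The try-except option suggested above could look roughly like the sketch below. The function name `bump_end_date_if_equal` is hypothetical and does not come from the PR; the only behavior taken from the thread is the strict `"%Y-%m-%d"` parse and the one-day adjustment. On an unexpected format the sketch leaves the value untouched rather than crashing, which is one possible policy, not necessarily the one the PR adopted.

```python
from datetime import datetime, timedelta


def bump_end_date_if_equal(start_date_str: str, end_date_str: str) -> str:
    """Hypothetical helper: move end date one day forward when it equals start date.

    Values that are not in YYYY-MM-DD format are returned unchanged
    instead of raising ValueError.
    """
    if start_date_str != end_date_str:
        return end_date_str
    try:
        end_dt = datetime.strptime(end_date_str, "%Y-%m-%d")
    except ValueError:
        # Unexpected format: skip the adjustment rather than crash the sync.
        return end_date_str
    return (end_dt + timedelta(days=1)).strftime("%Y-%m-%d")
```

With this guard, an ISO-formatted value such as '2024-01-21T00:00:00+00:00' passes through unmodified, so the caller still has to decide whether an equal start and end in that format is acceptable.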
    "adjusting end date to next day to ensure valid date range"
    )
    # Parse end_date and add 1 day
    end_dt = datetime.strptime(end_date_str, "%Y-%m-%d")
Critical runtime error: datetime.strptime() will crash with ValueError if end_date_str is not in 'YYYY-MM-DD' format. This Google Calendar indexer has its own date calculation logic (lines 179-218) that returns user-provided dates directly at lines 217-218 without format validation. If a user provides dates in ISO format (e.g., '2024-01-21T00:00:00+00:00') or any other non-standard format, and they happen to be equal, the code will crash at line 229 when trying to parse with the strict 'YYYY-MM-DD' format.
The flow: User provides both dates in non-standard format → Line 218 assigns them as-is → Line 223 detects they're equal → Line 229 tries to parse → ValueError crash.
Fix: Add try-except around datetime.strptime() or validate/normalize date format before this check.
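The other option named in the fix, normalizing the date format before the equality check, might be sketched as follows. `normalize_date` is a hypothetical name, and the choice to accept ISO 8601 datetimes through `datetime.fromisoformat` is an assumption about which non-standard inputs matter; the thread only gives '2024-01-21T00:00:00+00:00' as an example.

```python
from datetime import datetime


def normalize_date(value: str) -> str:
    """Hypothetical helper: coerce a date string to YYYY-MM-DD.

    Accepts plain YYYY-MM-DD or an ISO 8601 datetime; raises ValueError
    for anything else so the caller can reject the input early.
    """
    try:
        return datetime.strptime(value, "%Y-%m-%d").strftime("%Y-%m-%d")
    except ValueError:
        pass
    # Keeps the calendar date as written; no timezone conversion is applied.
    return datetime.fromisoformat(value).date().isoformat()
```

Running both user-provided dates through such a function before the comparison would mean the later strict parse only ever sees 'YYYY-MM-DD' strings.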
@manojag115 If a Notion page contains transcription/AI blocks, will it still skip those blocks and index the other available blocks? The Google Calendar fix should already work; can you please check?
@AnishSarkar22 Good catch on the Google Calendar fix! You're right, it's already handled in search_source_connectors_routes for the manual sync path. My changes to base.py and google_calendar_indexer.py are redundant for route-triggered syncs, but provide defense for periodic/scheduled syncs that may call the indexer directly via calculate_date_range. Happy to remove them if you prefer to keep it DRY. For Notion: Yes, it does skip unsupported blocks and continue indexing the rest of the page. The handling is in notion_history.py:
The errors we're seeing in prod are likely edge cases where the unsupported block is at the page root and Notion fails the entire blocks.children.list call before returning any blocks. My change just improves the error message clarity to indicate it's a known Notion API limitation, not an application bug.
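As an illustration of the kind of message clarification described above, here is a minimal sketch. Everything in it is hypothetical: the function name, the message wording, and in particular the assumption that the error text raised for a failed blocks.children.list call contains the block type name. The actual change in notion_indexer.py may detect the case differently.

```python
# Block types named in this PR as unsupported by the Notion API.
UNSUPPORTED_BLOCK_TYPES = ("transcription", "ai_block")


def describe_notion_error(error: Exception) -> str:
    """Hypothetical helper: turn a raw error into a clearer user-facing message."""
    message = str(error)
    lowered = message.lower()
    for block_type in UNSUPPORTED_BLOCK_TYPES:
        # Assumption: the API error text mentions the offending block type.
        if block_type in lowered:
            return (
                f"Notion API limitation: block type '{block_type}' is not "
                f"supported by the API, so this page could not be fetched. "
                f"Original error: {message}"
            )
    return f"Notion indexing failed: {message}"
```

Keeping the original error text in the message preserves the detail needed for debugging while still flagging the failure as a known limitation.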
@manojag115 I think the Google Calendar fix can be removed; it is redundant.
Yeah, it's better to show the improved message for better clarity to the user. Thank you so much for your hard work.
@CREDO23 Please review this PR and let me know if we need any changes.
CREDO23 left a comment
@manojag115, please resolve recurseml[bot] comments!
    "adjusting end date to next day to ensure valid date range"
    )
    # Parse end_date and add 1 day
    end_dt = datetime.strptime(end_date_str, "%Y-%m-%d")
@CREDO23 the recurseml comments are now fixed.
Description
Fixes 2 production errors affecting connector sync operations and adds debug logging to help diagnose remaining issues.
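• Google Calendar: fixes the date range error when start_date equals end_date (when last_indexed_at is today)
• Notion: improves error handling for unsupported block types (transcription, ai_block) to clarify these are known Notion API limitations, not application errors
• Webcrawler: adds debug logging, including the INITIAL_URLS raw value
• Celery: adds a _handle_greenlet_error() helper to Celery tasks with specific detection and logging for async/sync context issues
Motivation and Context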
FIX #
The issues being fixed, or for which debug logs are being added, are:
Screenshots
API Changes
Change Type
Testing Performed
Verified locally that the Google Calendar fix works. The Notion error is expected, so that change is only basic error handling.
Checklist
High-level PR Summary
This PR fixes production errors affecting connector sync operations and enhances debugging capabilities. It resolves a Google Calendar date range validation error that occurred when start_date equals end_date by automatically adjusting the end date to be one day later. For Notion, it improves error handling by distinguishing unsupported block types (transcription, ai_block) as known API limitations rather than application errors. Additionally, the PR adds comprehensive debug logging for webcrawler connector URL issues and introduces a helper function to detect and log SQLAlchemy greenlet errors with detailed context for easier troubleshooting.
⏱️ Estimated Review Time: 15-30 minutes
💡 Review Order Suggestion
1. surfsense_backend/app/tasks/connector_indexers/base.py
2. surfsense_backend/app/tasks/connector_indexers/google_calendar_indexer.py
3. surfsense_backend/app/tasks/connector_indexers/notion_indexer.py
4. surfsense_backend/app/tasks/connector_indexers/webcrawler_indexer.py
5. surfsense_backend/app/tasks/celery_tasks/connector_tasks.py
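For reviewers reaching connector_tasks.py, the greenlet detection mentioned in the summary could be sketched along these lines. This is not the PR's `_handle_greenlet_error()`; the names, parameters, and log wording below are hypothetical. The only outside fact relied on is that SQLAlchemy reports this class of problem with a `MissingGreenlet` exception whose message mentions `greenlet_spawn`. The sketch matches on the class name and message text so that it runs without importing SQLAlchemy.

```python
import logging

logger = logging.getLogger(__name__)


def is_greenlet_error(error: Exception) -> bool:
    """Heuristic check for SQLAlchemy async/sync context errors."""
    return type(error).__name__ == "MissingGreenlet" or "greenlet" in str(error).lower()


def handle_greenlet_error(error: Exception, task_name: str, connector_id: int) -> bool:
    """Hypothetical helper: log greenlet errors with context.

    Returns True when the error was recognized and logged, False otherwise,
    so the caller can fall back to its generic error handling.
    """
    if not is_greenlet_error(error):
        return False
    logger.error(
        "Greenlet error in task %s for connector %s: %s. This usually means "
        "async database I/O was attempted from a synchronous context.",
        task_name,
        connector_id,
        error,
    )
    return True
```

A Celery task could call this in its except block and re-raise afterwards, so the extra logging does not change the task's failure behavior.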