Skip to content

Conversation

gwaramadze
Copy link
Contributor

Problem

The Quix Portal API calls were failing with DNS resolution errors (httpx.ConnectError) during temporary network issues, causing applications to crash instead of gracefully handling transient connectivity problems.

Solution

Implemented a retry mechanism with exponential backoff for connection-related errors:

  • Retry decorator with 5 attempts and exponential backoff (2s, 4s, 8s, 16s, 32s)
  • Applied to all 8 HTTP methods in QuixPortalApiService
  • Handles both DNS errors and timeouts while preserving fast-fail for legitimate API errors (404, 401, etc.)
  • Comprehensive logging for retry attempts and failures

Changes

  • Added retry_on_connection_error decorator in quixstreams/platforms/quix/api.py
  • Applied decorator to all Portal API methods (get_librdkafka_connection_config, get_workspace, etc.)
  • Added 4 tests covering success scenarios, max retries, and different error types

This resolves intermittent DNS and connection timeout issues when connecting to Quix Cloud, making the SDK more resilient to network instability.

@daniil-quix daniil-quix merged commit d8346f4 into main Sep 16, 2025
4 checks passed
@daniil-quix daniil-quix deleted the fix/retry-dns-errors branch September 16, 2025 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants