
Fix/fix rate limiter #1067

Closed
Jay-tech456 wants to merge 2 commits into main from fix/fix-rate-limiter

Conversation


@Jay-tech456 Jay-tech456 commented Feb 7, 2026

This pull request hardens the gateway’s security and performance by implementing a multi-layered behavioral protection system alongside critical asynchronous optimizations. Highlights include the introduction of Global Velocity Protection to mitigate DoS and IP-rotation attacks via request fingerprinting, and the resolution of a blocking runtime error in the Redis metrics service. By offloading synchronous database and pricing operations to thread pools across the chat completion path, these changes eliminate event loop starvation and significantly reduce the occurrence of 499 timeout errors during periods of high concurrency.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added advanced DoS protection with IP-based rate limiting (including datacenter detection), behavioral fingerprinting to identify bots, and adaptive velocity mode that activates during high error rates.
  • Improvements

    • Enhanced system responsiveness through asynchronous execution of blocking operations.
    • Improved connection error detection for more reliable service recovery.

Greptile Overview

Greptile Summary

  • Adds a new SecurityMiddleware that rate-limits by IP tiering + request fingerprinting and introduces a global “velocity mode” that tightens limits during elevated error rates.
  • Wires the security middleware into the FastAPI app early in the middleware chain (before timeout/observability), with a Redis-backed path and an in-memory fallback.
  • Offloads multiple synchronous operations to asyncio.to_thread across admin/catalog/chat routes, and introduces calculate_cost_async to avoid blocking cost/pricing calls.
  • Updates Redis metrics recording to use asyncio.to_thread for some Redis operations and fixes a missing asyncio import.
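The "global velocity mode" mentioned above can be sketched as a sliding-window error-rate monitor. The 10%-error-rate-over-60-seconds trigger and the 50% tightening are taken from the PR's own walkthrough; the class and method names are illustrative, not the project's actual implementation:

```python
import time
from collections import deque
from typing import Optional

class VelocityMonitor:
    """Sliding-window error-rate monitor (sketch of the described velocity mode)."""

    def __init__(self, window_s: float = 60.0, threshold: float = 0.10):
        self.window_s = window_s
        self.threshold = threshold
        self.events = deque()  # (timestamp, is_error) pairs

    def record(self, is_error: bool, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self.events.append((now, is_error))
        # Drop events that have aged out of the window.
        cutoff = now - self.window_s
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()

    def effective_limit(self, base_limit: int, now: Optional[float] = None) -> int:
        """Halve every limit while the recent error rate exceeds the threshold."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_s
        recent = [err for ts, err in self.events if ts >= cutoff]
        if recent and sum(recent) / len(recent) > self.threshold:
            return base_limit // 2  # velocity mode active: tighten by 50%
        return base_limit
```

In a distributed deployment the counters would live in Redis rather than in-process, which is exactly the multi-worker concern raised later in the review.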

Confidence Score: 2/5

  • This PR is not safe to merge as-is due to a startup-breaking import and correctness issues in the new rate-limiting middleware and Redis metrics threading model.
  • Score reflects one definite runtime startup failure (get_async_redis_client ImportError) plus two functional issues that will affect production behavior (spoofable XFF rate-limits and local-cache cleanup resetting buckets) and a likely concurrency/thread-safety issue executing a Redis pipeline across threads.
  • src/main.py, src/middleware/security_middleware.py, src/services/redis_metrics.py

Important Files Changed

Filename Overview
src/config/supabase_config.py Adds an extra connection-indicator string ('bad file descriptor') to classify recoverable connection errors.
src/main.py Adds SecurityMiddleware early in the stack, but imports a non-existent get_async_redis_client() causing startup ImportError.
src/middleware/security_middleware.py Introduces new behavioral rate-limiting middleware (IP tiering, fingerprinting, velocity mode) but trusts spoofable X-Forwarded-For and has a local-cache cleanup bug that can reset other buckets.
src/routes/admin.py Moves several database/user operations onto threads via asyncio.to_thread to reduce event-loop blocking in admin endpoints.
src/routes/api_models.py Offloads fetch_specific_model (and related gateway model fetch) to a thread with asyncio.to_thread to avoid blocking in /api/models/detail.
src/routes/catalog.py Offloads cached model/provider reads to threads in several catalog endpoints to reduce event-loop blocking under load.
src/routes/chat.py Switches cost calculation to async (calculate_cost_async) and offloads more synchronous operations to threads in the chat completion paths.
src/services/pricing.py Adds async-friendly pricing/cost calculation paths using asyncio.to_thread for DB/cache lookups and introduces calculate_cost_async.
src/services/redis_metrics.py Fixes missing asyncio import and moves some Redis operations to asyncio.to_thread, but executes a pre-built pipeline in a worker thread which may violate redis client thread-safety.

Sequence Diagram

sequenceDiagram
    participant Client
    participant FastAPI as FastAPI App
    participant Sec as SecurityMiddleware
    participant Timeout as RequestTimeoutMiddleware
    participant Route as Chat Route (/v1/chat/completions)
    participant Pricing as pricing.calculate_cost_async
    participant Redis as Redis (rate-limit/metrics)

    Client->>FastAPI: HTTP request
    FastAPI->>Sec: dispatch(request)
    Sec->>Sec: _get_client_ip() / _generate_fingerprint()
    Sec->>Redis: INCR sec_rl:ip:* + EXPIRE
    Sec->>Redis: INCR sec_rl:fp:* + EXPIRE
    alt over limit
        Sec-->>Client: 429 Too Many Requests
    else allowed
        Sec->>Timeout: call_next(request)
        Timeout->>Route: execute handler
        Route->>Pricing: await calculate_cost_async(model,tokens)
        Pricing-->>Route: cost
        Route-->>Timeout: response
        Timeout-->>Sec: response
        Sec->>Sec: _record_request_outcome(status)
        Sec->>Sec: _check_and_activate_velocity_mode()
        Sec-->>Client: response
    end
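The `_generate_fingerprint()` step in the diagram can be sketched as a hash over stable request headers; the walkthrough names User-Agent, Accept-Language, and Accept-Encoding as the inputs, while the hashing details here are an assumption:

```python
import hashlib

# Header set taken from the PR walkthrough; ordering is part of the fingerprint.
FINGERPRINT_HEADERS = ("user-agent", "accept-language", "accept-encoding")

def generate_fingerprint(headers: dict) -> str:
    """Derive a short, stable behavioral fingerprint from request headers."""
    raw = "|".join(headers.get(h, "") for h in FINGERPRINT_HEADERS)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

Requests from the same client software hash to the same bucket even when the source IP rotates, which is what lets the middleware catch IP-rotation attacks.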

…ath.

Key improvements include:

  • Async-ified all critical operations: cost calculation, anonymous request handling, routing, and activity logging now run asynchronously.
  • Offloaded blocking I/O: all synchronous database and Redis calls are now offloaded to thread pools using asyncio.to_thread.
  • Improved performance: implemented caching for rate-limit checks and optimized Redis metrics recording.
  • Added the missing import asyncio to fix the NameError that would occur at runtime.
  • Implemented Global Velocity Mode in security_middleware.py, adding the full implementation of the Global Velocity Protection feature.
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@supabase

supabase bot commented Feb 7, 2026

This pull request has been ignored for the connected project ynleroehyrmaafkgjgmr because there are no changes detected in supabase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.



@coderabbitai

coderabbitai bot commented Feb 7, 2026

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive DoS protection via a new SecurityMiddleware class with tiered IP rate limiting, behavioral fingerprinting, and velocity mode activation. It integrates the middleware into the app startup with Redis-backed distributed support and in-memory fallback. Additionally, it converts synchronous database and cache operations to asynchronous execution using asyncio.to_thread across multiple routes and services to prevent event loop blocking. A connection error indicator is added to improve resilience.

Changes

Cohort / File(s) Summary
Security Middleware Infrastructure
src/middleware/security_middleware.py
New comprehensive DoS protection middleware implementing tiered IP rate limiting (60 RPM standard, 10 RPM datacenter), behavioral fingerprinting via User-Agent/Accept-Language/Encoding, and global velocity mode that reduces all limits by 50% when error rates exceed 10% over 60-second window. Supports Redis-backed distributed state with in-memory fallback, skips internal health/metrics endpoints, and emits structured logs and Prometheus metrics.
Middleware Integration & Connection Resilience
src/main.py, src/config/supabase_config.py
Integrates SecurityMiddleware early in middleware stack with conditional Redis initialization; falls back to in-memory security if Redis unavailable. Adds "bad file descriptor" to connection error detection for improved database connection resilience.
Async Route Conversions
src/routes/admin.py, src/routes/api_models.py, src/routes/catalog.py, src/routes/chat.py
Converts blocking database and cache calls (create_enhanced_user, get_user, fetch_specific_model, get_cached_providers/models) to async execution via asyncio.to_thread. Adds async cost calculation in chat completion paths to maintain non-blocking event loop behavior.
Async Service Conversions
src/services/pricing.py, src/services/redis_metrics.py
Wraps synchronous database/cache lookups and Redis operations (zscore, pipeline execution) with asyncio.to_thread to prevent event loop blocking during pricing lookups and health score updates.
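Both conversion cohorts follow the same pattern: the blocking call is handed to the default thread pool via asyncio.to_thread and awaited, keeping the event loop free. A minimal sketch — get_user here is a stand-in for a blocking Supabase lookup, not the project's real function:

```python
import asyncio

def get_user(api_key: str) -> dict:
    # Stand-in for a blocking database lookup (illustrative).
    return {"api_key": api_key, "credits": 10}

async def handler(api_key: str) -> dict:
    # Offloading keeps the event loop free to serve other requests meanwhile.
    return await asyncio.to_thread(get_user, api_key)
```

Each `to_thread` call occupies a thread-pool slot for the duration of the blocking operation, so very hot paths may eventually warrant a native async client instead.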

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant SecurityMiddleware
    participant Redis
    participant App
    
    Client->>SecurityMiddleware: HTTP Request
    
    alt Redis Available
        SecurityMiddleware->>Redis: Check IP rate limit<br/>(default: 60 RPM)
        Redis-->>SecurityMiddleware: Current count
    else No Redis
        SecurityMiddleware->>SecurityMiddleware: Check in-memory<br/>IP limit
    end
    
    alt IP Limit Exceeded
        SecurityMiddleware-->>Client: 429 Too Many Requests
    else IP Limit OK
        SecurityMiddleware->>SecurityMiddleware: Generate behavior<br/>fingerprint
        
        alt Redis Available
            SecurityMiddleware->>Redis: Check fingerprint limit<br/>(100 RPM)
            Redis-->>SecurityMiddleware: Current count
        else No Redis
            SecurityMiddleware->>SecurityMiddleware: Check in-memory<br/>fingerprint limit
        end
        
        alt Fingerprint Limit Exceeded
            SecurityMiddleware-->>Client: 429 Too Many Requests
        else Fingerprint OK
            SecurityMiddleware->>App: Forward Request
            App-->>SecurityMiddleware: Response
            SecurityMiddleware->>SecurityMiddleware: Record outcome<br/>& check velocity mode
            SecurityMiddleware-->>Client: HTTP Response
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Poem

🐰 A middleware hops in with a hop and a bound,
Rate limits and fingerprints all around!
With Redis for speed and fallbacks so keen,
Our threads now async—no blocking in between!
DoS attacks scatter as fast as we flee! 🚀

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 71.43%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title Check — ❓ Inconclusive: the title "Fix/fix rate limiter" is vague and does not convey what the changeset accomplishes. Consider a more specific title such as "Implement velocity-based rate limiting with async optimizations" or "Add behavioral DoS protection and async offloading".
✅ Passed checks (1 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.



@greptile-apps greptile-apps bot left a comment


9 files reviewed, 4 comments


Comment on src/main.py, lines +200 to +203
# OPTIMIZED: Add security middleware first to block bots before heavy logic
from src.config.redis_config import get_async_redis_client
try:
    redis_client = get_async_redis_client()

Broken Redis client import

create_app() imports and calls get_async_redis_client() (from src.config.redis_config import get_async_redis_client), but src/config/redis_config.py only exposes get_redis_client() (no async variant). This will raise ImportError during app startup, preventing the service from booting.

Fix by importing the correct function (or adding the async client factory in redis_config.py and updating callers accordingly).


Comment on src/middleware/security_middleware.py, lines +67 to +72
async def _get_client_ip(self, request: Request) -> str:
    """Extract client IP with support for proxies."""
    forwarded_for = request.headers.get("X-Forwarded-For")
    if forwarded_for:
        return forwarded_for.split(",")[0].strip()
    return request.client.host if request.client else "unknown"

Trusts spoofable client IP

_get_client_ip() unconditionally trusts X-Forwarded-For and uses the first hop as the client IP. If this service is reachable directly (or behind a proxy that doesn’t strip/overwrite XFF), any client can spoof their IP to bypass per-IP limits and also poison the Redis keyspace with arbitrary ip:* buckets.

This should only honor XFF when you can verify the request came through a trusted proxy (or rely on Starlette/FastAPI proxy header middleware configured with known proxy IPs).
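A hedged sketch of that hardening — honor X-Forwarded-For only when the direct TCP peer is a trusted proxy. The allowlist and helper name are illustrative, not part of the PR:

```python
import ipaddress
from typing import Optional

# Hypothetical allowlist; in a real deployment this comes from configuration.
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]

def get_client_ip(peer_ip: str, xff_header: Optional[str]) -> str:
    """Use X-Forwarded-For only when the TCP peer is a trusted proxy."""
    try:
        peer = ipaddress.ip_address(peer_ip)
    except ValueError:
        return peer_ip  # unparseable peer; never trust forwarded headers
    if xff_header and any(peer in net for net in TRUSTED_PROXIES):
        # The left-most entry is the client as reported by the proxy chain.
        return xff_header.split(",")[0].strip()
    return peer_ip
```

A direct (untrusted) connection then gets rate-limited by its real socket address regardless of what headers it sends.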


Comment on src/middleware/security_middleware.py, lines +193 to +202
# Local in-memory fallback
if full_key not in self._local_cache:
    self._local_cache[full_key] = 0
# Periodically clean old keys
if now - self._last_cleanup > 300:
    self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")}
    self._last_cleanup = now

self._local_cache[full_key] += 1
return self._local_cache[full_key] <= limit

Local cache cleanup bug

In the local fallback, the periodic cleanup filters with k.startswith(f"sec_rl:{key}:"), which only retains keys for the current key and drops all other rate-limit buckets. When cleanup runs, it effectively resets limits for every other IP/fingerprint, allowing bursts until those buckets refill.

Cleanup should remove expired buckets across the whole cache (e.g., based on bucket timestamp), not wipe unrelated keys.
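One way to do that, sketched with a (count, last_seen) tuple per bucket so the periodic sweep evicts by age across the whole cache. The 300-second interval mirrors the snippet; the class itself is illustrative:

```python
import time

class LocalRateLimiter:
    """In-memory fallback storing (count, last_seen) per bucket (sketch)."""

    def __init__(self, cleanup_interval: float = 300.0):
        self._cache = {}  # full_key -> (count, last_seen)
        self._last_cleanup = 0.0
        self._cleanup_interval = cleanup_interval

    def hit(self, full_key: str, limit: int) -> bool:
        now = time.monotonic()
        if now - self._last_cleanup > self._cleanup_interval:
            # Evict buckets idle longer than the interval, regardless of key,
            # instead of wiping every bucket that doesn't match one prefix.
            cutoff = now - self._cleanup_interval
            self._cache = {k: v for k, v in self._cache.items() if v[1] >= cutoff}
            self._last_cleanup = now
        count, _ = self._cache.get(full_key, (0, now))
        count += 1
        self._cache[full_key] = (count, now)
        return count <= limit
```

Active buckets keep their counts through a sweep because their last_seen timestamps stay fresh; only genuinely idle buckets are dropped.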


Comment on src/services/redis_metrics.py, lines +156 to 158
# Execute all operations atomically in a thread to avoid blocking the event loop
await asyncio.to_thread(pipe.execute)


Threaded pipeline breaks safety

record_request() builds a Redis pipe in the event loop thread, then executes it in a worker thread via await asyncio.to_thread(pipe.execute). The underlying redis client/connection objects (and pipeline state) are not guaranteed to be thread-safe when moved across threads; this can cause intermittent failures or corrupt pipeline execution under concurrency.

If you need to avoid blocking the loop, keep all redis interactions on the same thread (either use an async redis client for metrics, or run the entire metrics write path in to_thread, including pipeline creation).
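A sketch of the second option — the whole write path, pipeline creation included, runs inside the worker thread, so no redis objects cross thread boundaries mid-flight. The function and key names are illustrative:

```python
import asyncio

async def record_request_metrics(redis_client, provider: str, latency_ms: float) -> None:
    """Run the whole pipeline (creation and execution) in one worker thread."""
    def _write() -> None:
        # All redis objects are created and used on this thread only.
        pipe = redis_client.pipeline()
        pipe.incr(f"metrics:{provider}:requests")
        pipe.lpush(f"metrics:{provider}:latency", latency_ms)
        pipe.ltrim(f"metrics:{provider}:latency", 0, 999)
        pipe.execute()
    await asyncio.to_thread(_write)
```

The closure captures only plain Python values from the event-loop side; the pipeline's connection checkout and release both happen on the worker thread.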


Comment on src/middleware/security_middleware.py, lines +185 to +188
count = await self.redis.incr(full_key)
if count == 1:
    await self.redis.expire(full_key, window * 2)
return count <= limit

Bug: The security middleware awaits synchronous Redis methods (e.g. self.redis.incr()), which are not awaitable and would raise TypeError at runtime. The ImportError for the non-existent get_async_redis_client is also silently caught by a broad except Exception in main.py, so the middleware falls back to in-memory mode without warning and Redis integration never activates.
Severity: CRITICAL

Suggested Fix

Create a get_async_redis_client function that returns a true async Redis client. Alternatively, if using the synchronous client is intended, wrap all Redis calls within the middleware (e.g., self.redis.incr) with asyncio.to_thread() to prevent TypeError exceptions. The exception handling in main.py should also be made more specific to avoid masking import errors.
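If the synchronous client is kept, the middleware's check could offload each call instead of awaiting it directly. A sketch under that assumption — the helper name is illustrative, and `window * 2` mirrors the expiry in the snippet above:

```python
import asyncio

async def check_limit(redis_client, full_key: str, limit: int, window: int) -> bool:
    """Offload each blocking call on a synchronous redis client."""
    count = await asyncio.to_thread(redis_client.incr, full_key)
    if count == 1:
        # First hit creates the key, so attach the expiry to start the window.
        await asyncio.to_thread(redis_client.expire, full_key, window * 2)
    return count <= limit
```

Each check costs two thread-pool hops in the worst case, which is why a native async Redis client is the cleaner long-term fix.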


@Jay-tech456 Jay-tech456 closed this Feb 7, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/services/pricing.py (1)

614-619: ⚠️ Potential issue | 🟠 Major

Race condition: missing _pricing_cache_lock when writing to _pricing_cache.

Step 4 (cache fallback) writes to _pricing_cache without acquiring the lock, unlike Step 3 (lines 600–605) and the synchronous get_model_pricing (lines 435–439) which both correctly use with _pricing_cache_lock. This is a data race on the shared dict.

Proposed fix
         cache_pricing = await asyncio.to_thread(_get_pricing_from_cache_fallback, model_id, candidate_ids)
         if cache_pricing:
             # Cache the fallback result
-            _pricing_cache[model_id] = {
-                "data": cache_pricing,
-                "timestamp": time.time()
-            }
+            with _pricing_cache_lock:
+                _pricing_cache[model_id] = {
+                    "data": cache_pricing,
+                    "timestamp": time.time()
+                }
             return cache_pricing
src/routes/admin.py (2)

114-119: ⚠️ Potential issue | 🟡 Minor

updated_user can be None, causing a TypeError on line 119.

get_user returns None when the user is not found. If the user is deleted between the add_credits_to_user call and the re-fetch, updated_user["credits"] will raise TypeError: 'NoneType' object is not subscriptable.

Suggested fix
         updated_user = await asyncio.to_thread(get_user, req.api_key)
+        if not updated_user:
+            raise HTTPException(status_code=404, detail="User not found after credit update")

         return {
             "status": "success",

188-209: ⚠️ Potential issue | 🟠 Major

Use asyncio.to_thread() to wrap the synchronous get_user_rate_limits call.

get_user_rate_limits (line 193) is a synchronous function that performs blocking Supabase database calls, but it's being called without await or asyncio.to_thread() in this async handler. This blocks the event loop and violates the guideline to use async/await for all I/O operations. The pattern already exists correctly in set_user_rate_limits (line 191), which wraps its synchronous operations with asyncio.to_thread().

Suggested fix
         await set_user_rate_limits(req.api_key, req.rate_limits.model_dump())

-        rate_limits = get_user_rate_limits(req.api_key)
+        rate_limits = await asyncio.to_thread(get_user_rate_limits, req.api_key)
🤖 Fix all issues with AI agents
In `@src/main.py`:
- Around line 200-209: The import of get_async_redis_client is incorrect —
replace the import and usage with the synchronous get_redis_client from the
redis_config module: change the imported symbol get_async_redis_client to
get_redis_client and call get_redis_client() to obtain redis_client before
calling app.add_middleware(SecurityMiddleware, redis_client=redis_client); keep
the existing try/except fallback that calls
app.add_middleware(SecurityMiddleware) when redis_client is None or an exception
occurs so the middleware initialization doesn't crash.

In `@src/middleware/security_middleware.py`:
- Around line 52-64: The middleware currently keeps state in-memory in __init__
via _request_log, _local_cache, _velocity_mode_until and so will diverge across
workers; refactor to make the middleware stateless by moving shared counters and
velocity state into Redis (use the redis_client parameter) or a dedicated
service class (e.g., VelocityService) and delegate all business logic
(sliding-window tracking, fallback limits, velocity-mode toggling) to that
service; add concrete type hints for redis_client, _local_cache, _request_log,
_velocity_mode_until, and _velocity_mode_triggered_count using Python 3.12+
typing syntax and ensure middleware only performs non-blocking calls to the
service (no in-process state mutations).
- Around line 193-199: The cleanup currently filters _local_cache to only keys
with the current prefix (full_key/key) which deletes counters for other clients;
change the strategy to evict stale entries by age instead: modify _local_cache
storage (or add a companion dict) so each entry records a last_seen timestamp
(e.g., store (count, last_seen) or maintain _local_cache_timestamps), update the
increment logic in the same code path to set last_seen = now when touching
full_key, and replace the cleanup block that references _last_cleanup with a
loop that removes entries whose last_seen < now - 300 (rather than keeping only
keys starting with f"sec_rl:{key}:"); keep using _last_cleanup to avoid running
the sweep too often.
- Around line 183-191: In the Redis exception handler inside the rate limiting
logic (the try/except that uses self.redis.incr(full_key) and returns count <=
limit) add a Sentry capture by importing capture_cache_error from
src.utils.auto_sentry and calling it with the caught exception,
operation="rate_limit_check", cache_type="redis", key=full_key, and details
containing {"limit": limit, "endpoint": key}; keep the existing logger.error
call and then call capture_cache_error(e, operation=..., cache_type=...,
key=full_key, details=...) before falling back to local limiting.
🧹 Nitpick comments (11)
src/routes/api_models.py (1)

116-116: get_cached_providers() is still called synchronously in this async handler.

Lines 100 and 148 were correctly wrapped with asyncio.to_thread, but get_cached_providers() on line 116 remains a synchronous blocking call in the same async handler. For consistency and to fully prevent event loop blocking, this should also be offloaded.

Proposed fix
-        openrouter_providers = get_cached_providers()
+        openrouter_providers = await asyncio.to_thread(get_cached_providers)
src/services/redis_metrics.py (1)

198-203: Other async methods still perform synchronous Redis I/O directly.

get_provider_health (line 199), get_recent_errors (line 221), get_hourly_stats (line 255), get_latency_percentiles (line 295), update_circuit_breaker (line 344), get_all_provider_health (line 362), and cleanup_old_data (lines 397/406) all call self.redis.* synchronously within async methods. While not part of this PR's changes, they'll block the event loop the same way record_request used to. Consider wrapping these in a follow-up.

src/routes/catalog.py (2)

989-1149: Individual provider fetches in get_models remain synchronous.

When gateway != "all" (or when the aggregated cache is empty), lines 992–1149 call get_cached_models(...) synchronously ~25 times. These are the same blocking calls that were wrapped with asyncio.to_thread on lines 563, 761, 782, and 952. Under load with a single-gateway request, these will still block the event loop.

Consider wrapping these in a follow-up pass for consistency—or, since they're behind a not (gateway_value == "all" and all_models_list) guard, document that the hot path (gateway=all) is covered and these are low-traffic fallbacks.


1610-1610: get_cached_providers wrapped, but get_cached_models on lines 1628 and 1649 are not.

In get_specific_model, get_cached_providers was wrapped in asyncio.to_thread (line 1610), but the two get_cached_models calls on lines 1628 and 1649 remain synchronous. This is inconsistent within the same handler.

src/routes/admin.py (6)

59-68: send_welcome_email is a synchronous call that can block the event loop.

This was not converted as part of this PR, but it sits on the same request path as the newly-async create_enhanced_user. If the email service is slow or times out, the event loop stalls.

Suggested fix
         try:
-            enhanced_notification_service.send_welcome_email(
+            await asyncio.to_thread(
+                enhanced_notification_service.send_welcome_email,
                 user_id=user_data["user_id"],
                 username=user_data["username"],
                 email=user_data["email"],
                 credits=user_data["credits"],
             )

316-318: fetch_huggingface_model is a synchronous HTTP call not offloaded to a thread.

This is an external HTTP fetch that can block the event loop, inconsistent with the other conversions in this PR.

Suggested fix
-        hf_data = fetch_huggingface_model(hugging_face_id)
+        hf_data = await asyncio.to_thread(fetch_huggingface_model, hugging_face_id)

462-470: get_trial_analytics is a synchronous DB call not offloaded.

Same pattern as the other conversions in this PR — should be wrapped for consistency. As per coding guidelines, **/*.py: "Use async/await for all I/O operations (database calls, HTTP requests, file operations) to maintain non-blocking execution and high concurrency."

Suggested fix
-        analytics = get_trial_analytics()
+        analytics = await asyncio.to_thread(get_trial_analytics)

1155-1171: get_all_transactions and get_transaction_summary (line 1216) are synchronous DB calls not offloaded.

These are blocking I/O operations in an async handler, consistent with the pattern this PR is fixing elsewhere. As per coding guidelines: "Use async/await for all I/O operations."

Suggested fix
-        transactions = get_all_transactions(
+        transactions = await asyncio.to_thread(
+            get_all_transactions,
             limit=limit,
             ...
         )

And similarly for get_transaction_summary at line 1216.


2267-2273: Redis get call is synchronous, blocking the event loop.

In get_chat_requests_summary_admin, the Redis client calls (redis_client.get, redis_client.setex) on lines 2267 and 2304 are synchronous. Since this PR's goal is to offload blocking I/O, these should also be wrapped or use an async Redis client.

Suggested fix
-                cached_data = redis_client.get(cache_key)
+                cached_data = await asyncio.to_thread(redis_client.get, cache_key)
-                redis_client.setex(
+                await asyncio.to_thread(
+                    redis_client.setex,
                     cache_key,
                     60,
                     json.dumps(response, default=str)
                 )

2279-2285: get_chat_completion_summary_by_filters is a synchronous DB call not offloaded.

Suggested fix
-        summary = get_chat_completion_summary_by_filters(
+        summary = await asyncio.to_thread(
+            get_chat_completion_summary_by_filters,
             model_id=model_id,
             ...
         )
src/routes/chat.py (1)

1073-1125: Avoid duplicate cost computation in anonymous streaming background path

calculate_cost_async is already awaited for tracing; the anonymous branch recomputes it again for metrics. Reuse the earlier cost to avoid extra pricing lookups and latency.

♻️ Suggested diff
-                cost = await calculate_cost_async(model, prompt_tokens, completion_tokens)
                 await _record_inference_metrics_and_health(
                     provider=provider,
                     model=model,
                     elapsed_seconds=elapsed,
                     prompt_tokens=prompt_tokens,
                     completion_tokens=completion_tokens,
                     cost=cost,
                     success=True,
                     error_message=None,
                 )

Comment on lines +200 to +209
# OPTIMIZED: Add security middleware first to block bots before heavy logic
from src.config.redis_config import get_async_redis_client
try:
    redis_client = get_async_redis_client()
    app.add_middleware(SecurityMiddleware, redis_client=redis_client)
    logger.info(" 🛡️ Security middleware enabled (IP tiering & fingerprinting)")
except Exception as e:
    # Fallback to in-memory limiting if redis is unavailable
    app.add_middleware(SecurityMiddleware)
    logger.warning(f" 🛡️ Security middleware enabled with LOCAL fallback (Redis error: {e})")
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the signature and implementation of get_async_redis_client
rg -n "async def get_async_redis_client|def get_async_redis_client" src/config/redis_config.py -A 10

Repository: Alpaca-Network/gatewayz-backend



🏁 Script executed:

#!/bin/bash
# Find redis_config.py and check its location
find . -name "redis_config.py" -type f 2>/dev/null

# Also search for get_async_redis_client across the codebase
rg "get_async_redis_client" -n 2>/dev/null | head -20

# List the config directory structure
ls -la src/config/ 2>/dev/null || echo "src/config/ not found"



🏁 Script executed:

#!/bin/bash
# Read the redis_config.py file
cat -n src/config/redis_config.py



🏁 Script executed:

#!/bin/bash
# Check the exact import in src/main.py around lines 200-209
sed -n '195,215p' src/main.py



🏁 Script executed:

#!/bin/bash
# Check if there's another redis_config.py in src/
cat -n src/redis_config.py 2>/dev/null || echo "File does not exist or is empty"



Fix incorrect import: get_async_redis_client does not exist

The function being imported at line 201 does not exist in src.config.redis_config. The module only provides get_redis_client() (a synchronous function returning redis.Redis | None). Change the import to:

from src.config.redis_config import get_redis_client

And update line 203:

redis_client = get_redis_client()

Left as-is, the bad import raises an ImportError at startup; since the import statement sits outside the try block, middleware initialization fails before the exception handler can execute.

🤖 Prompt for AI Agents
In `@src/main.py` around lines 200 - 209, The import of get_async_redis_client is
incorrect — replace the import and usage with the synchronous get_redis_client
from the redis_config module: change the imported symbol get_async_redis_client
to get_redis_client and call get_redis_client() to obtain redis_client before
calling app.add_middleware(SecurityMiddleware, redis_client=redis_client); keep
the existing try/except fallback that calls
app.add_middleware(SecurityMiddleware) when redis_client is None or an exception
occurs so the middleware initialization doesn't crash.

Comment on lines +52 to +64
def __init__(self, app: ASGIApp, redis_client=None):
    super().__init__(app)
    self.redis = redis_client
    # In-memory fallback if redis is missing
    self._local_cache = {}
    self._last_cleanup = time.time()

    # Global Velocity Mode tracking
    # Tracks (timestamp, is_error) tuples for sliding window analysis
    self._request_log: deque = deque(maxlen=10000)  # Cap memory usage
    self._velocity_mode_until: float = 0  # Unix timestamp when velocity mode expires
    self._velocity_mode_triggered_count: int = 0  # For metrics

⚠️ Potential issue | 🟠 Major

Stateful in‑memory tracking makes velocity mode per‑process

The middleware keeps _request_log, _local_cache, and _velocity_mode_until in memory, so "global" velocity mode and fallback limits will diverge across workers. Consider moving these counters/state into Redis (or a dedicated service) and annotating the stored fields and Redis client with concrete types.
As per coding guidelines: middleware should be stateless, non-blocking, and focused on cross-cutting concerns (logging, authentication, rate limiting, tracing); avoid business logic in middleware and delegate to services instead; use type hints throughout (function parameters, return types, class attributes), targeting Python 3.12+ syntax.

🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 52 - 64, The middleware
currently keeps state in-memory in __init__ via _request_log, _local_cache,
_velocity_mode_until and so will diverge across workers; refactor to make the
middleware stateless by moving shared counters and velocity state into Redis
(use the redis_client parameter) or a dedicated service class (e.g.,
VelocityService) and delegate all business logic (sliding-window tracking,
fallback limits, velocity-mode toggling) to that service; add concrete type
hints for redis_client, _local_cache, _request_log, _velocity_mode_until, and
_velocity_mode_triggered_count using Python 3.12+ typing syntax and ensure
middleware only performs non-blocking calls to the service (no in-process state
mutations).

Comment on lines +183 to +191
try:
    # Use Redis for distributed limiting
    count = await self.redis.incr(full_key)
    if count == 1:
        await self.redis.expire(full_key, window * 2)
    return count <= limit
except Exception as e:
    logger.error(f"Redis security limit error: {e}")
    # Fallback to local
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check if auto_sentry.py exists and inspect its contents
if [ -f "src/utils/auto_sentry.py" ]; then
  echo "=== auto_sentry.py exists ==="
  wc -l src/utils/auto_sentry.py
  echo ""
  echo "=== Function signatures in auto_sentry.py ==="
  rg -n "^def |^async def" src/utils/auto_sentry.py
else
  echo "auto_sentry.py NOT found"
fi



🏁 Script executed:

# Inspect the security_middleware.py file around the specified lines
if [ -f "src/middleware/security_middleware.py" ]; then
  echo "=== Lines 175-200 from security_middleware.py ==="
  sed -n '175,200p' src/middleware/security_middleware.py
  echo ""
  echo "=== Search for Sentry imports in security_middleware.py ==="
  rg -n "sentry|Sentry" src/middleware/security_middleware.py
else
  echo "security_middleware.py NOT found"
fi



🏁 Script executed:

# Inspect the auto_capture_errors decorator and _auto_capture_exception function
echo "=== auto_capture_errors decorator (lines 54-135) ==="
sed -n '54,135p' src/utils/auto_sentry.py



🏁 Script executed:

# Inspect _auto_capture_exception function
echo "=== _auto_capture_exception function ==="
sed -n '136,237p' src/utils/auto_sentry.py



🏁 Script executed:

# Search for how Sentry is used elsewhere in middleware and services
echo "=== Sentry usage patterns in middleware ==="
rg -n "sentry|capture" src/middleware/ -A 2 | head -40
echo ""
echo "=== Sentry usage patterns in services ==="
rg -n "capture_exception|_auto_capture_exception" src/services/ -A 2 | head -40



Add Sentry capture for Redis rate limiter failures

The Redis exception handler only logs the error. Add Sentry capture with the rate limit key and threshold as context so limiter outages are visible in monitoring. Use capture_cache_error() from src/utils/auto_sentry.py to track the cache operation, key, and exception:

Example:
except Exception as e:
    logger.error(f"Redis security limit error: {e}")
    from src.utils.auto_sentry import capture_cache_error
    capture_cache_error(
        e,
        operation="rate_limit_check",
        cache_type="redis",
        key=full_key,
        details={"limit": limit, "endpoint": key}
    )
🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 183 - 191, In the Redis
exception handler inside the rate limiting logic (the try/except that uses
self.redis.incr(full_key) and returns count <= limit) add a Sentry capture by
importing capture_cache_error from src.utils.auto_sentry and calling it with the
caught exception, operation="rate_limit_check", cache_type="redis",
key=full_key, and details containing {"limit": limit, "endpoint": key}; keep the
existing logger.error call and then call capture_cache_error(e, operation=...,
cache_type=..., key=full_key, details=...) before falling back to local
limiting.

Comment on lines +193 to +199
# Local in-memory fallback
if full_key not in self._local_cache:
    self._local_cache[full_key] = 0
# Periodically clean old keys
if now - self._last_cleanup > 300:
    self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")}
    self._last_cleanup = now
⚠️ Potential issue | 🟠 Major

Local fallback cleanup drops unrelated keys, weakening rate limits

The cleanup keeps only keys matching the current key prefix, which wipes counters for other IPs/fingerprints every cleanup cycle. That effectively resets their limits and weakens protection.

🐛 Suggested fix
-            if now - self._last_cleanup > 300:
-                self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")}
-                self._last_cleanup = now
+            if now - self._last_cleanup > 300:
+                current_bucket = bucket
+                self._local_cache = {
+                    k: v
+                    for k, v in self._local_cache.items()
+                    if k.rsplit(":", 1)[-1].isdigit()
+                    and int(k.rsplit(":", 1)[-1]) >= current_bucket - 1
+                }
+                self._last_cleanup = now
🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 193 - 199, The cleanup
currently filters _local_cache to only keys with the current prefix
(full_key/key) which deletes counters for other clients; change the strategy to
evict stale entries by age instead: modify _local_cache storage (or add a
companion dict) so each entry records a last_seen timestamp (e.g., store (count,
last_seen) or maintain _local_cache_timestamps), update the increment logic in
the same code path to set last_seen = now when touching full_key, and replace
the cleanup block that references _last_cleanup with a loop that removes entries
whose last_seen < now - 300 (rather than keeping only keys starting with
f"sec_rl:{key}:"); keep using _last_cleanup to avoid running the sweep too
often.
