Conversation
…ath. Key improvements include: Async-ified all critical operations: Cost calculation, anonymous request handling, routing, and activity logging now run asynchronously. Offloaded blocking I/O: All synchronous database and Redis calls are now correctly offloaded to thread pools using asyncio.to_thread. Improved Performance: Implemented caching for rate limit checks and optimized Redis metrics recording.
Added missing import asyncio to fix the NameError that would occur at runtime. 2. Implemented Global Velocity Mode in security_middleware.py Added full implementation of the Global Velocity Protection feature
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
📝 WalkthroughWalkthroughThis pull request introduces comprehensive DoS protection via a new SecurityMiddleware class with tiered IP rate limiting, behavioral fingerprinting, and velocity mode activation. It integrates the middleware into the app startup with Redis-backed distributed support and in-memory fallback. Additionally, it converts synchronous database and cache operations to asynchronous execution using Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant SecurityMiddleware
participant Redis
participant App
Client->>SecurityMiddleware: HTTP Request
alt Redis Available
SecurityMiddleware->>Redis: Check IP rate limit<br/>(default: 60 RPM)
Redis-->>SecurityMiddleware: Current count
else No Redis
SecurityMiddleware->>SecurityMiddleware: Check in-memory<br/>IP limit
end
alt IP Limit Exceeded
SecurityMiddleware-->>Client: 429 Too Many Requests
else IP Limit OK
SecurityMiddleware->>SecurityMiddleware: Generate behavior<br/>fingerprint
alt Redis Available
SecurityMiddleware->>Redis: Check fingerprint limit<br/>(100 RPM)
Redis-->>SecurityMiddleware: Current count
else No Redis
SecurityMiddleware->>SecurityMiddleware: Check in-memory<br/>fingerprint limit
end
alt Fingerprint Limit Exceeded
SecurityMiddleware-->>Client: 429 Too Many Requests
else Fingerprint OK
SecurityMiddleware->>App: Forward Request
App-->>SecurityMiddleware: Response
SecurityMiddleware->>SecurityMiddleware: Record outcome<br/>& check velocity mode
SecurityMiddleware-->>Client: HTTP Response
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~65 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| # OPTIMIZED: Add security middleware first to block bots before heavy logic | ||
| from src.config.redis_config import get_async_redis_client | ||
| try: | ||
| redis_client = get_async_redis_client() |
There was a problem hiding this comment.
Broken Redis client import
create_app() imports and calls get_async_redis_client() (from src.config.redis_config import get_async_redis_client), but src/config/redis_config.py only exposes get_redis_client() (no async variant). This will raise ImportError during app startup, preventing the service from booting.
Fix by importing the correct function (or adding the async client factory in redis_config.py and updating callers accordingly).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/main.py
Line: 200:203
Comment:
**Broken Redis client import**
`create_app()` imports and calls `get_async_redis_client()` (`from src.config.redis_config import get_async_redis_client`), but `src/config/redis_config.py` only exposes `get_redis_client()` (no async variant). This will raise `ImportError` during app startup, preventing the service from booting.
Fix by importing the correct function (or adding the async client factory in `redis_config.py` and updating callers accordingly).
How can I resolve this? If you propose a fix, please make it concise.| async def _get_client_ip(self, request: Request) -> str: | ||
| """Extract client IP with support for proxies.""" | ||
| forwarded_for = request.headers.get("X-Forwarded-For") | ||
| if forwarded_for: | ||
| return forwarded_for.split(",")[0].strip() | ||
| return request.client.host if request.client else "unknown" |
There was a problem hiding this comment.
Trusts spoofable client IP
_get_client_ip() unconditionally trusts X-Forwarded-For and uses the first hop as the client IP. If this service is reachable directly (or behind a proxy that doesn’t strip/overwrite XFF), any client can spoof their IP to bypass per-IP limits and also poison the Redis keyspace with arbitrary ip:* buckets.
This should only honor XFF when you can verify the request came through a trusted proxy (or rely on Starlette/FastAPI proxy header middleware configured with known proxy IPs).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/middleware/security_middleware.py
Line: 67:72
Comment:
**Trusts spoofable client IP**
`_get_client_ip()` unconditionally trusts `X-Forwarded-For` and uses the first hop as the client IP. If this service is reachable directly (or behind a proxy that doesn’t strip/overwrite XFF), any client can spoof their IP to bypass per-IP limits and also poison the Redis keyspace with arbitrary `ip:*` buckets.
This should only honor XFF when you can verify the request came through a trusted proxy (or rely on Starlette/FastAPI proxy header middleware configured with known proxy IPs).
How can I resolve this? If you propose a fix, please make it concise.| # Local in-memory fallback | ||
| if full_key not in self._local_cache: | ||
| self._local_cache[full_key] = 0 | ||
| # Periodically clean old keys | ||
| if now - self._last_cleanup > 300: | ||
| self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")} | ||
| self._last_cleanup = now | ||
|
|
||
| self._local_cache[full_key] += 1 | ||
| return self._local_cache[full_key] <= limit |
There was a problem hiding this comment.
Local cache cleanup bug
In the local fallback, the periodic cleanup filters with k.startswith(f"sec_rl:{key}:"), which only retains keys for the current key and drops all other rate-limit buckets. When cleanup runs, it effectively resets limits for every other IP/fingerprint, allowing bursts until those buckets refill.
Cleanup should remove expired buckets across the whole cache (e.g., based on bucket timestamp), not wipe unrelated keys.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/middleware/security_middleware.py
Line: 193:202
Comment:
**Local cache cleanup bug**
In the local fallback, the periodic cleanup filters with `k.startswith(f"sec_rl:{key}:")`, which only retains keys for the *current* `key` and drops all other rate-limit buckets. When cleanup runs, it effectively resets limits for every other IP/fingerprint, allowing bursts until those buckets refill.
Cleanup should remove expired buckets across the whole cache (e.g., based on bucket timestamp), not wipe unrelated keys.
How can I resolve this? If you propose a fix, please make it concise.| # Execute all operations atomically in a thread to avoid blocking the event loop | ||
| await asyncio.to_thread(pipe.execute) | ||
|
|
There was a problem hiding this comment.
Threaded pipeline breaks safety
record_request() builds a Redis pipe in the event loop thread, then executes it in a worker thread via await asyncio.to_thread(pipe.execute). The underlying redis client/connection objects (and pipeline state) are not guaranteed to be thread-safe when moved across threads; this can cause intermittent failures or corrupt pipeline execution under concurrency.
If you need to avoid blocking the loop, keep all redis interactions on the same thread (either use an async redis client for metrics, or run the entire metrics write path in to_thread, including pipeline creation).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/services/redis_metrics.py
Line: 156:158
Comment:
**Threaded pipeline breaks safety**
`record_request()` builds a Redis `pipe` in the event loop thread, then executes it in a worker thread via `await asyncio.to_thread(pipe.execute)`. The underlying redis client/connection objects (and pipeline state) are not guaranteed to be thread-safe when moved across threads; this can cause intermittent failures or corrupt pipeline execution under concurrency.
If you need to avoid blocking the loop, keep all redis interactions on the same thread (either use an async redis client for metrics, or run the entire metrics write path in `to_thread`, including pipeline creation).
How can I resolve this? If you propose a fix, please make it concise.| count = await self.redis.incr(full_key) | ||
| if count == 1: | ||
| await self.redis.expire(full_key, window * 2) | ||
| return count <= limit |
There was a problem hiding this comment.
Bug: The security middleware attempts to await synchronous Redis methods. An ImportError for a non-existent async client function is also silently caught, preventing Redis integration.
Severity: CRITICAL
Suggested Fix
Create a get_async_redis_client function that returns a true async Redis client. Alternatively, if using the synchronous client is intended, wrap all Redis calls within the middleware (e.g., self.redis.incr) with asyncio.to_thread() to prevent TypeError exceptions. The exception handling in main.py should also be made more specific to avoid masking import errors.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/middleware/security_middleware.py#L185-L188
Potential issue: The `SecurityMiddleware` is configured to use an async Redis client,
but the `get_async_redis_client` function it tries to import from
`src/config/redis_config.py` does not exist. This `ImportError` is silently caught by a
broad `except Exception` block, causing the middleware to fall back to in-memory mode
without warning. If this import were fixed to use the available synchronous
`get_redis_client`, the application would crash at runtime. The middleware uses `await`
on synchronous methods like `self.redis.incr()`, which are not awaitable and will raise
a `TypeError`.
Did we get this right? 👍 / 👎 to inform future reviews.
| import src.services.trial_validation as trial_module | ||
| from src.services.model_transformations import detect_provider_from_model_id, transform_model_id | ||
| from src.services.pricing import calculate_cost | ||
| from src.services.pricing import calculate_cost, calculate_cost_async |
| import logging | ||
| import time | ||
| from collections import deque | ||
| from typing import Optional |
| from starlette.responses import JSONResponse | ||
| from starlette.types import ASGIApp | ||
|
|
||
| from src.config import Config |
| # Record metric for monitoring | ||
| try: | ||
| rate_limited_requests.labels(limit_type="velocity_mode_activated").inc() | ||
| except Exception: |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/services/pricing.py (1)
614-619:⚠️ Potential issue | 🟠 MajorRace condition: missing
_pricing_cache_lockwhen writing to_pricing_cache.Step 4 (cache fallback) writes to
_pricing_cachewithout acquiring the lock, unlike Step 3 (lines 600–605) and the synchronousget_model_pricing(lines 435–439) which both correctly usewith _pricing_cache_lock. This is a data race on the shared dict.Proposed fix
cache_pricing = await asyncio.to_thread(_get_pricing_from_cache_fallback, model_id, candidate_ids) if cache_pricing: # Cache the fallback result - _pricing_cache[model_id] = { - "data": cache_pricing, - "timestamp": time.time() - } + with _pricing_cache_lock: + _pricing_cache[model_id] = { + "data": cache_pricing, + "timestamp": time.time() + } return cache_pricingsrc/routes/admin.py (2)
114-119:⚠️ Potential issue | 🟡 Minor
updated_usercan beNone, causing aTypeErroron line 119.
get_userreturnsNonewhen the user is not found. If the user is deleted between theadd_credits_to_usercall and the re-fetch,updated_user["credits"]will raiseTypeError: 'NoneType' object is not subscriptable.Suggested fix
updated_user = await asyncio.to_thread(get_user, req.api_key) + if not updated_user: + raise HTTPException(status_code=404, detail="User not found after credit update") return { "status": "success",
188-209:⚠️ Potential issue | 🟠 MajorUse
asyncio.to_thread()to wrap the synchronousget_user_rate_limitscall.
get_user_rate_limits(line 193) is a synchronous function that performs blocking Supabase database calls, but it's being called withoutawaitorasyncio.to_thread()in this async handler. This blocks the event loop and violates the guideline to use async/await for all I/O operations. The pattern already exists correctly inset_user_rate_limits(line 191), which wraps its synchronous operations withasyncio.to_thread().Suggested fix
await set_user_rate_limits(req.api_key, req.rate_limits.model_dump()) - rate_limits = get_user_rate_limits(req.api_key) + rate_limits = await asyncio.to_thread(get_user_rate_limits, req.api_key)
🤖 Fix all issues with AI agents
In `@src/main.py`:
- Around line 200-209: The import of get_async_redis_client is incorrect —
replace the import and usage with the synchronous get_redis_client from the
redis_config module: change the imported symbol get_async_redis_client to
get_redis_client and call get_redis_client() to obtain redis_client before
calling app.add_middleware(SecurityMiddleware, redis_client=redis_client); keep
the existing try/except fallback that calls
app.add_middleware(SecurityMiddleware) when redis_client is None or an exception
occurs so the middleware initialization doesn't crash.
In `@src/middleware/security_middleware.py`:
- Around line 52-64: The middleware currently keeps state in-memory in __init__
via _request_log, _local_cache, _velocity_mode_until and so will diverge across
workers; refactor to make the middleware stateless by moving shared counters and
velocity state into Redis (use the redis_client parameter) or a dedicated
service class (e.g., VelocityService) and delegate all business logic
(sliding-window tracking, fallback limits, velocity-mode toggling) to that
service; add concrete type hints for redis_client, _local_cache, _request_log,
_velocity_mode_until, and _velocity_mode_triggered_count using Python 3.12+
typing syntax and ensure middleware only performs non-blocking calls to the
service (no in-process state mutations).
- Around line 193-199: The cleanup currently filters _local_cache to only keys
with the current prefix (full_key/key) which deletes counters for other clients;
change the strategy to evict stale entries by age instead: modify _local_cache
storage (or add a companion dict) so each entry records a last_seen timestamp
(e.g., store (count, last_seen) or maintain _local_cache_timestamps), update the
increment logic in the same code path to set last_seen = now when touching
full_key, and replace the cleanup block that references _last_cleanup with a
loop that removes entries whose last_seen < now - 300 (rather than keeping only
keys starting with f"sec_rl:{key}:"); keep using _last_cleanup to avoid running
the sweep too often.
- Around line 183-191: In the Redis exception handler inside the rate limiting
logic (the try/except that uses self.redis.incr(full_key) and returns count <=
limit) add a Sentry capture by importing capture_cache_error from
src.utils.auto_sentry and calling it with the caught exception,
operation="rate_limit_check", cache_type="redis", key=full_key, and details
containing {"limit": limit, "endpoint": key}; keep the existing logger.error
call and then call capture_cache_error(e, operation=..., cache_type=...,
key=full_key, details=...) before falling back to local limiting.
🧹 Nitpick comments (11)
src/routes/api_models.py (1)
116-116:get_cached_providers()is still called synchronously in this async handler.Lines 100 and 148 were correctly wrapped with
asyncio.to_thread, butget_cached_providers()on line 116 remains a synchronous blocking call in the same async handler. For consistency and to fully prevent event loop blocking, this should also be offloaded.Proposed fix
- openrouter_providers = get_cached_providers() + openrouter_providers = await asyncio.to_thread(get_cached_providers)src/services/redis_metrics.py (1)
198-203: Other async methods still perform synchronous Redis I/O directly.
get_provider_health(line 199),get_recent_errors(line 221),get_hourly_stats(line 255),get_latency_percentiles(line 295),update_circuit_breaker(line 344),get_all_provider_health(line 362), andcleanup_old_data(lines 397/406) all callself.redis.*synchronously withinasyncmethods. While not part of this PR's changes, they'll block the event loop the same wayrecord_requestused to. Consider wrapping these in a follow-up.src/routes/catalog.py (2)
989-1149: Individual provider fetches inget_modelsremain synchronous.When
gateway != "all"(or when the aggregated cache is empty), lines 992–1149 callget_cached_models(...)synchronously ~25 times. These are the same blocking calls that were wrapped withasyncio.to_threadon lines 563, 761, 782, and 952. Under load with a single-gateway request, these will still block the event loop.Consider wrapping these in a follow-up pass for consistency—or, since they're behind a
not (gateway_value == "all" and all_models_list)guard, document that the hot path (gateway=all) is covered and these are low-traffic fallbacks.
1610-1610:get_cached_providerswrapped, butget_cached_modelson lines 1628 and 1649 are not.In
get_specific_model,get_cached_providerswas wrapped inasyncio.to_thread(line 1610), but the twoget_cached_modelscalls on lines 1628 and 1649 remain synchronous. This is inconsistent within the same handler.src/routes/admin.py (6)
59-68:send_welcome_emailis a synchronous call that can block the event loop.This was not converted as part of this PR, but it sits on the same request path as the newly-async
create_enhanced_user. If the email service is slow or times out, the event loop stalls.Suggested fix
try: - enhanced_notification_service.send_welcome_email( + await asyncio.to_thread( + enhanced_notification_service.send_welcome_email, user_id=user_data["user_id"], username=user_data["username"], email=user_data["email"], credits=user_data["credits"], )
316-318:fetch_huggingface_modelis a synchronous HTTP call not offloaded to a thread.This is an external HTTP fetch that can block the event loop, inconsistent with the other conversions in this PR.
Suggested fix
- hf_data = fetch_huggingface_model(hugging_face_id) + hf_data = await asyncio.to_thread(fetch_huggingface_model, hugging_face_id)
462-470:get_trial_analyticsis a synchronous DB call not offloaded.Same pattern as the other conversions in this PR — should be wrapped for consistency. As per coding guidelines,
**/*.py: "Use async/await for all I/O operations (database calls, HTTP requests, file operations) to maintain non-blocking execution and high concurrency."Suggested fix
- analytics = get_trial_analytics() + analytics = await asyncio.to_thread(get_trial_analytics)
1155-1171:get_all_transactionsandget_transaction_summary(line 1216) are synchronous DB calls not offloaded.These are blocking I/O operations in an async handler, consistent with the pattern this PR is fixing elsewhere. As per coding guidelines: "Use async/await for all I/O operations."
Suggested fix
- transactions = get_all_transactions( + transactions = await asyncio.to_thread( + get_all_transactions, limit=limit, ... )And similarly for
get_transaction_summaryat line 1216.
2267-2273: Redisgetcall is synchronous, blocking the event loop.In
get_chat_requests_summary_admin, the Redis client calls (redis_client.get,redis_client.setex) on lines 2267 and 2304 are synchronous. Since this PR's goal is to offload blocking I/O, these should also be wrapped or use an async Redis client.Suggested fix
- cached_data = redis_client.get(cache_key) + cached_data = await asyncio.to_thread(redis_client.get, cache_key)- redis_client.setex( + await asyncio.to_thread( + redis_client.setex, cache_key, 60, json.dumps(response, default=str) )
2279-2285:get_chat_completion_summary_by_filtersis a synchronous DB call not offloaded.Suggested fix
- summary = get_chat_completion_summary_by_filters( + summary = await asyncio.to_thread( + get_chat_completion_summary_by_filters, model_id=model_id, ... )src/routes/chat.py (1)
1073-1125: Avoid duplicate cost computation in anonymous streaming background path
calculate_cost_asyncis already awaited for tracing; the anonymous branch recomputes it again for metrics. Reuse the earliercostto avoid extra pricing lookups and latency.♻️ Suggested diff
- cost = await calculate_cost_async(model, prompt_tokens, completion_tokens) await _record_inference_metrics_and_health( provider=provider, model=model, elapsed_seconds=elapsed, prompt_tokens=prompt_tokens, completion_tokens=completion_tokens, cost=cost, success=True, error_message=None, )
| # OPTIMIZED: Add security middleware first to block bots before heavy logic | ||
| from src.config.redis_config import get_async_redis_client | ||
| try: | ||
| redis_client = get_async_redis_client() | ||
| app.add_middleware(SecurityMiddleware, redis_client=redis_client) | ||
| logger.info(" 🛡️ Security middleware enabled (IP tiering & fingerprinting)") | ||
| except Exception as e: | ||
| # Fallback to in-memory limiting if redis is unavailable | ||
| app.add_middleware(SecurityMiddleware) | ||
| logger.warning(f" 🛡️ Security middleware enabled with LOCAL fallback (Redis error: {e})") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the signature and implementation of get_async_redis_client
rg -n "async def get_async_redis_client|def get_async_redis_client" src/config/redis_config.py -A 10Repository: Alpaca-Network/gatewayz-backend
Length of output: 57
🏁 Script executed:
#!/bin/bash
# Find redis_config.py and check its location
find . -name "redis_config.py" -type f 2>/dev/null
# Also search for get_async_redis_client across the codebase
rg "get_async_redis_client" -n 2>/dev/null | head -20
# List the config directory structure
ls -la src/config/ 2>/dev/null || echo "src/config/ not found"Repository: Alpaca-Network/gatewayz-backend
Length of output: 1059
🏁 Script executed:
#!/bin/bash
# Read the redis_config.py file
cat -n src/config/redis_config.pyRepository: Alpaca-Network/gatewayz-backend
Length of output: 12547
🏁 Script executed:
#!/bin/bash
# Check the exact import in src/main.py around lines 200-209
sed -n '195,215p' src/main.pyRepository: Alpaca-Network/gatewayz-backend
Length of output: 1063
🏁 Script executed:
#!/bin/bash
# Check if there's another redis_config.py in src/
cat -n src/redis_config.py 2>/dev/null || echo "File does not exist or is empty"Repository: Alpaca-Network/gatewayz-backend
Length of output: 3514
Fix incorrect import: get_async_redis_client does not exist
The function being imported at line 201 does not exist in src.config.redis_config. The module only provides get_redis_client() (a synchronous function returning redis.Redis | None). Change the import to:
from src.config.redis_config import get_redis_clientAnd update line 203:
redis_client = get_redis_client()This will raise an ImportError at runtime and cause the entire middleware initialization to fail before the exception handler can execute.
🤖 Prompt for AI Agents
In `@src/main.py` around lines 200 - 209, The import of get_async_redis_client is
incorrect — replace the import and usage with the synchronous get_redis_client
from the redis_config module: change the imported symbol get_async_redis_client
to get_redis_client and call get_redis_client() to obtain redis_client before
calling app.add_middleware(SecurityMiddleware, redis_client=redis_client); keep
the existing try/except fallback that calls
app.add_middleware(SecurityMiddleware) when redis_client is None or an exception
occurs so the middleware initialization doesn't crash.
| def __init__(self, app: ASGIApp, redis_client=None): | ||
| super().__init__(app) | ||
| self.redis = redis_client | ||
| # In-memory fallback if redis is missing | ||
| self._local_cache = {} | ||
| self._last_cleanup = time.time() | ||
|
|
||
| # Global Velocity Mode tracking | ||
| # Tracks (timestamp, is_error) tuples for sliding window analysis | ||
| self._request_log: deque = deque(maxlen=10000) # Cap memory usage | ||
| self._velocity_mode_until: float = 0 # Unix timestamp when velocity mode expires | ||
| self._velocity_mode_triggered_count: int = 0 # For metrics | ||
|
|
There was a problem hiding this comment.
Stateful in‑memory tracking makes velocity mode per‑process
The middleware keeps _request_log, _local_cache, and _velocity_mode_until in memory, so “global” velocity mode and fallback limits will diverge across workers. Consider moving these counters/state into Redis (or a dedicated service) and annotating the stored fields/redis client with concrete types.
As per coding guidelines: Middleware should be stateless, non-blocking, and focus on cross-cutting concerns (logging, authentication, rate limiting, tracing). Avoid business logic in middleware; delegate to services instead; and use type hints throughout all Python code (Function parameters, return types, class attributes) targeting Python 3.12+ syntax.
🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 52 - 64, The middleware
currently keeps state in-memory in __init__ via _request_log, _local_cache,
_velocity_mode_until and so will diverge across workers; refactor to make the
middleware stateless by moving shared counters and velocity state into Redis
(use the redis_client parameter) or a dedicated service class (e.g.,
VelocityService) and delegate all business logic (sliding-window tracking,
fallback limits, velocity-mode toggling) to that service; add concrete type
hints for redis_client, _local_cache, _request_log, _velocity_mode_until, and
_velocity_mode_triggered_count using Python 3.12+ typing syntax and ensure
middleware only performs non-blocking calls to the service (no in-process state
mutations).
| try: | ||
| # Use Redis for distributed limiting | ||
| count = await self.redis.incr(full_key) | ||
| if count == 1: | ||
| await self.redis.expire(full_key, window * 2) | ||
| return count <= limit | ||
| except Exception as e: | ||
| logger.error(f"Redis security limit error: {e}") | ||
| # Fallback to local |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if auto_sentry.py exists and inspect its contents
if [ -f "src/utils/auto_sentry.py" ]; then
echo "=== auto_sentry.py exists ==="
wc -l src/utils/auto_sentry.py
echo ""
echo "=== Function signatures in auto_sentry.py ==="
rg -n "^def |^async def" src/utils/auto_sentry.py
else
echo "auto_sentry.py NOT found"
fiRepository: Alpaca-Network/gatewayz-backend
Length of output: 786
🏁 Script executed:
# Inspect the security_middleware.py file around the specified lines
if [ -f "src/middleware/security_middleware.py" ]; then
echo "=== Lines 175-200 from security_middleware.py ==="
sed -n '175,200p' src/middleware/security_middleware.py
echo ""
echo "=== Search for Sentry imports in security_middleware.py ==="
rg -n "sentry|Sentry" src/middleware/security_middleware.py
else
echo "security_middleware.py NOT found"
fiRepository: Alpaca-Network/gatewayz-backend
Length of output: 1228
🏁 Script executed:
# Inspect the auto_capture_errors decorator and _auto_capture_exception function
echo "=== auto_capture_errors decorator (lines 54-135) ==="
sed -n '54,135p' src/utils/auto_sentry.pyRepository: Alpaca-Network/gatewayz-backend
Length of output: 3084
🏁 Script executed:
# Inspect _auto_capture_exception function
echo "=== _auto_capture_exception function ==="
sed -n '136,237p' src/utils/auto_sentry.pyRepository: Alpaca-Network/gatewayz-backend
Length of output: 3803
🏁 Script executed:
# Search for how Sentry is used elsewhere in middleware and services
echo "=== Sentry usage patterns in middleware ==="
rg -n "sentry|capture" src/middleware/ -A 2 | head -40
echo ""
echo "=== Sentry usage patterns in services ==="
rg -n "capture_exception|_auto_capture_exception" src/services/ -A 2 | head -40Repository: Alpaca-Network/gatewayz-backend
Length of output: 3508
Add Sentry capture for Redis rate limiter failures
The Redis exception handler only logs the error. Add Sentry capture with the rate limit key and threshold as context so limiter outages are visible in monitoring. Use capture_cache_error() from src/utils/auto_sentry.py to track the cache operation, key, and exception:
Example:
except Exception as e:
logger.error(f"Redis security limit error: {e}")
from src.utils.auto_sentry import capture_cache_error
capture_cache_error(
e,
operation="rate_limit_check",
cache_type="redis",
key=full_key,
details={"limit": limit, "endpoint": key}
)🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 183 - 191, In the Redis
exception handler inside the rate limiting logic (the try/except that uses
self.redis.incr(full_key) and returns count <= limit) add a Sentry capture by
importing capture_cache_error from src.utils.auto_sentry and calling it with the
caught exception, operation="rate_limit_check", cache_type="redis",
key=full_key, and details containing {"limit": limit, "endpoint": key}; keep the
existing logger.error call and then call capture_cache_error(e, operation=...,
cache_type=..., key=full_key, details=...) before falling back to local
limiting.
| # Local in-memory fallback | ||
| if full_key not in self._local_cache: | ||
| self._local_cache[full_key] = 0 | ||
| # Periodically clean old keys | ||
| if now - self._last_cleanup > 300: | ||
| self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")} | ||
| self._last_cleanup = now |
There was a problem hiding this comment.
Local fallback cleanup drops unrelated keys, weakening rate limits
The cleanup keeps only keys matching the current key prefix, which wipes counters for other IPs/fingerprints every cleanup cycle. That effectively resets their limits and weakens protection.
🐛 Suggested fix
- if now - self._last_cleanup > 300:
- self._local_cache = {k: v for k, v in self._local_cache.items() if k.startswith(f"sec_rl:{key}:")}
- self._last_cleanup = now
+ if now - self._last_cleanup > 300:
+ current_bucket = bucket
+ self._local_cache = {
+ k: v
+ for k, v in self._local_cache.items()
+ if k.rsplit(":", 1)[-1].isdigit()
+ and int(k.rsplit(":", 1)[-1]) >= current_bucket - 1
+ }
+ self._last_cleanup = now🤖 Prompt for AI Agents
In `@src/middleware/security_middleware.py` around lines 193 - 199, The cleanup
currently filters _local_cache to only keys with the current prefix
(full_key/key) which deletes counters for other clients; change the strategy to
evict stale entries by age instead: modify _local_cache storage (or add a
companion dict) so each entry records a last_seen timestamp (e.g., store (count,
last_seen) or maintain _local_cache_timestamps), update the increment logic in
the same code path to set last_seen = now when touching full_key, and replace
the cleanup block that references _last_cleanup with a loop that removes entries
whose last_seen < now - 300 (rather than keeping only keys starting with
f"sec_rl:{key}:"); keep using _last_cleanup to avoid running the sweep too
often.
This pull request hardens the gateway’s security and performance by implementing a multi-layered behavioral protection system alongside critical asynchronous optimizations. Highlights include the introduction of Global Velocity Protection to mitigate DoS and IP-rotation attacks via request fingerprinting, and the resolution of a blocking runtime error in the Redis metrics service. By offloading synchronous database and pricing operations to thread pools across the chat completion path, these changes eliminate event loop starvation and significantly reduce the occurrence of 499 timeout errors during periods of high concurrency.
Summary by CodeRabbit
Release Notes
New Features
Improvements
Greptile Overview
Greptile Summary
SecurityMiddlewarethat rate-limits by IP tiering + request fingerprinting and introduces a global “velocity mode” that tightens limits during elevated error rates.asyncio.to_threadacross admin/catalog/chat routes, and introducescalculate_cost_asyncto avoid blocking cost/pricing calls.asyncio.to_threadfor some Redis operations and fixes a missingasyncioimport.Confidence Score: 2/5
get_async_redis_clientImportError) plus two functional issues that will affect production behavior (spoofable XFF rate-limits and local-cache cleanup resetting buckets) and a likely concurrency/thread-safety issue executing a Redis pipeline across threads.Important Files Changed
get_async_redis_client()causing startup ImportError.asyncio.to_threadto reduce event-loop blocking in admin endpoints.fetch_specific_model(and related gateway model fetch) to a thread withasyncio.to_threadto avoid blocking in /api/models/detail.calculate_cost_async) and offloads more synchronous operations to threads in the chat completion paths.asyncio.to_threadfor DB/cache lookups and introducescalculate_cost_async.asyncio.to_thread, but executes a pre-built pipeline in a worker thread which may violate redis client thread-safety.Sequence Diagram
sequenceDiagram participant Client participant FastAPI as FastAPI App participant Sec as SecurityMiddleware participant Timeout as RequestTimeoutMiddleware participant Route as Chat Route (/v1/chat/completions) participant Pricing as pricing.calculate_cost_async participant Redis as Redis (rate-limit/metrics) Client->>FastAPI: HTTP request FastAPI->>Sec: dispatch(request) Sec->>Sec: _get_client_ip() / _generate_fingerprint() Sec->>Redis: INCR sec_rl:ip:* + EXPIRE Sec->>Redis: INCR sec_rl:fp:* + EXPIRE alt over limit Sec-->>Client: 429 Too Many Requests else allowed Sec->>Timeout: call_next(request) Timeout->>Route: execute handler Route->>Pricing: await calculate_cost_async(model,tokens) Pricing-->>Route: cost Route-->>Timeout: response Timeout-->>Sec: response Sec->>Sec: _record_request_outcome(status) Sec->>Sec: _check_and_activate_velocity_mode() Sec-->>Client: response end