-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: self healing skyvern api key #3614
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: self healing skyvern api key #3614
Conversation
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 8a8240e in 2 minutes and 25 seconds. Click for details.
- Reviewed
619
lines of code in9
files - Skipped
0
files when reviewing. - Skipped posting
9
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. skyvern-frontend/src/App.tsx:29
- Draft comment:
Avoid hardcoding sensitive keys such as the PostHog API key; use environment variables for configuration. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx:75
- Draft comment:
The conditional block returning null when no status, statusMessage, or errorMessage is present may be overly redundant. Consider refining the loading state handling (e.g., showing a spinner) or consolidating the conditions for clarity. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The code could be simplified by combining the conditions, but the current structure is clear and follows a logical flow: first check if there's nothing to show, then check if we're still loading. While a loading spinner could be added, that's more of a feature request than a clear code issue. The current behavior of showing nothing while loading seems intentional. The comment does identify a real opportunity for code simplification. The suggestion about loading spinners could be valuable for UX. However, the current code is not problematic - it's readable and works as intended. The loading spinner suggestion is more of a feature request than a code quality issue. The comment should be deleted as it doesn't identify a clear issue that needs fixing - the code works fine as is and the suggestions are more about optional enhancements.
3. skyvern/forge/api_app.py:74
- Draft comment:
Internal dev endpoints are conditionally included for local environments. Verify that these routes remain inaccessible outside local mode to avoid exposure of internal repair and status diagnostics. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
The comment is asking the PR author to verify that certain routes remain inaccessible outside of local mode. This falls under the category of asking the author to ensure behavior is intended, which is against the rules. Therefore, this comment should be removed.
4. skyvern/forge/sdk/routes/internal_auth.py:44
- Draft comment:
Internal authentication routes correctly enforce local access via _require_local_access. Ensure the IP address validation logic (_is_local_request) continues to cover all expected scenarios (loopback and private ranges). - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
This comment is asking the PR author to ensure that the IP address validation logic continues to cover all expected scenarios. This falls under the category of asking the author to double-check or ensure behavior, which is against the rules.
5. skyvern/forge/sdk/services/local_org_auth_token_service.py:21
- Draft comment:
Consider adding error handling around file operations in _write_env to gracefully handle potential I/O errors (e.g., permission issues or read-only file systems). - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The suggestion is valid from a defensive programming perspective - file operations can fail. However, this is a local development utility (note the "local" in filename and SKYVERN_LOCAL_ORG constant). For local dev tools, extensive error handling may be overkill since errors will be immediately obvious to the developer. The current logging also helps with debugging. I might be underestimating the impact of file operation failures - even in local dev, graceful error handling could save developer time debugging issues. While true, the simplicity of the current implementation makes issues very transparent, and adding error handling would add complexity for limited benefit in a dev-only context. The comment should be removed as it suggests adding complexity that isn't clearly justified for a local development utility.
6. skyvern/library/skyvern.py:61
- Draft comment:
Calling migrate_db() during init may lead to unintended side effects if the constructor is invoked multiple times. Consider separating migration logic from initialization. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. skyvern/library/skyvern.py:191
- Draft comment:
Using asyncio.create_task for background tasks (e.g., in create_task_v1) may fail silently if errors occur. Consider attaching error handling or logging to these background tasks. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. tests/unit_tests/test_internal_auth.py:1
- Draft comment:
Test cases for _is_local_request cover various scenarios (public IP, loopback, private IP, and missing client). The tests look clear and complete. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
This comment is purely informative, as it only states that the test cases cover various scenarios and that they look clear and complete. It does not provide any actionable feedback or suggestions for improvement.
9. skyvern-frontend/src/App.tsx:34
- Draft comment:
Typo suggestion: Consider using 'SelfHealingApiKeyBanner' instead of 'SelfHealApiKeyBanner' for consistency with the PR title 'self healing skyvern api key'. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This is a purely stylistic naming suggestion. The current name is clear and understandable. The suggestion is based only on matching PR title wording, which is not a strong technical reason. Component naming is a minor issue that doesn't affect functionality. This feels like unnecessary nitpicking. Maybe consistent naming across PR title and code is important for maintainability? The current name could be seen as slightly less descriptive. PR titles are temporary and often use different wording than code. The current component name is clear and follows standard React naming conventions. This comment should be deleted as it's a minor stylistic suggestion based only on PR title matching, not on any technical merit or code quality concerns.
Workflow ID: wflow_zfIvtrAm8m26zYzq
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
) | ||
|
||
|
||
@cached(cache=TTLCache(maxsize=CACHE_SIZE, ttl=AUTHENTICATION_TTL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caching API key validation for one hour (TTL) may allow a revoked key to remain active longer than desired. Consider reducing the TTL or implementing a way to invalidate the cache on key changes.
@cached(cache=TTLCache(maxsize=CACHE_SIZE, ttl=AUTHENTICATION_TTL)) | |
@cached(cache=TTLCache(maxsize=CACHE_SIZE, ttl=300)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (6)
skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx (3)
13-53
: Consider extracting getCopy to a separate constants file.While the implementation is correct, this helper function could be extracted to a constants or config file for better maintainability and testability.
Apply this refactor:
Create a new file
skyvern-frontend/src/constants/authDiagnostics.ts
:export type BannerStatus = Exclude<AuthStatusValue, "ok"> | "error"; export function getCopy(status: BannerStatus): { title: string; description: string } { // ... existing implementation }Then import in the banner component:
+import { BannerStatus, getCopy } from "@/constants/authDiagnostics"; -type BannerStatus = Exclude<AuthStatusValue, "ok"> | "error"; - -function getCopy(status: BannerStatus): { title: string; description: string } { - // ... remove implementation -}
66-80
: Simplify the early return logic.Lines 75-79 can be simplified—both branches return
null
, so they can be collapsed.Apply this diff:
- if (!bannerStatus && !statusMessage && !errorMessage) { - if (isLoading) { - return null; - } - return null; - } + if (!bannerStatus && !statusMessage && !errorMessage) { + return null; + }
85-110
: Differentiate error handling and add retry logic in handleRepair.
- Global retries are disabled (
retry: false
inQueryClient.ts
), sohandleRepair
never retries on network failures.- Introduce retry policies (via react-query’s
retry
option or axios interceptors) and display distinct messages for timeouts vs. server errors.tests/unit_tests/test_internal_auth.py (2)
6-15
: Consider adding test_query_string and server fields to the scope.While the minimal scope works for
_is_local_request
, addingquery_string
(empty bytes) andserver
fields would make the mock more realistic and prevent potential issues if the tested function is extended later.Apply this diff:
def _make_request(host: str | None) -> Request: scope = { "type": "http", "client": (host, 12345) if host else None, "headers": [], "method": "GET", "path": "/", "scheme": "http", + "query_string": b"", + "server": ("localhost", 8000), } return Request(scope)
18-35
: Consider adding IPv6 test cases.The current tests cover IPv4 addresses. Consider adding test cases for IPv6 loopback (
::1
) and private addresses to ensure complete coverage.Add these test cases:
def test_is_local_request_accepts_ipv6_loopback() -> None: request = _make_request("::1") assert _is_local_request(request) is True def test_is_local_request_accepts_ipv6_private() -> None: request = _make_request("fd00::1") # IPv6 private address assert _is_local_request(request) is Trueskyvern-frontend/src/hooks/useAuthDiagnostics.ts (1)
21-27
: Consider adding error handling to fetchDiagnostics.The
fetchDiagnostics
function doesn't explicitly handle errors. While React Query will catch thrown errors, consider adding explicit error handling or logging for better debugging.Apply this diff:
async function fetchDiagnostics(): Promise<AuthDiagnosticsResponse> { - const client = await getClient(null); - const response = await client.get<AuthDiagnosticsResponse>( - "/internal/auth/status", - ); - return response.data; + try { + const client = await getClient(null); + const response = await client.get<AuthDiagnosticsResponse>( + "/internal/auth/status", + ); + return response.data; + } catch (error) { + console.error("Failed to fetch auth diagnostics:", error); + throw error; + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
skyvern-frontend/src/App.tsx
(2 hunks)skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx
(1 hunks)skyvern-frontend/src/hooks/useAuthDiagnostics.ts
(1 hunks)skyvern/forge/api_app.py
(2 hunks)skyvern/forge/sdk/routes/internal_auth.py
(1 hunks)skyvern/forge/sdk/services/local_org_auth_token_service.py
(1 hunks)skyvern/forge/sdk/services/org_auth_service.py
(5 hunks)skyvern/library/skyvern.py
(2 hunks)tests/unit_tests/test_internal_auth.py
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
skyvern-frontend/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
skyvern-frontend/**/*.{ts,tsx}
: TypeScript in skyvern-frontend must be linted with ESLint and formatted with Prettier
Enforce a maximum line length of 120 characters in frontend TypeScript files
Files:
skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx
skyvern-frontend/src/App.tsx
skyvern-frontend/src/hooks/useAuthDiagnostics.ts
{skyvern,integrations,alembic,scripts}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
{skyvern,integrations,alembic,scripts}/**/*.py
: Use Python 3.11+ features and add type hints throughout the codebase
Follow PEP 8 with a maximum line length of 100 characters
Use absolute imports for all Python modules
Document all public functions and classes with Google-style docstrings
Use snake_case for variables and functions, and PascalCase for classes
Prefer async/await over callbacks in asynchronous code
Use asyncio for concurrency
Always handle exceptions in async code
Use context managers for resource cleanup
Use specific exception classes
Include meaningful error messages when raising or logging exceptions
Log errors with appropriate severity levels
Never expose sensitive information in error messages
Files:
skyvern/forge/api_app.py
skyvern/forge/sdk/services/local_org_auth_token_service.py
skyvern/forge/sdk/routes/internal_auth.py
skyvern/library/skyvern.py
skyvern/forge/sdk/services/org_auth_service.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: Python code must be linted and formatted with Ruff
Use type hints throughout Python code
Prefer async/await for asynchronous Python code
Enforce a maximum line length of 120 characters in Python files
Files:
skyvern/forge/api_app.py
skyvern/forge/sdk/services/local_org_auth_token_service.py
skyvern/forge/sdk/routes/internal_auth.py
tests/unit_tests/test_internal_auth.py
skyvern/library/skyvern.py
skyvern/forge/sdk/services/org_auth_service.py
skyvern/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Type-check Python code in the skyvern/ package with mypy
Files:
skyvern/forge/api_app.py
skyvern/forge/sdk/services/local_org_auth_token_service.py
skyvern/forge/sdk/routes/internal_auth.py
skyvern/library/skyvern.py
skyvern/forge/sdk/services/org_auth_service.py
tests/unit_tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place unit tests under tests/unit_tests/
Files:
tests/unit_tests/test_internal_auth.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use pytest with async support for Python tests
Files:
tests/unit_tests/test_internal_auth.py
🧬 Code graph analysis (6)
skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx (2)
skyvern-frontend/src/hooks/useAuthDiagnostics.ts (2)
AuthStatusValue
(6-12)useAuthDiagnostics
(39-39)skyvern-frontend/src/api/AxiosClient.ts (1)
getClient
(111-111)
skyvern/forge/sdk/services/local_org_auth_token_service.py (4)
skyvern/forge/sdk/schemas/organizations.py (1)
Organization
(8-21)skyvern/forge/sdk/db/enums.py (1)
OrganizationAuthTokenType
(4-7)skyvern/forge/sdk/db/client.py (4)
get_organization_by_domain
(819-823)create_organization
(825-845)invalidate_org_auth_tokens
(1010-1031)create_org_auth_token
(974-1008)skyvern/forge/sdk/core/security.py (1)
create_access_token
(11-27)
skyvern/forge/sdk/routes/internal_auth.py (2)
skyvern/forge/sdk/services/local_org_auth_token_service.py (2)
fingerprint_token
(29-30)regenerate_local_api_key
(47-74)skyvern/forge/sdk/services/org_auth_service.py (1)
resolve_org_from_api_key
(147-198)
tests/unit_tests/test_internal_auth.py (1)
skyvern/forge/sdk/routes/internal_auth.py (1)
_is_local_request
(33-41)
skyvern/library/skyvern.py (1)
skyvern/forge/sdk/db/client.py (2)
get_organization_by_domain
(819-823)create_organization
(825-845)
skyvern/forge/sdk/services/org_auth_service.py (4)
skyvern/forge/sdk/schemas/organizations.py (2)
Organization
(8-21)OrganizationAuthToken
(33-34)skyvern/forge/sdk/db/enums.py (1)
OrganizationAuthTokenType
(4-7)skyvern/forge/sdk/models.py (1)
TokenPayload
(138-140)skyvern/forge/sdk/core/skyvern_context.py (1)
current
(70-77)
🪛 Ruff (0.13.3)
skyvern/forge/sdk/routes/internal_auth.py
53-53: Possible hardcoded password assigned to: "token_candidate"
(S105)
71-71: Use logging.exception
instead of logging.error
Replace with exception
(TRY400)
75-75: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (12)
skyvern-frontend/src/components/SelfHealApiKeyBanner.tsx (2)
11-11
: Type definition is clear and correct.The
BannerStatus
type correctly excludes "ok" fromAuthStatusValue
and adds "error" for query failures.
112-152
: LGTM! UI rendering and conditional display logic are well-structured.The component correctly:
- Hides in production (checked at line 61-64)
- Conditionally renders repair button vs error message
- Displays status/error messages appropriately
- Uses accessible Alert components
skyvern-frontend/src/App.tsx (1)
34-34
: LGTM! Banner placement is appropriate.Rendering
SelfHealApiKeyBanner
before theRouterProvider
ensures it displays regardless of the current route, which is correct for a global diagnostic banner.skyvern/forge/sdk/services/local_org_auth_token_service.py (3)
29-30
: LGTM! Token fingerprinting is appropriate.The
fingerprint_token
function provides a safe way to log tokens without exposing sensitive data. The fallback for short tokens prevents accidental leakage.
33-44
: LGTM! Local org creation follows best practices.The
ensure_local_org
function correctly:
- Checks if the org exists before creating
- Uses predefined constants for domain and name
- Sets reasonable defaults for limits
47-74
: Validate local API key expiration duration
API_KEY_LIFETIME
is set to 5200 weeks (~100 years). Confirm this is intentional for local development; consider reducing it to a shorter, more secure timeframe.skyvern/forge/api_app.py (1)
73-77
: LGTM! Local-only route registration is correct.The conditional registration of
internal_auth.router
only in local environments is appropriate for development-only diagnostics. Registering on all API version prefixes ensures consistent access.skyvern/library/skyvern.py (1)
99-106
: LGTM! Constant usage improves maintainability.Replacing hard-coded strings with
SKYVERN_LOCAL_DOMAIN
andSKYVERN_LOCAL_ORG
centralizes configuration and reduces the risk of typos or inconsistencies.skyvern-frontend/src/hooks/useAuthDiagnostics.ts (1)
29-37
: LGTM! Hook configuration is appropriate for dev diagnostics.The hook configuration is correct:
- Enabled only in development
- No retry (diagnostics should reflect current state)
- No refetch on window focus (avoids unnecessary requests)
skyvern/forge/sdk/services/org_auth_service.py (3)
31-36
: LGTM! Dataclass encapsulates validation results well.The
ApiKeyValidationResult
dataclass appropriately groups the organization, payload, and token, making the validation flow clearer and more maintainable.
201-211
: LGTM! Refactored function maintains existing behavior.The
_get_current_org_cached
function correctly delegates toresolve_org_from_api_key
and maintains the context-setting logic. The caching decorator remains in place, preserving performance characteristics.
147-198
: Confirm caching remains effective
All calls toresolve_org_from_api_key
are funneled through the decorated_get_current_org_cached
, and no direct invocations bypass the cache.
class AuthStatus(str, Enum): | ||
missing_env = "missing_env" | ||
invalid_format = "invalid_format" | ||
invalid = "invalid" | ||
expired = "expired" | ||
not_found = "not_found" | ||
ok = "ok" | ||
|
||
|
||
class DiagnosticsResult(NamedTuple): | ||
status: AuthStatus | ||
detail: str | None | ||
validation: Any | None | ||
token: str | None | ||
|
||
|
||
def _is_local_request(request: Request) -> bool: | ||
host = request.client.host if request.client else None | ||
if not host: | ||
return False | ||
try: | ||
addr = ipaddress.ip_address(host) | ||
except ValueError: | ||
return False | ||
return addr.is_loopback or addr.is_private | ||
|
||
|
||
def _require_local_access(request: Request) -> None: | ||
if settings.ENV != "local": | ||
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint only available in local env") | ||
if not _is_local_request(request): | ||
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint requires localhost access") | ||
|
||
|
||
async def _evaluate_local_api_key(token: str) -> DiagnosticsResult: | ||
token_candidate = token.strip() | ||
if not token_candidate or token_candidate == "YOUR_API_KEY": | ||
return DiagnosticsResult(status=AuthStatus.missing_env, detail=None, validation=None, token=None) | ||
|
||
try: | ||
validation = await resolve_org_from_api_key(token_candidate, app.DATABASE) | ||
except HTTPException as exc: | ||
if exc.status_code == status.HTTP_404_NOT_FOUND: | ||
return DiagnosticsResult(status=AuthStatus.not_found, detail=None, token=None, validation=None) | ||
|
||
detail_text = exc.detail if isinstance(exc.detail, str) else None | ||
if exc.status_code == status.HTTP_403_FORBIDDEN: | ||
status_value = AuthStatus.invalid | ||
if detail_text and "expired" in detail_text.lower(): | ||
status_value = AuthStatus.expired | ||
elif detail_text and "validate" in detail_text.lower(): | ||
status_value = AuthStatus.invalid_format | ||
return DiagnosticsResult(status=status_value, detail=detail_text, token=None, validation=None) | ||
|
||
LOG.error("Unexpected error while diagnosing API key", status_code=exc.status_code, detail=detail_text) | ||
raise | ||
except Exception: | ||
LOG.error("Unexpected exception while diagnosing API key", exc_info=True) | ||
raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "Unable to diagnose API key") | ||
|
||
return DiagnosticsResult(status=AuthStatus.ok, detail=None, validation=validation, token=token_candidate) | ||
|
||
|
||
def _emit_diagnostics(result: DiagnosticsResult) -> dict[str, object]: | ||
status_value = result.status.value | ||
|
||
if result.status is AuthStatus.ok and result.validation and result.token: | ||
fingerprint = fingerprint_token(result.token) | ||
LOG.info( | ||
"Local auth diagnostics", | ||
status=status_value, | ||
organization_id=result.validation.organization.organization_id, | ||
fingerprint=fingerprint, | ||
expires_at=result.validation.payload.exp, | ||
) | ||
return { | ||
"status": status_value, | ||
"organization_id": result.validation.organization.organization_id, | ||
"fingerprint": fingerprint, | ||
"expires_at": result.validation.payload.exp, | ||
} | ||
|
||
log_kwargs: dict[str, object] = {"status": status_value} | ||
if result.detail: | ||
log_kwargs["detail"] = result.detail | ||
|
||
LOG.warning("Local auth diagnostics", **log_kwargs) | ||
|
||
return {"status": status_value} | ||
|
||
|
||
@router.post("/repair") | ||
async def repair_api_key(request: Request) -> dict[str, object]: | ||
_require_local_access(request) | ||
|
||
token, organization_id = await regenerate_local_api_key() | ||
|
||
return { | ||
"status": AuthStatus.ok.value, | ||
"organization_id": organization_id, | ||
"fingerprint": fingerprint_token(token), | ||
} | ||
|
||
|
||
@router.get("/status") | ||
async def auth_status(request: Request) -> dict[str, object]: | ||
_require_local_access(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Add Google-style docstrings for public API.
Project guidelines require Google-style docstrings on public classes and endpoints. Please add them to AuthStatus
, DiagnosticsResult
, repair_api_key
, and auth_status
so future contributors see intent and response semantics.
🧰 Tools
🪛 Ruff (0.13.3)
53-53: Possible hardcoded password assigned to: "token_candidate"
(S105)
71-71: Use logging.exception
instead of logging.error
Replace with exception
(TRY400)
75-75: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In skyvern/forge/sdk/routes/internal_auth.py around lines 17 to 123, add
Google-style docstrings for the public API items requested: AuthStatus,
DiagnosticsResult, repair_api_key, and auth_status. For AuthStatus and
DiagnosticsResult add a brief one-line description and for DiagnosticsResult
document attributes (status, detail, validation, token). For repair_api_key and
auth_status add a one-line summary, Args (request: Request), Returns describing
the dict keys and types (status, organization_id, fingerprint and any other
keys), and any Raises (HTTPException for access control), placing each docstring
immediately below the class or function declaration and following Google
docstring conventions.
def _is_local_request(request: Request) -> bool: | ||
host = request.client.host if request.client else None | ||
if not host: | ||
return False | ||
try: | ||
addr = ipaddress.ip_address(host) | ||
except ValueError: | ||
return False | ||
return addr.is_loopback or addr.is_private | ||
|
||
|
||
def _require_local_access(request: Request) -> None: | ||
if settings.ENV != "local": | ||
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint only available in local env") | ||
if not _is_local_request(request): | ||
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint requires localhost access") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tighten local-access guard to loopback-only.
_is_local_request
currently accepts any private address (addr.is_private
). That includes the entire LAN (e.g., 192.168.0.15, 10.0.0.2), so anyone on the same network can hit /internal/auth/*
whenever the dev server binds to 0.0.0.0. Because these endpoints regenerate credentials, that exposure violates the “localhost only” guarantee and opens a lateral-movement vector. Limit access strictly to loopback addresses (IPv4 127.0.0.0/8 and IPv6 ::1) and keep the error message accurate. For example:
- return addr.is_loopback or addr.is_private
+ return addr.is_loopback
If you need to permit additional cases, please gate them intentionally (e.g., explicit allowlist), not via the broad is_private
.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _is_local_request(request: Request) -> bool: | |
host = request.client.host if request.client else None | |
if not host: | |
return False | |
try: | |
addr = ipaddress.ip_address(host) | |
except ValueError: | |
return False | |
return addr.is_loopback or addr.is_private | |
def _require_local_access(request: Request) -> None: | |
if settings.ENV != "local": | |
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint only available in local env") | |
if not _is_local_request(request): | |
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint requires localhost access") | |
def _is_local_request(request: Request) -> bool: | |
host = request.client.host if request.client else None | |
if not host: | |
return False | |
try: | |
addr = ipaddress.ip_address(host) | |
except ValueError: | |
return False | |
return addr.is_loopback | |
def _require_local_access(request: Request) -> None: | |
if settings.ENV != "local": | |
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint only available in local env") | |
if not _is_local_request(request): | |
raise HTTPException(status.HTTP_403_FORBIDDEN, "Endpoint requires localhost access") |
🤖 Prompt for AI Agents
In skyvern/forge/sdk/routes/internal_auth.py around lines 33 to 48,
_is_local_request currently treats any private IP as local (addr.is_private)
which allows LAN hosts access; change the logic to accept only loopback
addresses (addr.is_loopback) and remove the is_private check so only 127.0.0.0/8
and ::1 are allowed, and update the _require_local_access error message to
reflect "loopback/localhost only" (or add an explicit allowlist gated by a
config value if you need additional non-loopback exceptions).
def _write_env(path: Path, key: str, value: str) -> None: | ||
path.parent.mkdir(parents=True, exist_ok=True) | ||
if not path.exists(): | ||
path.touch() | ||
set_key(str(path), key, value) | ||
LOG.info(".env written", path=(str(path)), key=key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify _write_env handles concurrent writes safely.
The _write_env
function uses set_key
from python-dotenv, which may not be atomic. If multiple processes or the frontend file watcher trigger simultaneous writes, corruption or race conditions could occur.
Run the following script to check if set_key
operations elsewhere in the codebase use locking or other concurrency protection:
🏁 Script executed:
#!/bin/bash
# Search for other uses of set_key and check for locking patterns
rg -nP --type=py 'set_key|dotenv|\.env' --no-heading | head -50
Length of output: 4495
Add file locking around .env
writes in _write_env
Wrap the set_key
call in a file-based lock (e.g., via fcntl
or the filelock
library) in skyvern/forge/sdk/services/local_org_auth_token_service.py:21–26 to ensure atomic updates and prevent race conditions.
🤖 Prompt for AI Agents
In skyvern/forge/sdk/services/local_org_auth_token_service.py around lines 21 to
26, the call to set_key writes to the .env file without any file locking which
can cause race conditions; wrap the set_key call in a file-based lock (for
example using the filelock.FileLock or an fcntl-based lock) created in the same
directory (e.g., path.with_suffix(path.suffix + ".lock") or path.parent /
".env.lock"), acquire the lock with a reasonable timeout via a context manager,
perform set_key while the lock is held, and ensure the lock is always released
(use try/finally or the context manager) and propagate or log errors on
timeout/failure so concurrent processes don’t corrupt the .env file.
_write_env(ROOT_ENV_PATH, "SKYVERN_API_KEY", api_key) | ||
_write_env(FRONTEND_ENV_PATH, "VITE_SKYVERN_API_KEY", api_key) | ||
settings.SKYVERN_API_KEY = api_key | ||
os.environ["SKYVERN_API_KEY"] = api_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mutating settings and os.environ may not propagate to all contexts.
Lines 64-67 write to .env
files and update settings.SKYVERN_API_KEY
and os.environ
. However, these mutations may not propagate to already-running processes or threads. Consider documenting that a restart may be required or implementing a more robust configuration reload mechanism.
Document this behavior in the function docstring:
async def regenerate_local_api_key() -> tuple[str, str]:
- """Create a fresh API key for the local organization and persist it to env files."""
+ """
+ Create a fresh API key for the local organization and persist it to env files.
+
+ Note: The API key is written to .env files and updated in settings and os.environ.
+ Some contexts may require a process restart to pick up the new key.
+ """
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
_write_env(ROOT_ENV_PATH, "SKYVERN_API_KEY", api_key) | |
_write_env(FRONTEND_ENV_PATH, "VITE_SKYVERN_API_KEY", api_key) | |
settings.SKYVERN_API_KEY = api_key | |
os.environ["SKYVERN_API_KEY"] = api_key | |
async def regenerate_local_api_key() -> tuple[str, str]: | |
""" | |
Create a fresh API key for the local organization and persist it to env files. | |
Note: The API key is written to .env files and updated in settings and os.environ. | |
Some contexts may require a process restart to pick up the new key. | |
""" | |
_write_env(ROOT_ENV_PATH, "SKYVERN_API_KEY", api_key) | |
_write_env(FRONTEND_ENV_PATH, "VITE_SKYVERN_API_KEY", api_key) | |
settings.SKYVERN_API_KEY = api_key | |
os.environ["SKYVERN_API_KEY"] = api_key |
🤖 Prompt for AI Agents
In skyvern/forge/sdk/services/local_org_auth_token_service.py around lines 64 to
67, the function mutates .env files and updates settings.SKYVERN_API_KEY and
os.environ which may not propagate to already-running processes or threads;
update the function docstring to state this behavior clearly and instruct
callers that a restart of the process (or an explicit configuration reload) may
be required for changes to take effect, and optionally mention implementing a
more robust reload mechanism if live update is needed.
SKYVERN_LOCAL_DOMAIN = "skyvern.local" | ||
|
||
|
||
def _write_env(path: Path, key: str, value: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be careful merging this so we don't allow this in our cloud instance by accident!
LOG = structlog.get_logger() | ||
PROJECT_ROOT = Path(__file__).resolve().parents[4] | ||
ROOT_ENV_PATH = PROJECT_ROOT / ".env" | ||
FRONTEND_ENV_PATH = PROJECT_ROOT / "skyvern-frontend" / ".env" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this work for pip installed skyvern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the file writing would, but this change relies on the frontend dev server behavior hot reloading once the env file is modified, but you'd probably be doing skyvern run ui
or something like that when using pip installed skyvern, so that wouldn't work. i'd have to think about how to support that usecase...
|
||
|
||
def _require_local_access(request: Request) -> None: | ||
if settings.ENV != "local": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
from skyvern.forge.sdk.services.local_org_auth_token_service import fingerprint_token, regenerate_local_api_key | ||
from skyvern.forge.sdk.services.org_auth_service import resolve_org_from_api_key | ||
|
||
router = APIRouter(prefix="/internal/auth", tags=["internal"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you hide this from localhost:8000/docs by default?
Adds a dev-only
SelfHealApiKeyBanner
that monitors the new/internal/auth/status
endpoint and, when the key is missing/invalid/expired/misconfigured, surfaces the error + a single-click repair button that mints a new local key, rewrites both .env files, and hides itself on success.🔧 This PR introduces a self-healing API key system for Skyvern's local development environment that automatically detects and repairs authentication issues. The implementation adds a diagnostic banner in the frontend that can detect various API key problems (missing, expired, invalid format, etc.) and provides a one-click repair mechanism that regenerates the key and updates environment files automatically.
🔍 Detailed Analysis
Key Changes
SelfHealApiKeyBanner
that displays authentication status and provides repair functionality in development mode onlyuseAuthDiagnostics
hook and/internal/auth/status
endpoint to check API key validity with detailed error categorization/internal/auth/repair
endpoint that regenerates API keys and automatically updates both root and frontend.env
filesresolve_org_from_api_key
functionTechnical Implementation
Impact
Created with Palmier
Important
Add self-healing mechanism for Skyvern API key with frontend diagnostics and backend regeneration support for local development.
SelfHealApiKeyBanner
component inApp.tsx
to display API key status and regeneration option.SelfHealApiKeyBanner
inSelfHealApiKeyBanner.tsx
to handle various API key issues and provide a regeneration button.useAuthDiagnostics
hook inuseAuthDiagnostics.ts
to fetch API key status.internal_auth.py
to handle API key diagnostics and regeneration endpoints.regenerate_local_api_key()
inlocal_org_auth_token_service.py
to create and store a new API key.api_app.py
to includeinternal_auth
router for local development.resolve_org_from_api_key()
inorg_auth_service.py
to validate API keys.test_internal_auth.py
to verify local request handling.This description was created by
for 8a8240e. You can customize this summary. It will automatically update as commits are pushed.