Skip to content

fix: avoid global k8s Configuration singleton race in read_config_map#764

Open
lilyz-ai wants to merge 2 commits intomainfrom
fix/configmap-k8s-config-race
Open

fix: avoid global k8s Configuration singleton race in read_config_map#764
lilyz-ai wants to merge 2 commits intomainfrom
fix/configmap-k8s-config-race

Conversation

@lilyz-ai
Copy link
Collaborator

@lilyz-ai lilyz-ai commented Feb 23, 2026

Summary

  • configmap.py::read_config_map called kube_config.load_incluster_config() on every request, mutating the kubernetes_asyncio global Configuration._default singleton
  • Under concurrent async requests, a race causes CoreV1Api() to construct an ApiClient from a transiently empty global config (no host, no token) → 401 Unauthorized
  • Fix: create an explicit client.Configuration() per call, pass it to load_incluster_config/load_kube_config, and bind ApiClient to it directly — the global singleton is never touched

Test plan

  • Deploy to staging and confirm configmap reads no longer produce 401 errors under concurrent load
  • Verify endpoint creation succeeds when inference_framework_image_tag is unset (triggers _get_latest_tagread_config_map)

🤖 Generated with Claude Code

Greptile Summary

This PR fixes a concurrency race condition in read_config_map where calling kube_config.load_incluster_config() on every request mutated the global Configuration._default singleton. Under concurrent async requests, CoreV1Api() could construct an ApiClient from a transiently empty global config (no host, no token), resulting in 401 Unauthorized errors.

Key changes:

  • Creates an explicit client.Configuration() instance per call instead of relying on the mutable global singleton
  • Passes client_configuration=configuration to both load_incluster_config and load_kube_config
  • Uses async with client.ApiClient(configuration) to scope the API client lifecycle and ensure proper cleanup
  • Binds CoreV1Api to the explicit api_client rather than the global default

Note: The same global-singleton pattern still exists in k8s_endpoint_resource_delegate.py:maybe_load_kube_config(), core/docker/remote_build.py, k8s_startup_watcher/watcher.py, and entrypoints/k8s_cache.py. These may be less susceptible since some use a load-once guard (_kube_config_loaded flag), but the pattern could be worth fixing in those locations too for consistency.

Confidence Score: 5/5

  • This PR is safe to merge — it correctly isolates k8s client configuration per call, eliminating the race condition with no behavioral changes.
  • The change is small, focused, and follows the well-documented kubernetes_asyncio pattern for avoiding global state. It creates an explicit Configuration per invocation, passes it through load_incluster_config/load_kube_config, and binds ApiClient directly — the global singleton is never touched. The async context manager ensures proper resource cleanup. No new dependencies, no API changes, and the fix directly addresses the documented 401 race condition.
  • No files require special attention. The single changed file is straightforward and correct.

Important Files Changed

Filename Overview
model-engine/model_engine_server/core/configmap.py Replaces global k8s Configuration singleton mutation with per-call Configuration instances and explicit ApiClient binding, fixing race condition under concurrent async requests. Properly uses async with for ApiClient cleanup.

Sequence Diagram

sequenceDiagram
    participant Caller as Caller (e.g. _get_latest_tag)
    participant RCM as read_config_map()
    participant Cfg as client.Configuration()
    participant KubeConfig as kube_config
    participant ApiClient as client.ApiClient
    participant CoreV1 as CoreV1Api
    participant K8sAPI as K8s API Server

    Caller->>RCM: read_config_map(name, namespace)
    RCM->>Cfg: new Configuration() (per-call instance)
    RCM->>KubeConfig: load_incluster_config(client_configuration=cfg)
    KubeConfig-->>Cfg: populates host, token, certs
    RCM->>ApiClient: async with ApiClient(cfg)
    ApiClient->>CoreV1: CoreV1Api(api_client)
    CoreV1->>K8sAPI: read_namespaced_config_map(name, namespace)
    K8sAPI-->>CoreV1: config_map response
    CoreV1-->>RCM: config_map.data
    RCM-->>Caller: Dict[str, str]
    Note over RCM,ApiClient: ApiClient cleaned up via async context manager
Loading

Last reviewed commit: 0ae4260

lilyz-ai and others added 2 commits February 23, 2026 23:24
Each call to read_config_map was mutating the kubernetes_asyncio global
Configuration._default via load_incluster_config(). Concurrent async
requests could race each other, leaving the global transiently empty so
that CoreV1Api() would construct an ApiClient with no host/token → 401.

Fix by passing an explicit client_configuration object to
load_incluster_config/load_kube_config and binding ApiClient to it
directly, so no shared global state is touched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lilyz-ai lilyz-ai requested a review from dmchoiboi February 24, 2026 01:35
@lilyz-ai lilyz-ai enabled auto-merge (squash) February 24, 2026 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants