Fix storage config for v2 dask tasks #1095

Open
katrogan wants to merge 2 commits into main from auxiliary-testing

@katrogan
Contributor

Add an AWS_WEB_IDENTITY_TOKEN_FILE branch that wires up obstore.auth.boto3.Boto3CredentialProvider.

Why these changes are required:
I observed a v2 dask task die at pod startup, before any user code ran:

  PermissionDeniedError: ... HEAD https://s3.us-east-2.amazonaws.com/
    union-cloud-dogfood-2-dogfood-fast-registration/.../fast79780202...tar.gz
    ... 403 Forbidden (empty body)

The runner pod's entrypoint calls flyte.storage.get(s3://...) to download the fast-registration tarball, but obstore couldn't authenticate.

According to Claude:

obstore's S3Store(...) constructor calls Rust object_store::aws::AmazonS3Builder::from_env(), and that builder only reads static keys: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ENDPOINT_URL.

It does not consult AWS_WEB_IDENTITY_TOKEN_FILE / AWS_ROLE_ARN (which is how IRSA works on EKS). With no static creds present, the builder falls back to InstanceCredentialProvider → IMDS → the EKS node instance role, not the pod's service-account role. The node role doesn't have S3 access to the fast-registration bucket → 403 with empty body (characteristic of an unsigned-looking request failure).

S3.get_fsspec_kwargs() only wired a credential_provider when both AWS_PROFILE and AWS_CONFIG_FILE were set (the laptop/dev-machine case). On EKS with IRSA, neither is set, so the function returned without a credential provider → obstore fell back to its default credential resolution (the node role, per the above) → 403.
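
For reviewers, here is a rough sketch of the branching described above. It is not the code in this PR: `choose_credential_source` and `build_s3_store_kwargs` are hypothetical names, and using `Boto3CredentialProvider` for the profile case as well is an assumption. The only external pieces are `obstore.auth.boto3.Boto3CredentialProvider` and the `credential_provider` keyword, both named in the description. The idea is that boto3's default credential chain understands web identity tokens, so handing it to obstore should let the pod's service-account role be used.

```python
import os
from typing import Any, Mapping


def choose_credential_source(env: Mapping[str, str]) -> str:
    """Hypothetical helper: pick a credential path from environment variables."""
    # Laptop / dev-machine case: a named profile plus a config file.
    if env.get("AWS_PROFILE") and env.get("AWS_CONFIG_FILE"):
        return "profile"
    # EKS with IRSA: the pod is given a projected web identity token file.
    if env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "web_identity"
    # Anything else: leave credential resolution to the store's defaults.
    return "default"


def build_s3_store_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Hypothetical helper: build extra kwargs for the S3 store."""
    kwargs: dict[str, Any] = {}
    if choose_credential_source(env) != "default":
        # Imported lazily so that obstore/boto3 are only required when used.
        from obstore.auth.boto3 import Boto3CredentialProvider

        # Assumption: a default boto3 session is enough in both cases. Note
        # that boto3 reads the real process environment, not the `env` mapping.
        kwargs["credential_provider"] = Boto3CredentialProvider()
    return kwargs
```

With an empty environment, `build_s3_store_kwargs({})` returns an empty dict, which matches the pre-change behaviour on EKS where no provider was wired.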

katrogan added 2 commits May 21, 2026 20:23
Signed-off-by: Katrina Rogan <katroganGH@gmail.com>
Signed-off-by: Katrina Rogan <katroganGH@gmail.com>
@katrogan katrogan marked this pull request as ready for review May 21, 2026 18:54