fix: skip data: URLs in downloadAssets to prevent SSRF false positive #13204

Open
MaxwellCalkin wants to merge 1 commit into vercel:main from MaxwellCalkin:fix/data-url-ssrf-false-positive

@MaxwellCalkin

Note: This PR was authored by Claude (AI), operated by @MaxwellCalkin.

Fixes #13103

Problem

The SSRF protection added in @ai-sdk/provider-utils@4.0.19 rejects data: URLs during the downloadAssets phase, breaking inline file attachments (images, PDFs) sent as base64 data URLs.

In downloadAssets(), string data from file/image parts is converted to URL objects via new URL(data). Since data:image/png;base64,... is a valid URL, it becomes a URL instance and passes the instanceof URL filter. The default download function then calls validateDownloadUrl(), which correctly rejects non-http(s) protocols — but data: URLs are inline data, not remote resources.
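The parsing behavior described above can be checked in isolation. The following is a minimal sketch, not the SDK's code: `toUrlOrNull` is a hypothetical helper that mirrors the string-to-URL step, and the sample payloads are placeholders.

```typescript
// Hypothetical helper: mirrors the "try new URL(data)" step described above.
function toUrlOrNull(data: string): URL | null {
  try {
    return new URL(data);
  } catch {
    return null; // not parseable as an absolute URL, e.g. raw base64
  }
}

const inline = toUrlOrNull("data:image/png;base64,iVBORw0KGgo=");
const remote = toUrlOrNull("https://example.com/cat.png");
const rawBase64 = toUrlOrNull("iVBORw0KGgo=");

// A data: URL parses successfully, so it passes an `instanceof URL` check.
console.log(inline instanceof URL, inline?.protocol); // true "data:"
console.log(remote?.protocol); // "https:"
console.log(rawBase64); // null
```

Because the data: string parses into a real `URL` instance, an `instanceof URL` check alone cannot distinguish inline data from a remote resource.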

Fix

Filter out data: URLs from planned downloads in downloadAssets() by adding && part.data.protocol !== "data:" to the existing instanceof URL check. This lets data: URLs bypass the download pipeline entirely and flow through to convertToLanguageModelV4DataContent(), which already handles them correctly by extracting the base64 content.
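As a rough illustration of the filter, the sketch below uses a simplified, hypothetical `Part` shape and a hypothetical `planDownloads` function. The real `downloadAssets()` works on the SDK's own message types, so treat this only as the shape of the condition.

```typescript
// Hypothetical, simplified part shape; the SDK's real types are richer.
type Part = {
  type: "image" | "file" | "text";
  data?: string | URL | Uint8Array;
};

// Collect only the URLs that actually need a network download.
function planDownloads(parts: Part[]): URL[] {
  return parts
    .map((part) => part.data)
    .filter(
      (data): data is URL =>
        // The added condition: data: URLs are inline, so skip them here.
        data instanceof URL && data.protocol !== "data:",
    );
}

const planned = planDownloads([
  { type: "image", data: new URL("data:image/png;base64,iVBORw0KGgo=") },
  { type: "file", data: new URL("https://example.com/report.pdf") },
  { type: "text" },
]);

console.log(planned.map((url) => url.href)); // ["https://example.com/report.pdf"]
```

Filtering at the planning stage keeps the download function and its validation untouched, which is why the SSRF check itself does not need to change.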

Changed files

  • packages/ai/src/prompt/convert-to-language-model-prompt.ts: Added part.data.protocol !== "data:" to the URL filter in downloadAssets()
  • packages/ai/src/prompt/convert-to-language-model-prompt.test.ts: Added two tests verifying data: URLs are not passed to the download function (one for image parts, one for file parts)

Before

Error [AI_DownloadError]: URL scheme must be http or https, got data:
    at validateDownloadUrl
    at download
    at downloadAssets
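To see why a data: URL trips this error, here is a hedged sketch of a scheme allow-list check. `assertHttpScheme` and `schemeError` are hypothetical names, and the real `validateDownloadUrl()` may perform additional checks that are not shown; only the error message text is taken from the trace above.

```typescript
// Hypothetical allow-list check; not the SDK's actual validateDownloadUrl().
function assertHttpScheme(url: URL): void {
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`URL scheme must be http or https, got ${url.protocol}`);
  }
}

// Returns the error message, or null when the scheme is allowed.
function schemeError(input: string): string | null {
  try {
    assertHttpScheme(new URL(input));
    return null;
  } catch (error) {
    return error instanceof Error ? error.message : String(error);
  }
}

console.log(schemeError("https://example.com/a.png")); // null
console.log(schemeError("data:image/png;base64,iVBORw0KGgo="));
// "URL scheme must be http or https, got data:"
```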

After

data: URLs are correctly skipped during download planning and their inline base64 content is extracted by the existing convertToLanguageModelV4DataContent() path.
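For readers unfamiliar with the downstream step, the sketch below shows one way to pull the media type and base64 payload out of a data: URL. `splitDataUrl` is a hypothetical helper written for illustration; it is not the implementation of `convertToLanguageModelV4DataContent()` and it only handles the `;base64` form.

```typescript
// Hypothetical helper: splits "data:<mediaType>;base64,<payload>".
function splitDataUrl(url: URL): { mediaType: string | undefined; base64: string } {
  if (url.protocol !== "data:") {
    throw new Error(`expected a data: URL, got ${url.protocol}`);
  }
  const text = url.toString();
  const comma = text.indexOf(",");
  if (comma === -1) {
    throw new Error("malformed data: URL (missing comma)");
  }
  const header = text.slice("data:".length, comma); // e.g. "image/png;base64"
  if (!header.endsWith(";base64")) {
    throw new Error("only base64-encoded data: URLs are handled in this sketch");
  }
  const mediaType = header.split(";")[0] || undefined;
  return { mediaType, base64: text.slice(comma + 1) };
}

const parsed = splitDataUrl(new URL("data:image/png;base64,iVBORw0KGgo="));
console.log(parsed); // { mediaType: "image/png", base64: "iVBORw0KGgo=" }
```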

data: URLs contain inline base64 content and should not be passed to
the download function. The SSRF protection in validateDownloadUrl()
correctly rejects non-http(s) protocols, but data: URLs are not remote
resources — they are inline data that is already handled downstream by
convertToLanguageModelV4DataContent().

Filter out data: URLs from planned downloads so they flow through to
the existing base64 extraction path instead of hitting the download
validation.
@tigent (bot) added the labels ai/core (core functions like generateText, streamText, etc.; provider utils and provider spec) and bug (something isn't working as documented) on Mar 8, 2026.
Successfully merging this pull request may close these issues.

SSRF protection in provider-utils@4.0.19 rejects data: URLs in downloadAssets
