SSRF protection in provider-utils@4.0.19 rejects data: URLs in downloadAssets #13103

@Drosscend

Description

The SSRF protection added in @ai-sdk/provider-utils@4.0.19 (commit ad4cfc2, shipped with ai@6.0.116) rejects data: URLs during the downloadAssets phase, breaking inline file attachments (images, PDFs) sent as base64 data URLs.

Root Cause

In downloadAssets(), string data from file parts is converted to URL objects via new URL(data) (line 366). Since data:image/png;base64,... is a valid URL, it becomes a URL instance and passes the part.data instanceof URL filter (line 374).
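
The parsing behaviour described above can be checked with the standard URL class alone. This is a minimal sketch: `toUrlOrString` is a hypothetical stand-in for the conversion step, not the SDK's actual code, and the sample payloads are made up.

```typescript
// Hypothetical stand-in for the string-to-URL conversion step (not the SDK's code).
function toUrlOrString(data: string): URL | string {
  try {
    return new URL(data);
  } catch {
    // Not parseable as a URL (e.g. a bare base64 string): keep it as a string.
    return data;
  }
}

const inlineImage = toUrlOrString("data:image/png;base64,iVBORw0KGgo=");
const remoteImage = toUrlOrString("https://example.com/cat.png");
const rawBase64 = toUrlOrString("iVBORw0KGgo=");

console.log(inlineImage instanceof URL); // true: a data: URL is a valid URL
console.log(remoteImage instanceof URL); // true
console.log(rawBase64 instanceof URL); // false: bare base64 stays a string
```

Because the data URL parses successfully, an `instanceof URL` check alone cannot tell it apart from a remote resource; only the `protocol` property does.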

The default download function (createDefaultDownloadFunction) then delegates to download() in @ai-sdk/provider-utils, which now calls validateDownloadUrl(). This function rejects any URL where protocol !== "http:" && protocol !== "https:" — including data: URLs.
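
For illustration, a scheme allow-list of the kind described above behaves as follows. This is a simplified sketch, not the real validateDownloadUrl(): the actual function may perform further checks that are not reproduced here, and `validateSchemeSketch` and `isRejected` are hypothetical names.

```typescript
// Simplified sketch of a scheme allow-list check.
// Assumption: the real validateDownloadUrl() may do more than this.
function validateSchemeSketch(url: URL): void {
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`URL scheme must be http or https, got ${url.protocol}`);
  }
}

// Hypothetical helper: reports whether the sketch above rejects a URL string.
function isRejected(raw: string): boolean {
  try {
    validateSchemeSketch(new URL(raw));
    return false;
  } catch {
    return true;
  }
}

console.log(isRejected("https://example.com/cat.png")); // false
console.log(isRejected("file:///etc/passwd")); // true: the intended SSRF-style rejection
console.log(isRejected("data:image/png;base64,iVBORw0KGgo=")); // true: the false positive
```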

This is a false positive: data: URLs don't need downloading — they contain inline data. The SDK already handles them correctly downstream in convertToLanguageModelV4DataContent(), which extracts the base64 content from data: URLs. But downloadAssets crashes before this code ever runs.
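
To make the "inline data" point concrete, the sketch below pulls the media type and base64 payload straight out of a data: URL without any network access. `splitDataUrl` is a hypothetical helper written for this issue; it is not convertToLanguageModelV4DataContent() and it only handles the base64 form.

```typescript
interface InlineData {
  mediaType: string | undefined;
  base64: string;
}

// Hypothetical helper: splits "data:<mediaType>;base64,<payload>" into its parts.
// Assumption: only base64-encoded data URLs are supported.
function splitDataUrl(url: URL): InlineData {
  if (url.protocol !== "data:") {
    throw new Error(`expected a data: URL, got ${url.protocol}`);
  }
  const raw = url.href;
  const commaIndex = raw.indexOf(",");
  if (commaIndex === -1) {
    throw new Error("malformed data URL: missing comma");
  }
  // Everything between "data:" and the first comma, e.g. "image/png;base64".
  const header = raw.slice("data:".length, commaIndex);
  if (!header.endsWith(";base64")) {
    throw new Error("only base64 data URLs are handled in this sketch");
  }
  const mediaType = header.split(";")[0] || undefined;
  return { mediaType, base64: raw.slice(commaIndex + 1) };
}

const pdf = splitDataUrl(new URL("data:application/pdf;base64,JVBERi0xLjQ="));
console.log(pdf.mediaType); // "application/pdf"
console.log(pdf.base64); // "JVBERi0xLjQ="
```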

Flow

  1. User sends a message with an inline image/PDF as data:image/png;base64,...
  2. convertToModelMessages sets data: part.url (the data URL string)
  3. downloadAssets converts it to a URL object → passes instanceof URL filter
  4. Anthropic provider doesn't declare data: as a supported URL scheme → isUrlSupportedByModel is false
  5. Default download function calls download() → validateDownloadUrl() rejects data: protocol
  6. Error: AI_DownloadError: URL scheme must be http or https, got data:

Error

Error [AI_DownloadError]: URL scheme must be http or https, got data:
    at validateDownloadUrl (packages/provider-utils/src/validate-download-url.ts:22)
    at download (packages/provider-utils/src/download.ts)
    at downloadAssets (packages/ai/src/prompt/convert-to-language-model-prompt.ts:389)

Suggested Fix

downloadAssets should filter out data: URLs before sending them to the download function, since they are inline data and not remote resources:

// In downloadAssets(), line 374, change the filter from:
.filter(
  (part): part is { mediaType: string | undefined; data: URL } =>
    part.data instanceof URL,
)

// To:
.filter(
  (part): part is { mediaType: string | undefined; data: URL } =>
    part.data instanceof URL && part.data.protocol !== 'data:',
)
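
The effect of the proposed predicate can be checked in isolation. In the sketch below, the `FilePart` type and the `needsDownload` name are assumptions made for illustration; only the predicate body mirrors the suggested change.

```typescript
// Assumed shape of a file part for this sketch; the SDK's internal type may differ.
type FilePart = { mediaType: string | undefined; data: URL | Uint8Array | string };

// Same condition as the suggested filter: only non-data: URLs need downloading.
function needsDownload(
  part: FilePart,
): part is { mediaType: string | undefined; data: URL } {
  return part.data instanceof URL && part.data.protocol !== "data:";
}

const parts: FilePart[] = [
  { mediaType: "image/png", data: new URL("data:image/png;base64,iVBORw0KGgo=") },
  { mediaType: "image/png", data: new URL("https://example.com/cat.png") },
  { mediaType: "application/pdf", data: new Uint8Array([0x25, 0x50, 0x44, 0x46]) },
];

const toDownload = parts.filter(needsDownload);
console.log(toDownload.map((part) => part.data.href)); // [ 'https://example.com/cat.png' ]
```

With this filter the data: URL never reaches the download function and is left for the downstream conversion step to handle.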

Workaround

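Until this is fixed, pass a custom experimental_download function to streamText that returns null for data: URLs (and for URLs the model supports natively), so they are never downloaded: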

import { createDownload, streamText } from "ai";

const singleDownload = createDownload();

streamText({
  // ...
  experimental_download: (requestedDownloads) =>
    Promise.all(
      requestedDownloads.map(async (req) => {
        if (req.isUrlSupportedByModel || req.url.protocol === "data:") {
          return null;
        }
        return singleDownload(req);
      }),
    ),
});
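
The skip decision in the workaround can also be pulled out into a pure function, which makes it easy to unit test without calling the SDK. This is a sketch: `DownloadRequest` is an assumed shape inferred from the fields used above, and `shouldSkipDownload` is a hypothetical name.

```typescript
// Assumed request shape, inferred from the fields the workaround reads.
interface DownloadRequest {
  url: URL;
  isUrlSupportedByModel: boolean;
}

// True when the download step should return null instead of fetching.
function shouldSkipDownload(req: DownloadRequest): boolean {
  return req.isUrlSupportedByModel || req.url.protocol === "data:";
}

const inlineReq: DownloadRequest = {
  url: new URL("data:image/png;base64,iVBORw0KGgo="),
  isUrlSupportedByModel: false,
};
const remoteReq: DownloadRequest = {
  url: new URL("https://example.com/cat.png"),
  isUrlSupportedByModel: false,
};

console.log(shouldSkipDownload(inlineReq)); // true: inline data, nothing to fetch
console.log(shouldSkipDownload(remoteReq)); // false: must be downloaded
```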

AI SDK Version

  • ai: 6.0.116
  • @ai-sdk/anthropic: 3.0.58
  • @ai-sdk/provider-utils: 4.0.19
