ai-proxy: streaming not working in 3.9.0 #14425

Open
@muscionig

Description

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

3.8.0, 3.9.0

Current Behavior

When using Kong with the ai-proxy plugin, streaming responses are "buffered" in Kong 3.9.0 but work with the same configuration in 3.8.0. See below for a fully reproducible example.

Expected Behavior

When streaming is requested, Kong should forward each SSE event as soon as it is received from the upstream.

Steps To Reproduce

You can use this docker-compose.yaml; you may need to comment/uncomment the kong migrations bootstrap/up command (a sketch for running the migrations without editing the file follows the compose file):

name: local-kong
version: "3.4"

x-common-variables: &common-variables
  KONG_PG_PORT: 5432
  KONG_PG_USER: kong
  KONG_PG_PASSWORD: kong
  KONG_PG_DATABASE: kong
  POSTGRES_USER: kong
  POSTGRES_PASSWORD: kong
  POSTGRES_DB: kong
  KONG_HTTP_PROXY_PORT: 8000
  KONG_HTTP_ADMIN_PORT: 8001


services:
  kong-db:
    networks:
      - local-kong-net
    image: postgres:17.2-alpine
    ports:
      - 5432:5432
    environment:
      <<: *common-variables

  kong:
    platform: "linux/amd64"
    networks:
      - local-kong-net
    image: kong:3.8.0
    ports:
      - 8000:8000
      - 8443:8443
      - 8001:8001
      - 8002:8002
      - 8444:8444
    environment:
      <<: *common-variables
      KONG_DATABASE:    postgres
      KONG_PG_HOST:     kong-db
      KONG_ADMIN_LISTEN: 0.0.0.0:8001
      KONG_ADMIN_GUI_URL: http://localhost:8002
      KONG_PROXY_LISTEN: 0.0.0.0:8000
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG:  /dev/stderr
      KONG_ADMIN_ERROR_LOG:  /dev/stderr
      KONG_PLUGINS: bundled
      KONG_LOG_LEVEL: info
    depends_on:
      - kong-db
    # command: ["kong", "migrations", "bootstrap"]

networks:
  local-kong-net:
    driver: bridge
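
If you would rather not comment/uncomment the command in the file, a one-off container should work as well (a sketch, assuming the service names above; bootstrap on the first run, up after changing the image tag):

    # run once against an empty database
    docker compose run --rm kong kong migrations bootstrap
    # run after bumping the kong image tag
    docker compose run --rm kong kong migrations up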

Steps:

  1. run docker compose up
  2. apply the following configuration:
    curl -X POST http://localhost:8001/services \
    --data "name=openai" \
    --data "url=https://api.openai.com"
    
    curl -X POST http://localhost:8001/routes \
    --data "service.id=$(curl -s http://localhost:8001/services/openai | jq -r '.id')" \
    --data name=openai \
    --data "paths[]=/openai"
    
    curl -X POST http://localhost:8001/services/openai/plugins \
     --header 'Content-Type: application/json' \
     --header 'accept: application/json' \
     --data '{
       "name": "ai-proxy",
       "instance_name": "openai",
       "config": {
         "route_type": "llm/v1/chat",
         "model": {
           "provider": "openai"
         },
         "auth": {
           "header_name": "Authorization",
           "header_value": "Bearer <OPENAI_KEY>"
         }
       }
     }'
  3. verify streaming works (a timestamped variant of this request is sketched after these steps):
     curl -X POST http://localhost:8000/openai/v1/chat/completions \
      -H 'Content-Type: application/json' \
      --data-raw '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is deep learning?"}], "temperature": 0.7, "stream": true, "max_tokens": 100}'
    
  4. change the Kong image to 3.9.0, run the migrations, and then docker compose up again
  5. re-run the same command as in step 3; Kong will return all SSE events at once and will also log the following (3.8.0 does not emit this warning; see the diagnostic note after these steps):
    kong-1     | 2025/04/18 17:52:53 [warn] 1405#0: *5218 an upstream response is buffered to a temporary file /usr/local/kong/proxy_temp/4/00/0000000004 while reading upstream, client: 172.19.0.1
    
  6. disable the ai-proxy plugin and pass the OpenAI key directly as a bearer token: -H "Authorization: Bearer $OPEN_AI_KEY"
  7. re-run the same command as in step 3; streaming will work.
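
To make the buffering easier to see in steps 3 and 5, you can timestamp each SSE line as it arrives. This is a minimal sketch (assuming GNU date for sub-second precision): on 3.8.0 the data: lines arrive spread out over the generation, while on 3.9.0 they arrive in a single burst:

    # -N disables curl's own output buffering; the loop prefixes each line with its arrival time
    curl -sN -X POST http://localhost:8000/openai/v1/chat/completions \
      -H 'Content-Type: application/json' \
      --data-raw '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is deep learning?"}], "temperature": 0.7, "stream": true, "max_tokens": 100}' \
      | while IFS= read -r line; do printf '%s %s\n' "$(date +%T.%3N)" "$line"; done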
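The [warn] line in step 5 is nginx spilling the upstream response into a temporary buffer file, which suggests proxy-level buffering is being applied to the streamed response in 3.9.0. As a diagnostic only (not a fix), Kong's nginx directive injection should be able to disable it globally; adding this to the kong service environment in the compose file and restarting would show whether plain nginx buffering is what changed:

    # injects `proxy_buffering off;` into the proxy server block (diagnostic, not a recommended setting)
    KONG_NGINX_PROXY_PROXY_BUFFERING: "off"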

Anything else?

While the steps above use OpenAI, I verified the same behavior with multiple providers (including Bedrock and self-hosted models).
