
fix: ai-gateway exits 127 in all-in-one image (#5657) #5661

Open: ashishkr96 wants to merge 1 commit into Helicone:main from ashishkr96:fix/issue-5657-ai-gateway-binary



ashishkr96 commented Apr 25, 2026

Summary

Fixes #5657. Running docker run helicone/helicone-all-in-one:latest fails because the ai-gateway supervisord program loops with exit status 127 until it hits FATAL.

Root cause

supervisord.conf runs the gateway with:

[program:ai-gateway]
command=yarn start
directory=/app/gateway

This was added in d8bf4ca, but it has two problems:

  1. ai-gateway is a Rust binary, not a Node project — yarn start was never the right entry point.
  2. The Dockerfile additions from that same commit (COPY --from=helicone/ai-gateway:latest /app /app/gateway, the second COPY supervisord.conf, the monitor_logs.sh/debug_jawn.sh/health_check.sh copies, and the 8788 in EXPOSE) are no longer in Dockerfile, so /app/gateway doesn't exist in the published image at all.

Result: supervisord effectively runs cd /app/gateway && yarn start, but the directory doesn't exist, so the process exits with 127. supervisord retries, then gives up with: gave up: ai-gateway entered FATAL state, too many start retries too quickly. The other commenter's "wait for DB" theory doesn't fit: the gateway exits in under 1 s, before any DB connection would be attempted, and a failed DB connection in a Rust binary wouldn't surface as exit status 127.
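
One way to confirm this diagnosis is to check whether the working directory and entry point that supervisord expects are actually present in the image. The helper below is only a minimal sketch: `check_program` is a hypothetical name, not part of this PR, and it is assumed to be run inside the container (for example through docker exec).

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): report whether a supervisord
# program's working directory exists and whether its command resolves on PATH.
check_program() {
    dir="$1"   # working directory from supervisord.conf, e.g. /app/gateway
    cmd="$2"   # first word of the command= line, e.g. yarn
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "command not found: $cmd"
        return 1
    fi
    echo "ok: $dir and $cmd are present"
}
```

If the analysis above holds, `check_program /app/gateway yarn` inside the published image should report the missing directory rather than reaching the command check.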

Fix

  • Dockerfile: copy the published ai-gateway binary (/usr/local/bin/ai-gateway per the upstream Helicone/ai-gateway Dockerfile) into the all-in-one image.
  • supervisord.conf: invoke the binary directly; drop the bogus yarn start and the missing directory.
  • Add 8788 to EXPOSE to document the gateway port.

Test plan

  • docker buildx build -t helicone-all-in-one:test . succeeds
  • With docker run --rm -p 3000:3000 -p 8585:8585 -p 9080:9080 helicone-all-in-one:test running, ai-gateway no longer enters FATAL; docker exec <id> supervisorctl status ai-gateway shows RUNNING
  • curl http://localhost:8788 from inside the container responds (port not forwarded by default; use docker exec)
  • Other services (web, jawn, postgres, clickhouse, minio) still come up
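
The supervisorctl check in the test plan could be scripted roughly as follows. This is a sketch under assumptions: `program_state` and `gateway_is_running` are hypothetical helper names, and the parsing assumes the usual supervisorctl status layout in which the state (RUNNING, FATAL, ...) is the second whitespace-separated column.

```shell
#!/bin/sh
# Extract the state column from one line of `supervisorctl status` output.
# Assumes the state is the second whitespace-separated field.
program_state() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Hypothetical wrapper: succeed only if ai-gateway is RUNNING in the
# container whose id or name is passed as the first argument.
gateway_is_running() {
    container="$1"
    line=$(docker exec "$container" supervisorctl status ai-gateway)
    [ "$(program_state "$line")" = "RUNNING" ]
}
```

Calling `gateway_is_running <id>` after the container has had time to start would then return success only once the program has left its retry loop.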

The all-in-one supervisord program for ai-gateway was running `yarn start`
in `/app/gateway`, but ai-gateway is a Rust binary and `/app/gateway` is
not present in the current Dockerfile, so supervisord failed with exit
status 127 in a tight retry loop until it gave up (FATAL).

Pull `/usr/local/bin/ai-gateway` from the published `helicone/ai-gateway`
image and invoke it directly from supervisord. Also expose 8788 to match
the program's PORT env.

Fixes Helicone#5657

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented Apr 25, 2026

@ashishkr96 is attempting to deploy a commit to the Helicone Team on Vercel.

A member of the Team first needs to authorize it.


Development

Successfully merging this pull request may close these issues.

[Bug]: docker image broken?
