Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out a minimal healthcheck pattern #31619

Open
phlax opened this issue Jan 3, 2024 · 9 comments
Open

Figure out a minimal healthcheck pattern #31619

phlax opened this issue Jan 3, 2024 · 9 comments
Labels
area/docker area/health_checking enhancement Feature requests. Not bugs or questions. no stalebot Disables stalebot from closing an issue

Comments

@phlax
Copy link
Member

phlax commented Jan 3, 2024

Some of our examples use a healthcheck which relies on the admin endpoint, and uses curl to check readiness

Ideally we want a really minimal healthcheck that requires little/no container/system tools to check, that we could include in the published images - adding curl/nc in the distroless container is not trivial/desirable

One way of doing this would be if there were a filesystem file that could be checked for. This would definitely be the ideal

Another possibility - which would still require admin and a util - would be to expose admin only on a unix socket and connect to that with socat

I have a current need for something like this - but there is also a recent ticket related to this - #31614

@phlax phlax added enhancement Feature requests. Not bugs or questions. area/health_checking labels Jan 3, 2024
@phlax
Copy link
Member Author

phlax commented Jan 3, 2024

cc @bruegth

@phlax
Copy link
Member Author

phlax commented Jan 3, 2024

the other minimal option would be sending a signal i guess

@phlax
Copy link
Member Author

phlax commented Jan 4, 2024

thinking about this further and realizing that signals wont help as they are one way

@bruegth
Copy link

bruegth commented Jan 5, 2024

I'm using the file based http healthcheck on other distroless images too, currently without any problems.

dockerfile:
HEALTHCHECK CMD ["/docker-healthcheck.sh"]

docker-healthcheck.sh.txt

But this requires /bin/bash as shell and it depends that envoy admin endpoint is unsecure (not using https)

@phlax
Copy link
Member Author

phlax commented Jan 5, 2024

nice, that gets rid of the need for adding a tool other than bash (defo better than my socket suggestion)

i guess the next question is a reliable readiness check

when i looked before the admin endpoint was the most reliable i could see but its generally not desirable to expose admin unless you need it

there is also some question about its reliability for health checking - cc @ravenblackx

i guess there is a more general question about what readiness means in the Envoy context (putting aside healthiness) - eg does it mean the server is running, or specific endpoints are listening

@phlax
Copy link
Member Author

phlax commented Jan 5, 2024

@bruegth if you were up for PRing to replace the curl/healthcheck in the examples with your one i think that would be a good advance on what we have (happy to follow up myself if you prefer)

@phlax
Copy link
Member Author

phlax commented Jan 10, 2024

based on your suggestion and a bit of experimentaion - the best ive got so far is this for distroless

FROM debian:bookworm-slim as debian-static
RUN apt-get -qq update -y \
    && apt-get install bash-static


FROM scratch as binaries
COPY --from=busybox /bin/cat /bin/cat
COPY --from=busybox /bin/echo /bin/echo
COPY --from=busybox /bin/sh /bin/sh
COPY --from=busybox /bin/timeout /bin/timeout
COPY --from=busybox /bin/grep /bin/grep
COPY --from=debian-static /usr/bin/bash-static /usr/bin/bash


FROM envoyproxy/envoy:distroless-dev
COPY --from=binaries \
    /bin/cat \
    /bin/echo \
    /bin/grep \
    /bin/sh \
    /bin/timeout \
    /bin/
COPY --from=binaries \
    /usr/bin/bash \
    /usr/bin/
ENV ENVOY_ADMIN_PORT="${ENVOY_ADMIN_PORT:-9901}" \
    HEALTHCHECK_PORT="${HEALTHCHECK_PORT:-${ENVOY_ADMIN_PORT}}"
COPY --chmod=755 <<EOF /usr/local/bin/healthcheck
#!/usr/bin/bash

HC_HOST="localhost"
HC_PORT="\${HEALTHCHECK_PORT:-9901}"
HC_PATH="/server_info"
HC_EXPECTED="LIVE"
HC_ENDPOINT="/dev/tcp/\${HC_HOST}/\${HC_PORT}"

exec 3<>"\$HC_ENDPOINT"
echo -e "GET "\${HC_PATH}" HTTP/1.1\\nhost: \${HC_HOST}:\${HC_PORT}\\n" >&3
timeout 1 cat <&3 | grep -m 1 state | grep -m 1 "\$HC_EXPECTED"
ERROR=\$?
exec 3<&-
exec 3>&-
exit \$ERROR
EOF

HEALTHCHECK \
    --interval=1s \
    --timeout=2s \
    --start-period=1s \
    --retries=3 \
    CMD /usr/local/bin/healthcheck

for distroless it would need these bins added , but the ubuntu container should be able to just use the hc directly

i think we dont want to put anyting in the distroless image itself so probs we should add another image target :distroless-hc or similar

@bruegth
Copy link

bruegth commented Jan 11, 2024

Your solution looks good to me.

@bruegth if you were up for PRing to replace the curl/healthcheck in the examples with your one i think that would be a good advance on what we have (happy to follow up myself if you prefer)

Sorry, but at moment I'm too busy to create an PR.

Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Feb 10, 2024
@phlax phlax added no stalebot Disables stalebot from closing an issue and removed stale stalebot believes this issue/PR has not been touched recently labels Feb 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docker area/health_checking enhancement Feature requests. Not bugs or questions. no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

No branches or pull requests

2 participants