Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chore]: fix linux packaging tests #30202

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions .github/workflows/build-and-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -458,8 +458,7 @@ jobs:
path: ./bin/*

build-package:
# Use 20.04.5 until https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/16450 is resolved
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [cross-compile]
strategy:
fail-fast: false
Expand Down
14 changes: 7 additions & 7 deletions internal/buildscripts/packaging/fpm/common.sh
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@ docker_cp() {
local dest_dir="$( dirname "$dest" )"

echo "Copying $src to $container:$dest ..."
docker exec $container mkdir -p "$dest_dir"
docker cp "$src" $container:"$dest"
podman exec $container mkdir -p "$dest_dir"
podman cp "$src" $container:"$dest"
}

install_pkg() {
Expand All @@ -43,9 +43,9 @@ install_pkg() {
echo "Installing $pkg_base ..."
docker_cp $container "$pkg_path" /tmp/$pkg_base
if [[ "${pkg_base##*.}" = "deb" ]]; then
docker exec $container dpkg -i /tmp/$pkg_base
podman exec $container dpkg -i /tmp/$pkg_base
else
docker exec $container rpm -ivh /tmp/$pkg_base
podman exec $container rpm -ivh /tmp/$pkg_base
fi
}

Expand All @@ -56,8 +56,8 @@ uninstall_pkg() {

echo "Uninstalling $pkg_name ..."
if [[ "$pkg_type" = "deb" ]]; then
docker exec $container dpkg -r $pkg_name
podman exec $container dpkg -r $pkg_name
else
docker exec $container rpm -e $pkg_name
podman exec $container rpm -e $pkg_name
fi
}
}
10 changes: 1 addition & 9 deletions internal/buildscripts/packaging/fpm/deb/Dockerfile.test
Original file line number Diff line number Diff line change
Expand Up @@ -9,14 +9,6 @@ ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update ; \
apt-get install -y systemd systemd-sysv procps; \
apt-get clean ; \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* ; \
rm -rf /lib/systemd/system/multi-user.target.wants/* ; \
rm -rf /etc/systemd/system/*.wants/* ; \
rm -rf /lib/systemd/system/local-fs.target.wants/* ; \
rm -rf /lib/systemd/system/sockets.target.wants/*udev* ; \
rm -rf /lib/systemd/system/sockets.target.wants/*initctl* ; \
rm -rf /lib/systemd/system/sysinit.target.wants/systemd-tmpfiles-setup* ; \
rm -rf /lib/systemd/system/systemd-update-utmp*
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

VOLUME [ "/sys/fs/cgroup" ]
CMD ["/lib/systemd/systemd"]
11 changes: 0 additions & 11 deletions internal/buildscripts/packaging/fpm/rpm/Dockerfile.test
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,4 @@ ENV container docker

RUN dnf install -y initscripts

RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i = \
"systemd-tmpfiles-setup.service" ] || rm -f $i; done); \
rm -f /lib/systemd/system/multi-user.target.wants/*;\
rm -f /lib/systemd/system/local-fs.target.wants/*; \
rm -f /lib/systemd/system/sockets.target.wants/*udev*; \
rm -f /lib/systemd/system/sockets.target.wants/*initctl*; \
rm -f /lib/systemd/system/basic.target.wants/*;\
rm -f /lib/systemd/system/anaconda.target.wants/*;

VOLUME [ "/sys/fs/cgroup" ]

CMD ["/usr/sbin/init"]
21 changes: 10 additions & 11 deletions internal/buildscripts/packaging/fpm/test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -30,47 +30,46 @@ if [[ ! "$pkg_type" =~ ^(deb|rpm)$ ]]; then
fi
image_name="otelcontribcol-$pkg_type-test"
container_name="$image_name"
docker_run="docker run --name $container_name -d -v /sys/fs/cgroup:/sys/fs/cgroup:ro --privileged $image_name"
docker_exec="docker exec $container_name"
container_exec="podman exec $container_name"

trap "docker rm -fv $container_name >/dev/null 2>&1 || true" EXIT
trap "podman rm -fv $container_name >/dev/null 2>&1 || true" EXIT

docker build -t $image_name -f "$SCRIPT_DIR/$pkg_type/Dockerfile.test" "$SCRIPT_DIR"
docker rm -fv $container_name >/dev/null 2>&1 || true
podman build -t $image_name -f "$SCRIPT_DIR/$pkg_type/Dockerfile.test" "$SCRIPT_DIR"
podman rm -fv $container_name >/dev/null 2>&1 || true

# test install
echo
$docker_run
podman run --name $container_name -d $image_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your comment here, you said there should be a check to ensure that the run does not silently fail. Will podman fail here if the container fails to start or should we still add an additional check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, podman-run will return a non-zero exit code ala chroot, 125 - podman command itself failed, 126 - the contained command failed to run, 127 - the contained command could not be found. (These are the same as docker-run should return)

However, the original issue in #16450 is actually the result of the initial docker-run to succeeding, but then the container immediately shuts off again due to unknown reasons (apart from the reason that Docker and SystemD have been at loggerheads for years and expecting the two to run smoothly together is not an option at this point in time)

So, going back to my original comment in your question. Yes, there should be a health-check command running inside the otelcolcontrib-*-test images that waits until SystemD has finished starting up all services before returning a 'healthy' state. Then the test.sh bash script could wait for the container to transition into a healthy state after podman-run and before podman-cp, which would cover for that period of uncertainty where the container started, but might have not fully initialized as expected.

I think this should be part of a separate patch though. I'll start working on adding that health check function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough answer. If this PR gets merged before the health check function PR is submitted please create a new tracking issue for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a tracking issue for the health check issue. #30239

install_pkg $container_name "$PKG_PATH"

# ensure service has started and still running after 5 seconds
sleep 5
echo "Checking $SERVICE_NAME service status ..."
$docker_exec systemctl --no-pager status $SERVICE_NAME
$container_exec systemctl --no-pager status $SERVICE_NAME

echo "Checking $PROCESS_NAME process ..."
$docker_exec pgrep -a -u otel $PROCESS_NAME
$container_exec pgrep -a -u otel $PROCESS_NAME

# test uninstall
echo
uninstall_pkg $container_name $pkg_type

echo "Checking $SERVICE_NAME service status after uninstall ..."
if $docker_exec systemctl --no-pager status $SERVICE_NAME; then
if $container_exec systemctl --no-pager status $SERVICE_NAME; then
echo "$SERVICE_NAME service still running after uninstall" >&2
exit 1
fi
echo "$SERVICE_NAME service successfully stopped after uninstall"

echo "Checking $SERVICE_NAME service existence after uninstall ..."
if $docker_exec systemctl list-unit-files --all | grep $SERVICE_NAME; then
if $container_exec systemctl list-unit-files --all | grep $SERVICE_NAME; then
echo "$SERVICE_NAME service still exists after uninstall" >&2
exit 1
fi
echo "$SERVICE_NAME service successfully removed after uninstall"

echo "Checking $PROCESS_NAME process after uninstall ..."
if $docker_exec pgrep $PROCESS_NAME; then
if $container_exec pgrep $PROCESS_NAME; then
echo "$PROCESS_NAME process still running after uninstall"
exit 1
fi
Expand Down