Skip to content

Conversation

@TomerFi
Copy link
Contributor

@TomerFi TomerFi commented Oct 24, 2025

This PR adds Dockerfile-based build and push processes for both the Operator and ProxyRunner images. These UBI-based images are passing OpenShift-Preflight tests by introducing a more secure base image.

@TomerFi TomerFi force-pushed the containerfile-preflight branch from 874deb3 to 4f9dc0f Compare October 24, 2025 16:24
@JAORMX
Copy link
Collaborator

JAORMX commented Oct 24, 2025

@TomerFi I was more thinking about creating additional UBI images, not replacing the ones we have.

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 24, 2025

Perhaps we could have a <version>-ubi tag for the images instead?

@TomerFi
Copy link
Contributor Author

TomerFi commented Oct 24, 2025

I was more thinking about creating additional UBI images, not replacing the ones we have.

@JAORMX , I'm not sure I follow, do you want two build processes to exist simultaneously? One using ko.build and one using Dockerfile?

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 24, 2025

@TomerFi that's right

@TomerFi
Copy link
Contributor Author

TomerFi commented Oct 24, 2025

@JAORMX May I ask why?

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 24, 2025

@TomerFi because I would like to keep the ko builds which use a more minimal container image.

@dmartinol
Copy link
Collaborator

@TomerFi I was more thinking about creating additional UBI images, not replacing the ones we have.

driven by env vars or always 2 images per application? in the first case, what's the default?
BUILD_TOOL=ko Vs BUILD_TOOL=dockerfile

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 24, 2025

So, I was thinking of having:

ghcr.io/stacklok/toolhive-operator:v<version>      # ko-based image
ghcr.io/stacklok/toolhive-operator:v<version>-ubi  # UBI based image

How this happens is not super important to me. It could be another workflow that happens after the main build depending on the main release workflow. perhaps?

@codecov
Copy link

codecov bot commented Oct 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.23%. Comparing base (5805898) to head (9529a17).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2327      +/-   ##
==========================================
- Coverage   54.27%   54.23%   -0.04%     
==========================================
  Files         242      242              
  Lines       23446    23446              
==========================================
- Hits        12725    12716       -9     
- Misses       9506     9520      +14     
+ Partials     1215     1210       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jhrozek
Copy link
Contributor

jhrozek commented Oct 24, 2025

So, I was thinking of having:

ghcr.io/stacklok/toolhive-operator:v<version>      # ko-based image
ghcr.io/stacklok/toolhive-operator:v<version>-ubi  # UBI based image

How this happens is not super important to me. It could be another workflow that happens after the main build depending on the main release workflow. perhaps?

I like this scheme. As upstream, we should commit to maintaining UBI images - OpenShift is a very important downstream. But at the same I'd like us to keep the ko-based images as they are minimal, small and distroless.

As far as making it easier for clients to find and distinguish the images, perhaps we could add OCI image labels, e.g. dev.stacklok.toolhive.image-variant: UBI.

And I guess it would be nice to have a helm chart variable to easily switch between them.

(those can be another PR, just thinking out loud how to make life easier for consumers)

@TomerFi TomerFi force-pushed the containerfile-preflight branch 2 times, most recently from 3d74330 to 33d0b8e Compare October 27, 2025 19:55
@TomerFi TomerFi changed the title Switch operator to Dockerfile to pass preflight tests Add Dockerfile UBI images Oct 27, 2025
@TomerFi TomerFi changed the title Add Dockerfile UBI images Add Dockerfile UBI images for the Operator and ProxyRunner Oct 27, 2025
@TomerFi
Copy link
Contributor Author

TomerFi commented Oct 27, 2025

@JAORMX @jhrozek ^^

cmds:
- >
eval "{{.CONTAINER_RUNTIME}} build --load
-t ghcr.io/stacklok/toolhive/operator:local-ubi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the non-ubi name is ko.local/thv-operator:<HASH>, should we use the same name and tag here? @jhrozek

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference, it depends on how the target will be used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: is it fine to keep using the ghcr.io/stacklok/toolhive registry, right? I assume you don't want to go to quay.io

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's up to you - if quay needs some elaborate auth schemes to push, you'd be on the hook to maintain the secrets but otherwise I don't care :)

Copy link
Contributor Author

@TomerFi TomerFi Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmartinol About the tag, fixed in ac773b2.
About Quay, let's maintain GHCR as the registry until we need to address Quay.

(please resolve if done).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved the other two, I'll leave this one to @dmartinol I think he has a firmer opinion on what "good" looks like here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhrozek If we don't use the UBI images in the operator-deploy-local task, I'm wondering if we actually need any changes to this file. Maybe the changes to the CI workflows are sufficient instead? (I know this would make things more complex for testing UBIs locally)
Or, in alternative: would it be possible to reuse the task definitions in the CI workflows instead of duplicating the build code?
(@TomerFi I apologize in advance for these last-minute, disturbing comments!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion, but also taking the other suggestion into account, I wonder if it would be best to keep these task targets, change to build just toolhive-operator:local-ubi (dropping the prefix) but also integrating into the existing ocp-build-and-push target:

  ocp-build-and-push:
    desc: Build UBI images and push them to OpenShift registry
    cmds:
      - task: build-operator-image-ubi
      - task: build-proxyrunner-image-ubi
      - task: ocp-registry-login

(dropping the tag is not needed, but I just wonder if the short name is cleaner. I really don't mind one way or the other).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought, can be done in a follow-up:
Should we change deploy/charts/operator/values-openshift.yaml to use these images?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought, can be done in a follow-up: Should we change deploy/charts/operator/values-openshift.yaml to use these images?

definitely yes, after testing them first!

Copy link
Collaborator

@dmartinol dmartinol Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but also integrating into the existing ocp-build-and-push target:

I forgot this target, yes I also agree but we need to add tasks for pushing them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done in a follow-up though to unblock you all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhrozek @dmartinol
In 9529a17 I modified the ocp-related tasks to work with the new ubi-based images, I tested the following tasks flow against an OpenShift instance:

oc new-project toolhive-system
task ocp-setup-registry-sa
task ocp-registry-login
task ocp-build-and-push
task ocp-verify-push
task ocp-deploy-operator -- --values ./deploy/charts/operator/values-openshift.yaml

And that worked as expected.
By "as expected" I mean the release was up, but the deployment failed because the above tasks don't include the CRDs chart, so the deployment, obviously, fails to fetch the non-existing API:

{"level":"error","ts":1761695126.5817027,"logger":"setup","caller":"thv-operator/main.go:94","msg":"unable to create field index for spec.groupRef","error":"unable to retrieve the complete list of server APIs: toolhive.stacklok.dev/v1alpha1: no matches for toolhive.stacklok.dev/v1alpha1, Resource=","stacktrace":"main.main\n\t/workspace/cmd/thv-operator/main.go:94\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:283"}

For the work I'm doing in this PR, this is a good outcome, as long as the deployment is up following the above steps, that means this works. Unless I'm missing a task, the ocp-deploy-operator by itself, can't work on OpenShift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S.
I'll update the images in the helm charts in a different PR after a succesful images build.

Copy link
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this look good, I wouldn't mind pushing this as long as it keeps moving you forward. I wasn't sure about the LDFLAGS invocation though and left a question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion, but also taking the other suggestion into account, I wonder if it would be best to keep these task targets, change to build just toolhive-operator:local-ubi (dropping the prefix) but also integrating into the existing ocp-build-and-push target:

  ocp-build-and-push:
    desc: Build UBI images and push them to OpenShift registry
    cmds:
      - task: build-operator-image-ubi
      - task: build-proxyrunner-image-ubi
      - task: ocp-registry-login

(dropping the tag is not needed, but I just wonder if the short name is cleaner. I really don't mind one way or the other).

cmds:
- >
eval "{{.CONTAINER_RUNTIME}} build --load
-t ghcr.io/stacklok/toolhive/operator:local-ubi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference, it depends on how the target will be used.

@TomerFi TomerFi force-pushed the containerfile-preflight branch from 2b2bc35 to ac773b2 Compare October 28, 2025 15:24
Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>
Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>
@TomerFi TomerFi force-pushed the containerfile-preflight branch 3 times, most recently from e511a36 to e9b313b Compare October 28, 2025 21:12
Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants