-
Notifications
You must be signed in to change notification settings - Fork 86
Enable dynamic changes in the router for non-kube environments #2356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
fgiorgetti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression so far is that it is on the right track.
Really nice work @nluaces!
A few other comments I have so far:
- The site name in the router config is changing (we need to check if this will cause issues to the console or vanflow metrics)
- As I already added, the system controller update and the update of the router container for the controlled namespaces is something we need to figure out still
| endpoint = "unix:///var/run/podman.sock" | ||
| if s.Platform == "docker" { | ||
| endpoint = "unix:///run/docker.sock" | ||
| endpoint = "unix:///var/run/docker.sock" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the original one is still correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
||
| // the container endpoint is mapped to the podman socket inside the container | ||
| if api.IsRunningInContainer() { | ||
| endpoint = "unix:///var/run/podman.sock" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If running in a container, the endpoint should be provided through:
os.Getenv("CONTAINER_ENDPOINT")
Then, if the endpoint is not set, you can use the defaults you have below.
Maybe we could have a function under pkg/nonkube/api/environment.go that takes the platform as an argument and returns the correct container endpoint, eliminating the dups we have (I should have done that earlier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When skupper system install runs, it creates a volume bind between with CONTAINER_ENDPOINT env variable and a volume destination that starts with /var/run/ (I believe that the bootstrap script works like that as well):
skupper/internal/nonkube/bootstrap/install.go
Lines 80 to 104 in ce10f83
| //To mount a volume as a bind, the host path must be specified in the Name field | |
| //instead of the Source field. If the values Name/Destination are empty, volumes will be ignored, not mounted, | |
| //and the system-controller container will fail to start. | |
| mounts := []container.Volume{} | |
| volumeDestination := fmt.Sprintf("/var/run/%s.sock", platform) | |
| if strings.HasPrefix(config.containerEndpoint, "unix://") { | |
| socketPath := strings.TrimPrefix(config.containerEndpoint, "unix://") | |
| mounts = append(mounts, container.Volume{ | |
| Name: socketPath, | |
| Destination: volumeDestination, | |
| Mode: "z", | |
| RW: true, | |
| }) | |
| } else if strings.HasPrefix(config.containerEndpoint, "/") { | |
| mounts = append(mounts, container.Volume{ | |
| Name: config.containerEndpoint, | |
| Destination: volumeDestination, | |
| Mode: "z", | |
| RW: true, | |
| }) | |
| } |
If running inside the container we use the env variable instead of the mapped endpoint, I think it is not going to work as expected.
| defer h.lock.Unlock() | ||
|
|
||
| _, err := os.Stat(api.GetInternalOutputPath(h.namespace, api.RuntimeSiteStatePath)) | ||
| if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe an extra validation is needed here.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image. If not, then we need to "reload" the site.
This will be needed in cases when the system controller is updated.
By the way, we still need to handle system controller updates through the CLI.
I am not sure if it could be part of the system install command, through a flag,
or if it should be a separate command like reinstall.
Anyway, I believe we need to think about it, as it will be needed as existing sites
cannot be reloaded when using automated reloads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image.
I need some clarification, what is considered the correct image? the one specified in the env variable? Or simply check that there is a router running for that namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general comment here. The amqp Agent originally prints its log messages to its own kube-adaptor container.
Now that it is running as part of the system-controller container, the messages being logged are not qualified for the namespace, so it is hard to tell for which namespace the operations are being executed against. In example:
2026/02/06 19:08:46 CREATE io.skupper.router.tcpListener backend map[address:backend host:0.0.0.0 name:backend port:8080]
Maybe rebasing this PR on top of the main branch, which already contains the work done by @JPadovano1483 to use slog instead could help (but we'd still need to add the namespace attribute).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rebased this PR; after merging this, I'm planning to do some refactoring like using sync_router_ops for both kube-adaptor and system-adaptor, in order to not duplicate code: https://github.com/nluaces/skupper/blob/d4bc626eb701ba734864ebecf9ab3e61be0b58ee/internal/qdr/sync_router_ops.go
I can change the amqp Agent functions to add the namespace as part of this refactoring. What do you think?
…e there is automatic reloading configured
0eb07fd to
d4bc626
Compare
I checked that the name of the site was not changing when the controller just refresh (for example, when a listener has been created); do you have any examples for me to understand this better? |
Implements #2337
Branch based on #2334, the proper commits that implement this issue are bb8ab35 and the following.
flowchart TD id1([new listener resource is created])-->|triggers|InputResourceHandler-->Q{site exists?} Q -->|Yes| A[Refresh Router Config] Q -->|No| B[Bootstrap]The InputResourceHandler was modified to not bootstrap in case a site is already created in the namespace.
It also includes validations for commands start, stop and reload, that should be disabled if the dynamic changes are happening.