Skip to content

Conversation

@nluaces
Copy link
Member

@nluaces nluaces commented Jan 18, 2026

Implements #2337
Branch based on #2334, the proper commits that implement this issue are bb8ab35 and the following.

flowchart TD
   id1([new listener resource is created])-->|triggers|InputResourceHandler-->Q{site exists?}
   Q -->|Yes| A[Refresh Router Config]
   Q -->|No| B[Bootstrap]
Loading
flowchart LR
  SystemAdaptorHandler-->|has a callback|RouterStatusHandler
  SystemAdaptorHandler-->|spawns|processRouterConfig-->id2([reconciliates the router config file with the router])
Loading

The InputResourceHandler was modified to not bootstrap in case a site is already created in the namespace.
It also includes validations for commands start, stop and reload, that should be disabled if the dynamic changes are happening.

@nluaces nluaces self-assigned this Jan 18, 2026
@nluaces nluaces changed the title Enable dynamic changes in the router for non-kube environments (wip) Enable dynamic changes in the router for non-kube environments Feb 1, 2026
@nluaces nluaces marked this pull request as ready for review February 1, 2026 17:31
Copy link
Member

@fgiorgetti fgiorgetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My impression so far is that it is on the right track.
Really nice work @nluaces!

A few other comments I have so far:

  • The site name in the router config is changing (we need to check if this will cause issues to the console or vanflow metrics)
  • As I already added, the system controller update and the update of the router container for the controlled namespaces is something we need to figure out still

endpoint = "unix:///var/run/podman.sock"
if s.Platform == "docker" {
endpoint = "unix:///run/docker.sock"
endpoint = "unix:///var/run/docker.sock"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the original one is still correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// the container endpoint is mapped to the podman socket inside the container
if api.IsRunningInContainer() {
endpoint = "unix:///var/run/podman.sock"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If running in a container, the endpoint should be provided through:

os.Getenv("CONTAINER_ENDPOINT")

Then, if the endpoint is not set, you can use the defaults you have below.

Maybe we could have a function under pkg/nonkube/api/environment.go that takes the platform as an argument and returns the correct container endpoint, eliminating the dups we have (I should have done that earlier).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When skupper system install runs, it creates a volume bind between with CONTAINER_ENDPOINT env variable and a volume destination that starts with /var/run/ (I believe that the bootstrap script works like that as well):

//To mount a volume as a bind, the host path must be specified in the Name field
//instead of the Source field. If the values Name/Destination are empty, volumes will be ignored, not mounted,
//and the system-controller container will fail to start.
mounts := []container.Volume{}
volumeDestination := fmt.Sprintf("/var/run/%s.sock", platform)
if strings.HasPrefix(config.containerEndpoint, "unix://") {
socketPath := strings.TrimPrefix(config.containerEndpoint, "unix://")
mounts = append(mounts, container.Volume{
Name: socketPath,
Destination: volumeDestination,
Mode: "z",
RW: true,
})
} else if strings.HasPrefix(config.containerEndpoint, "/") {
mounts = append(mounts, container.Volume{
Name: config.containerEndpoint,
Destination: volumeDestination,
Mode: "z",
RW: true,
})
}

If running inside the container we use the env variable instead of the mapped endpoint, I think it is not going to work as expected.

defer h.lock.Unlock()

_, err := os.Stat(api.GetInternalOutputPath(h.namespace, api.RuntimeSiteStatePath))
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe an extra validation is needed here.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image. If not, then we need to "reload" the site.

This will be needed in cases when the system controller is updated.

By the way, we still need to handle system controller updates through the CLI.
I am not sure if it could be part of the system install command, through a flag,
or if it should be a separate command like reinstall.

Anyway, I believe we need to think about it, as it will be needed as existing sites
cannot be reloaded when using automated reloads.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image.

I need some clarification, what is considered the correct image? the one specified in the env variable? Or simply check that there is a router running for that namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general comment here. The amqp Agent originally prints its log messages to its own kube-adaptor container.

Now that it is running as part of the system-controller container, the messages being logged are not qualified for the namespace, so it is hard to tell for which namespace the operations are being executed against. In example:

2026/02/06 19:08:46 CREATE io.skupper.router.tcpListener backend map[address:backend host:0.0.0.0 name:backend port:8080]

Maybe rebasing this PR on top of the main branch, which already contains the work done by @JPadovano1483 to use slog instead could help (but we'd still need to add the namespace attribute).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rebased this PR; after merging this, I'm planning to do some refactoring like using sync_router_ops for both kube-adaptor and system-adaptor, in order to not duplicate code: https://github.com/nluaces/skupper/blob/d4bc626eb701ba734864ebecf9ab3e61be0b58ee/internal/qdr/sync_router_ops.go

I can change the amqp Agent functions to add the namespace as part of this refactoring. What do you think?

@nluaces nluaces force-pushed the 2337-system-adaptor-process branch from 0eb07fd to d4bc626 Compare February 10, 2026 10:22
@nluaces
Copy link
Member Author

nluaces commented Feb 10, 2026

  • The site name in the router config is changing (we need to check if this will cause issues to the console or vanflow metrics)

I checked that the name of the site was not changing when the controller just refresh (for example, when a listener has been created); do you have any examples for me to understand this better?

@nluaces nluaces requested a review from fgiorgetti February 10, 2026 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants