Allow for hot-reloading yaml config pipelines #2236

nickchomey · 2025-04-04T01:38:08Z

Description

This PR adds functionality to allow for upserting (update if exists, otherwise create) and hot-reloading (graceful shutdown of running pipeline) of yaml-config pipelines - no need to shut down the entire server anymore. This will enable the use of a file watcher to solve #1875

It also accepts a yaml string, rather than just a yaml file, which will allow for me to use yaml configs in the NATS api that I am working on (and you could add a similar endpoint for the grpc and http apis if you like), without any files at all.

I don't think this has any side effects on Conduit, and as far as I can tell it wont cause any of its own issues. However, I assume there's plenty of ways in which this could be improved - be it refactoring, allowing other existing functionality to leverage this, etc...

It does not add any api endpoints, schema etc for using this - i think that's beyond my pay grade ;)

Any feedback and guidance to improve this would be quite appreciated!

Quick checks

I have followed the Code Guidelines.
There is no other pull request for the same update/change.
I have written unit tests.
I have made sure that the PR is of reasonable size and can be easily reviewed.

pkg/provisioning/service_test.go

nickchomey · 2025-04-06T00:59:30Z

It just occurred to me that this surely doesn't handle deleting a pipeline config.

I suppose whatever watcher is used (filesystem, nats kv etc...) can probably trigger a delete operation of some sort, then could explicitly call s.Delete rather than UpsertYaml?

edit: yes, we can just do that

hariso · 2025-04-24T11:36:57Z

@nickchomey can you explain briefly how you actually connect NATS and Conduit, so that NATS is delivering the pipeline configs to Conduit?

The team talked about this yesterday, and it appears to us that the goal is actually to be able to "refresh" a pipeline configuration or add a new one, without having to restart the Conduit service. If that's true, then we find that useful for sure and it would be a really useful feature in Conduit.

However, watching a file might not be the best way, because it may trigger unnecessary updates (e.g. an editor might be auto-saving while you're working on a file, and there might be other reasons too). Automated updates can backfire pretty easily, which is why we think that the actual reload should be triggered by a user. Currently, that trigger is a service restart. We propose to have an API endpoint that a user can use to reload a pipeline configure file(s).

In other words the flow would be:

A pipeline configuration file is added/updated/removed.
User calls /v1/pipelines/reload (just an example) that reloads either all or just targeted pipelines.

The implementation would reuse most of your code; we'd just add an API trigger that executes the code you wrote.

I can open a discussion where we can discuss the details. What do you think about this? Would it be helpful for you?

nickchomey · 2025-04-24T12:13:54Z

@hariso
Glad to hear there's interest in this!

Yes, that's a good point about editor auto save triggering an update - definitely something to be avoided. Your proposed endpoint seems appropriate to me. I'll leave that to you guys to figure out since I don't intend to use the yaml files.

The nats api works by wrapping conduit in another app (just like when we modify the builtin connectors/processors), which has a nats client that connects to either

Embedded nats server via the nats micro api
embedded nats server via jetstream (allows persistence)
External nats + micro
external nats + jetstream

I had started with micro and then realized I actually wanted jetstream, but kept it around. I'll use either external nats server or perhaps even embedded server as a leaf node that connects to an external cluster (leaf node functionality not yet built)

All of the options expose all of the same endpoints as the conduit http api - in fact, all you do is send something like

nats pub pipeline.create "{same json payload as conduit api}"

and the nats wrapper just calls the http api endpoint functions and uses the same validation.

Again, I built that before I realized that what I really wanted was to just be able to refresh yaml configs, but kept it around.

Then i added one more "endpoint" called pipeline.upsert that receives a yaml string and just calls the upsertYaml function shown in this PR. It stops and removes the pipeline if it exists, readds it, and restarts it if configured to do so. It also stores the config in nats kv, such that when the entire application is restarted (and after conduit has loaded its own api and yaml pipelines) the wrapper actually scans the kv store for pipelines and does the upsert for each one (since they haven't been persisted by conduit api, nor are loaded by the yaml file mechanism).

Bringing this back to the current PR and your question, I suppose for consistency's sake, it would be ideal if the upsertyaml nats endpoint were to have an actual upsertyaml api endpoint that I could call rather than the function in this PR directly. But it would only really be useful by a wrapper app, because of the persistence and reload stuff...

I'll share the nats api with you guys soon enough - I'm just trying to tidy things up so as to minimize the inevitable embarrassment from whatever dumb stuff I've done with it.

But, i suppose the true solution for all of this would be to do as lovro described in #2232, and unify the api and yaml files, and add import/export endpoints which presumably would accept yaml configs? I see that issue has been added to the next milestone - does that mean it might happen soon?

Sorry, this wasn't brief at all, but hopefully helpful?

I'm happy to discuss any of this more here, in a GH discussion or discord.

nickchomey requested a review from a team as a code owner April 4, 2025 01:38

nickchomey marked this pull request as draft April 4, 2025 02:16

nickchomey marked this pull request as ready for review April 4, 2025 03:12

nickchomey commented Apr 4, 2025

View reviewed changes

pkg/provisioning/service_test.go Outdated Show resolved Hide resolved

nickchomey added 3 commits April 4, 2025 11:35

add functionlity to hot reload yaml configs

a643c2c

change reloadYaml to upsertYaml

807e085

tweaks to test

4c30e9a

nickchomey force-pushed the reloadYaml branch from e2cae2b to 4c30e9a Compare April 4, 2025 17:35

upsert multiple pipelines in single config

589d265

nickchomey mentioned this pull request Apr 17, 2025

Feature: Implement ImportPipeline and ExportPipeline functions in the API #2232

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow for hot-reloading yaml config pipelines #2236

Allow for hot-reloading yaml config pipelines #2236

nickchomey commented Apr 4, 2025 •

edited

Loading

nickchomey commented Apr 6, 2025 •

edited

Loading

hariso commented Apr 24, 2025 •

edited

Loading

nickchomey commented Apr 24, 2025 •

edited

Loading

Allow for hot-reloading yaml config pipelines #2236

Are you sure you want to change the base?

Allow for hot-reloading yaml config pipelines #2236

Conversation

nickchomey commented Apr 4, 2025 • edited Loading

Description

Quick checks

nickchomey commented Apr 6, 2025 • edited Loading

hariso commented Apr 24, 2025 • edited Loading

nickchomey commented Apr 24, 2025 • edited Loading

nickchomey commented Apr 4, 2025 •

edited

Loading

nickchomey commented Apr 6, 2025 •

edited

Loading

hariso commented Apr 24, 2025 •

edited

Loading

nickchomey commented Apr 24, 2025 •

edited

Loading