-
Notifications
You must be signed in to change notification settings - Fork 50
Allow for hot-reloading yaml config pipelines #2236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
It just occurred to me that this surely doesn't handle deleting a pipeline config. I suppose whatever watcher is used (filesystem, nats kv etc...) can probably trigger a delete operation of some sort, then could explicitly call edit: yes, we can just do that |
@nickchomey can you explain briefly how you actually connect NATS and Conduit, so that NATS is delivering the pipeline configs to Conduit? The team talked about this yesterday, and it appears to us that the goal is actually to be able to "refresh" a pipeline configuration or add a new one, without having to restart the Conduit service. If that's true, then we find that useful for sure and it would be a really useful feature in Conduit. However, watching a file might not be the best way, because it may trigger unnecessary updates (e.g. an editor might be auto-saving while you're working on a file, and there might be other reasons too). Automated updates can backfire pretty easily, which is why we think that the actual reload should be triggered by a user. Currently, that trigger is a service restart. We propose to have an API endpoint that a user can use to reload a pipeline configure file(s). In other words the flow would be:
The implementation would reuse most of your code; we'd just add an API trigger that executes the code you wrote. I can open a discussion where we can discuss the details. What do you think about this? Would it be helpful for you? |
@hariso Yes, that's a good point about editor auto save triggering an update - definitely something to be avoided. Your proposed endpoint seems appropriate to me. I'll leave that to you guys to figure out since I don't intend to use the yaml files. The nats api works by wrapping conduit in another app (just like when we modify the builtin connectors/processors), which has a nats client that connects to either
I had started with micro and then realized I actually wanted jetstream, but kept it around. I'll use either external nats server or perhaps even embedded server as a leaf node that connects to an external cluster (leaf node functionality not yet built) All of the options expose all of the same endpoints as the conduit http api - in fact, all you do is send something like
and the nats wrapper just calls the http api endpoint functions and uses the same validation. Again, I built that before I realized that what I really wanted was to just be able to refresh yaml configs, but kept it around. Then i added one more "endpoint" called pipeline.upsert that receives a yaml string and just calls the upsertYaml function shown in this PR. It stops and removes the pipeline if it exists, readds it, and restarts it if configured to do so. It also stores the config in nats kv, such that when the entire application is restarted (and after conduit has loaded its own api and yaml pipelines) the wrapper actually scans the kv store for pipelines and does the upsert for each one (since they haven't been persisted by conduit api, nor are loaded by the yaml file mechanism). Bringing this back to the current PR and your question, I suppose for consistency's sake, it would be ideal if the upsertyaml nats endpoint were to have an actual upsertyaml api endpoint that I could call rather than the function in this PR directly. But it would only really be useful by a wrapper app, because of the persistence and reload stuff... I'll share the nats api with you guys soon enough - I'm just trying to tidy things up so as to minimize the inevitable embarrassment from whatever dumb stuff I've done with it. But, i suppose the true solution for all of this would be to do as lovro described in #2232, and unify the api and yaml files, and add import/export endpoints which presumably would accept yaml configs? I see that issue has been added to the next milestone - does that mean it might happen soon? Sorry, this wasn't brief at all, but hopefully helpful? I'm happy to discuss any of this more here, in a GH discussion or discord. |
Description
This PR adds functionality to allow for upserting (update if exists, otherwise create) and hot-reloading (graceful shutdown of running pipeline) of yaml-config pipelines - no need to shut down the entire server anymore. This will enable the use of a file watcher to solve #1875
It also accepts a yaml string, rather than just a yaml file, which will allow for me to use yaml configs in the NATS api that I am working on (and you could add a similar endpoint for the grpc and http apis if you like), without any files at all.
I don't think this has any side effects on Conduit, and as far as I can tell it wont cause any of its own issues. However, I assume there's plenty of ways in which this could be improved - be it refactoring, allowing other existing functionality to leverage this, etc...
It does not add any api endpoints, schema etc for using this - i think that's beyond my pay grade ;)
Any feedback and guidance to improve this would be quite appreciated!
Quick checks