Move flux-sched go bindings out of tree #1138
Comments
Where does this stand, @vsoch?
The first two bullets are done - we have (and use) fluxion-go from here: https://github.com/flux-framework/fluxion-go. That also does automated releases with flux-sched here. The third bullet is not up to me to decide, but either way:
Pinging @milroy for his thoughts. At least as our current Go projects stand, I'm using flux-sched for the .so builds, but for the actual bindings I'm using fluxion-go.
@trws can we talk about this at a future meeting? There are several improvements that are warranted for the go bindings, and I'd rather open issues / do work on one repository. For example:
The additional reason the above would be important is that I'd like to do the same for the queue in Go, and then have a fluxion-service, and extend that to a qmanager-service for Go. The reasons are twofold (aside from this being immensely fun and I'm learning a lot):
Anyhoo, let's chat about it! The second point is really important I think - I see that Flux has a future (a much larger footprint than it has now) in cloud, but we need to enable those developers to have handles to the different components.
Edit: fixing formatting. Definitely we can discuss this. Also it sounds like the bindings are already functionally somewhere else and should probably be removed after a CI update to pull and test them on PR here. Some questions/responses below.
This should not be true. We have an "allocate_or_satisfiable" op for this, it's what both FCFS and backfill after reservation depth use. Is it missing from the reapi?
Not sure why this makes sense for C++. I'm guessing this is actually a side-effect of flattening the C++ interface to C and then binding to Go, instead of defining a rich API in C++ and exposing it to Go equivalently. C++ prefers an object just as much as Go does; would love to do something about this.
Does defer fluxion.ctx.Destroy() not work, or is this to conform to a resource management API?
Ok, shrink we can definitely handle in fluxion. Grow is a bigger discussion, I'd be happy to see an issue or discussion on that separately, in fact I think there is one from a looooong time ago if you want to reference or resurrect it.
You mean you want to be able to call qmanager? I think that would be a good thing FWIW, and should be relatively easy to bind up if we do it right.
I agree. As a word of caution though, I don't want us to fall into the classic plugin API trap, where we expose so much, so freely, that we lose the ability to change anything (see Firefox pre-web-extension plugins). Still want to do it, but want to be clear about what is and what isn't stable or public API surface.
We do have that (and that's what I'm using). I think I'd rather have something that is more along the lines of an enum so I can check one thing, and clearly, as opposed to two. E.g., I have to check that the job is not satisfiable by checking that allocated is false and satisfiable is false. At least for the exposed interface, it should be possible to have different states and then do something like: if status.State == fluxion.Unsatisfiable {} At least that reads much more nicely in the code, etc.
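To make the enum idea concrete, here is a minimal sketch of how the two booleans could be collapsed into a single state. The names `MatchState`, `StateFrom`, and the three constants are hypothetical and are not part of fluxion-go; the sketch only assumes that a match result reports "allocated" and "satisfiable" as separate booleans, as described above.

```go
package main

import "fmt"

// MatchState is a hypothetical enum summarizing a match result.
// It is NOT part of the fluxion-go API; it only illustrates the idea.
type MatchState int

const (
	Allocated     MatchState = iota // resources were allocated
	Satisfiable                     // not allocated now, but could run later
	Unsatisfiable                   // can never run on this resource graph
)

// String makes the state readable in logs.
func (s MatchState) String() string {
	switch s {
	case Allocated:
		return "Allocated"
	case Satisfiable:
		return "Satisfiable"
	case Unsatisfiable:
		return "Unsatisfiable"
	default:
		return "Unknown"
	}
}

// StateFrom collapses the two booleans into one state so that
// callers check a single value instead of two.
func StateFrom(allocated, satisfiable bool) MatchState {
	switch {
	case allocated:
		return Allocated
	case satisfiable:
		return Satisfiable
	default:
		return Unsatisfiable
	}
}

func main() {
	state := StateFrom(false, false)
	if state == Unsatisfiable {
		fmt.Println("job state:", state)
	}
}
```

With something like this, the caller-side check reads as one comparison, and a `switch` over the state covers all three outcomes explicitly.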
It does work! I just didn't know about it. I think it's mostly just the UI - I'd like to expose
// Close properly closes (destroys) the fluxion client handle
func (fluxion *Fluxion) Close() {
	fluxion.cli.Destroy()
}
and then it is called directly when the grpc server exits
err = server.Serve(lis)
if err != nil {
	fmt.Printf("[GRPCServer] failed to serve: %v\n", err)
}
flux.Close()
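For comparison, here is a rough sketch of the same cleanup expressed with `defer`, which is what the earlier question was getting at. The `handle` type and the `serve` and `run` functions are stand-ins invented for this example (the real client and gRPC server are not shown here); only the shape of the `Close` wrapper follows the snippet above.

```go
package main

import (
	"errors"
	"fmt"
)

// handle is a stand-in for the underlying fluxion client (hypothetical).
type handle struct{ destroyed bool }

// Destroy marks the stand-in handle as released.
func (h *handle) Destroy() { h.destroyed = true }

// Fluxion wraps the client handle, mirroring the wrapper in the comment above.
type Fluxion struct{ cli *handle }

// Close properly closes (destroys) the fluxion client handle.
func (f *Fluxion) Close() { f.cli.Destroy() }

// serve is a placeholder for server.Serve(lis); it returns once the
// server stops, here always with an error for demonstration.
func serve() error { return errors.New("listener closed") }

// run defers Close so the handle is released on every return path,
// including the error path.
func run(f *Fluxion) error {
	defer f.Close()
	if err := serve(); err != nil {
		return fmt.Errorf("[GRPCServer] failed to serve: %w", err)
	}
	return nil
}

func main() {
	f := &Fluxion{cli: &handle{}}
	err := run(f)
	fmt.Println("error:", err)
	fmt.Println("handle destroyed:", f.cli.destroyed)
}
```

The behavior is the same as calling `Close()` explicitly after `Serve` returns; the deferred form just keeps the cleanup next to the point where the handle comes into scope.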
No, that would be too much burden on the flux devs. I just want to take what I've built and move it external to Kubernetes so others can use it for experiments (myself included). I can do that entirely independently without asking for flux devs time.
Anything I develop I will take maintainership responsibility for, and (at least beyond the official Go API for fluxion, for example, which should be stable as it mirrors C++ fluxion) I don't plan to make any long term promises.
And @trws one more question for you I was thinking about just now - does qmanager hold some memory of jobs until eternity? I know I can easily see finished job stats, but I don't see that logic with qmanager, so I'm guessing that belongs somewhere with kvs / job manager? Apologies I haven't read the code yet - been doing nothing most of today and kind of digging that too :P Happy Sunday!
Sorry, I lost the thread on this. qmanager does not hold onto jobs persistently except in certain extreme circumstances where things go wrong enough that we need to know about it. The job-manager in core is what holds onto job data persistently through the KVS, so it gets persisted to sqlite and disk rather than holding everything in memory.
As discussed in slack, we want to develop the fluxion Go bindings in a separate repository from fluxion. I'm opening an issue here for tracking and discussion outside of slack, where everything is eventually eaten by the slackbot monster. 👹 The reasons are the following (summarized from slack):
This needs to happen sooner rather than later, before there are lots of new Go projects using the bindings. The changes are fairly simple - all we really need is to expose a shared library for fluxion that Go can use, and then hugely simplify the entire current project. The testing will be much easier, as will developing. The steps we will pursue are described in the task list below (wanted to try this out)!
Aside from task 1, I'll handle 2-3. And for the repository - I'll want to have admin permission to write/manage flux-framework/fluxion-go, if one of the fearless leaders of flux could make that for me (ping @grondo @garlick @trws)! No rush, because we likely will need a few days for bullet 1, and of course I can do a lot of development without the repository! This should be useful and exciting work, and help us move forward with Go projects that use fluxion more easily.
Tasks