[pull] master from armadaproject:master #126

Closed
wants to merge 29 commits into from

@pull pull bot commented May 13, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* perform submit check

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* first pass of reverting config move

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* second pass of reverting config move

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* fix merge errors

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* added ingester logic

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* tests

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* merge master

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove submit check test

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* fix testfixtures

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* fix tests

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* better tests for submitcheck

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* add submit check test to scheduler_test

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* added option to skip submit check

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* merge master

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* don't require

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* create job rejected

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* rename to validated

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
@pull pull bot added the ⤵️ pull label May 13, 2024
MustafaI and others added 28 commits May 13, 2024 17:00
* Update goreleaser version in build.yml

* Update goreleaser version in release.yml
Set All existing jobs to be validated
* [WIP] Enforce supplying queue + jobset on cancel and tidy up

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix unit tests

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix handling of cancel jobset

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Add tests

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Remove jobRepository field - added back in merge with master

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

---------

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
* Improve correctness cancel + reprioritise by jobIds

Now only cancel/reprioritise jobs that belong to the queue/jobset of the request

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Formatting

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Formatting

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix formatting

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

---------

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
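
The scoping rule described above ("only cancel/reprioritise jobs that belong to the queue/jobset of the request") can be illustrated with a minimal sketch. This is not the Armada server code; the `Job` type and `select_jobs_for_request` helper are hypothetical names used only to show the filtering idea.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Job:
    # Hypothetical, simplified job record for illustration only.
    job_id: str
    queue: str
    job_set: str


def select_jobs_for_request(
    jobs: Iterable[Job], requested_ids: Iterable[str], queue: str, job_set: str
) -> List[Job]:
    """Keep only requested jobs that belong to the request's queue and job set."""
    requested = set(requested_ids)
    return [
        job
        for job in jobs
        if job.job_id in requested and job.queue == queue and job.job_set == job_set
    ]


jobs = [
    Job("a", "q1", "s1"),
    Job("b", "q1", "s2"),  # same queue, different job set
    Job("c", "q2", "s1"),  # different queue
]
selected = select_jobs_for_request(jobs, ["a", "b", "c"], queue="q1", job_set="s1")
print([job.job_id for job in selected])
```

Job ids that do not match the queue and job set of the request are ignored in this sketch; how the real API reports them is not shown here.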
* [WIP] Enforce supplying queue + jobset on cancel and tidy up

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Remove Redis job repository

This is no longer being used by any of the code

Currently it'll just expire JobDetails, which are no longer stored

Removing all this code is a step towards removing the need for this redis

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix unit tests

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix handling of cancel jobset

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Add tests

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Remove jobRepository field - added back in merge with master

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* gofumpt

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

---------

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
…actl (#3578)

* Require that queue and jobset are set when cancelling jobs using armadactl (#118)

* Require that queue and jobset are set when cancelling jobs using armadactl

* Update cmd/armadactl/cmd/cancel.go

Co-authored-by: James Murkin <James.Murkin@gresearch.co.uk>

---------

Co-authored-by: James Murkin <James.Murkin@gresearch.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Christopher Martin <Chris.Martin@gresearch.co.uk>
Co-authored-by: James Murkin <James.Murkin@gresearch.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
* [Lookout] Store debug message in job_run table

This is the next step towards getting the debug messages displayed in lookout

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Imports

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

---------

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
* use proper database table

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* implement cleanup

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* disallow long client ids

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* improved logging

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
* Move auth

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* lint

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
* wip

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove pulsartest goreleaser config

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove pulsar_test

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* go mod tidy

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove pulsartest cmd

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove pulsartest package

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove e2e

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
* remove e2e

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* remove e2e

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* log out update frequency

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* log number of executors loaded

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* correct vars

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* correct vars

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* force to be 1s

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* properly set env vars

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* properly set env vars

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

* revert README

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>

---------

Signed-off-by: Chris Martin <chris@cmartinit.co.uk>
Co-authored-by: Chris Martin <chris@cmartinit.co.uk>
Currently the pruner blindly deletes any job that is older than the expiry value

There are 2 issues with this:
 - Jobs that run longer than the configured expiry time disappear from Lookout before they finish running, making it harder to work out what is going on in the system
 - We have to configure the expiry to be longer than pretty much any job will ever run for, meaning we have to retain a lot of terminal jobs, which makes Lookout performance worse than it could be

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
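
To make the two issues concrete, here is a small sketch contrasting age-based pruning with one possible alternative that only expires jobs already in a terminal state. The alternative is an assumption for illustration, not a description of what this change implements; `JobRow`, `TERMINAL_STATES`, and both functions are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Assumed set of terminal states, for illustration only.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


@dataclass
class JobRow:
    job_id: str
    state: str
    submitted: datetime
    last_transition: datetime


def prune_blindly(rows: List[JobRow], now: datetime, expiry: timedelta) -> List[str]:
    """Age-based pruning: anything submitted before the cutoff is deleted."""
    return [r.job_id for r in rows if now - r.submitted > expiry]


def prune_terminal_only(rows: List[JobRow], now: datetime, expiry: timedelta) -> List[str]:
    """Assumed alternative: delete only terminal jobs that finished before the cutoff."""
    return [
        r.job_id
        for r in rows
        if r.state in TERMINAL_STATES and now - r.last_transition > expiry
    ]


now = datetime(2024, 5, 13)
expiry = timedelta(days=7)
rows = [
    # Still running after 10 days: should stay visible.
    JobRow("long-running", "RUNNING", now - timedelta(days=10), now - timedelta(days=10)),
    # Finished 8 days ago: safe to remove.
    JobRow("finished", "SUCCEEDED", now - timedelta(days=9), now - timedelta(days=8)),
]

blind = prune_blindly(rows, now, expiry)
terminal_only = prune_terminal_only(rows, now, expiry)
print(blind, terminal_only)
```

With the terminal-only rule, the expiry no longer has to exceed the longest expected job runtime, so it could be set much shorter.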
* With a gRPC stream response, a dropped connection does not raise an error; the client simply stops receiving events. This PR configures the connection in the Python client so that the client transparently attempts to reconnect if no messages were received in the last 15 minutes (the timeout is configurable).
* queue and job_set_id are both required parameters for cancel / reprioritize requests (in line with the actual Go API).
In future, we will:
* Add heartbeat messages from the Armada API, which would allow us to use a tighter timeout (currently, we indirectly depend on Utilisation event messages, sent every 5 minutes, as a proxy for a connection heartbeat).
* Reduce the Redis dependency (for non-event-stream use cases)
Fixes:
* Handles Armada event stream disconnections with configurable retry timeouts.
Breaking Changes:
* job_set_id and queue are now mandatory for the majority of API calls (in line with the latest armada-api version requirements).
Deprecates:
* Deprecates cancelling all jobs in a JobSet via the client.cancel_jobs method (use client.cancel_jobset instead).
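
The reconnect-on-idle behaviour described in these notes can be sketched without any gRPC dependency. This is a simplified illustration, not the Python client's implementation: `subscribe`, `resilient_events`, and the use of `TimeoutError` to signal "no messages within the idle timeout" are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical event shape: (message_id, payload).
Event = Tuple[str, str]


def resilient_events(
    subscribe: Callable[[Optional[str]], Iterator[Event]],
    max_reconnects: int = 3,
) -> Iterator[Event]:
    """Yield events, resubscribing from the last seen id after an idle timeout."""
    last_id: Optional[str] = None
    reconnects = 0
    while True:
        try:
            for message_id, payload in subscribe(last_id):
                last_id = message_id
                reconnects = 0  # progress was made, so reset the retry budget
                yield message_id, payload
            return  # the stream ended normally
        except TimeoutError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise


# Fake stream that goes silent once, after the first event.
EVENTS: List[Event] = [("1", "submitted"), ("2", "running"), ("3", "succeeded")]
calls: List[Optional[str]] = []


def fake_subscribe(from_id: Optional[str]) -> Iterator[Event]:
    calls.append(from_id)
    start = 0 if from_id is None else int(from_id)
    for event in EVENTS[start:]:
        yield event
        if event[0] == "1" and len(calls) == 1:
            raise TimeoutError("no messages within the idle timeout")


received = list(resilient_events(fake_subscribe))
print(received, calls)
```

Resuming from the last seen message id keeps the consumer from missing or re-reading events across a reconnect; a heartbeat, as mentioned in the future work, would let the idle timeout be much shorter than 15 minutes.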
* Use read-only (default) database connection for Lookout
* Removing deprecated commands, making armadaUrl flag work as expected

* Converting armadactl flags to kebab-case

* Cleaning up CLI, cleaning up CLI docs, moving reporting commands, separating out reprioritize commands

* Moving resources to their own directory

* Removing capitalisation

* This is an empty commit in order to trigger PR check re-run

* This is an empty commit in order to trigger PR check re-run

* Modifying cancel, preempt to use verb-noun structure. Modified cancel jobSet to use correct Armada API endpoint.

* Making reprioritize command require queue and job-set

* Updating documentation

* Tidying go.mod file, linting

Co-authored-by: Mustafa Ilyas <Mustafa.Ilyas@gresearch.co.uk>
* Update python client to include preempt_jobs

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Support ssl in example

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Formatting

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* Fix asyncio client

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>

* regenerated docs

---------

Signed-off-by: JamesMurkin <jamesmurkin@hotmail.com>
Co-authored-by: Martynas Asipauskas <me@martynas.co.uk>
New Features:
- Add support for preempting jobs `client.preempt_jobs`
* Adding migration to drop annotations table and unused indexes on the Jobs table

* Removing user_annotation_lookup from lookout pruner

---------

Co-authored-by: Mustafa <mustafa.ilyas@gresearch.co.uk>
@warmchang warmchang closed this May 24, 2024
6 participants