e2e: refactor metrics test to use NSD and WI #19022
Spot check against e2e
Great work! It seems like a nice mix of task drivers. Would it make sense to use Docker for one of the Podman tasks to add even more flavours?
http:// {
  {{ $allocID := env "NOMAD_ALLOC_ID" -}}
  {{ range nomadService 1 $allocID "prometheus" }}
  reverse_proxy {{ .Address }}:{{ .Port }}
  {{ end }}
}
Suggested change:
http:// {
  reverse_proxy {{ range nomadService 1 $allocID "prometheus" }}{{ .Address }}:{{ .Port }} {{end}}
}
Not that it matters much in this case, but we could let Caddy handle the load balancing. I'm also not too familiar with Caddyfiles, so I hope this is correct 😅
I gave this a try and it failed ... probably not going to spend the time to investigate, so many other test failures to look into 😓
e2e/metrics/input/cpustress.hcl
config {
  command = "stress"
  args = ["--cpu", "1", ]
Suggested change:
  args = ["--cpu", "1"]
}

task "cpustress" {
  driver = "pledge"
Nice!
# run a private holepunch instance in this group network
# so prometheus can access the nomad api for service disco
Nice x2
Yeah for sure, we can look into expanding this, especially once we get the Windows client back.
This PR overhauls the metrics e2e suite, which has been failing for about a year. It swaps out Consul in favor of Nomad native service discovery (NSD) and workload identity (WI).
Basically it runs a handful of jobs that produce metrics, then uses Prometheus to gather those metrics. Caddy exposes the Prometheus API so that it is reachable from the test runner. The little nomad-holepunch tool runs as a sidecar so that Prometheus can access the Nomad API using workload identity. It also runs as a service job to represent each Nomad client in the Nomad service registry.
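For readers unfamiliar with the two features named above, here is a minimal sketch of what a group using Nomad native service discovery and workload identity can look like. It is not taken from the job files in this PR: the job, group, and port names, the image, and the port number are all assumptions chosen for illustration. Only the `provider = "nomad"` service setting and the `identity` block are the pieces the description refers to.

```hcl
# Hypothetical sketch, not the job files from this PR.
job "example" {
  group "example" {
    network {
      mode = "bridge"
      port "http" {
        to = 9090 # assumed Prometheus listen port
      }
    }

    # Register the service in Nomad's own service registry
    # instead of Consul.
    service {
      name     = "prometheus"
      provider = "nomad"
      port     = "http"
    }

    task "prometheus" {
      driver = "docker"

      config {
        image = "prom/prometheus:latest" # assumed image
        ports = ["http"]
      }

      # Workload identity: expose the task's identity token to the
      # task as the NOMAD_TOKEN environment variable.
      identity {
        env = true
      }
    }
  }
}
```

A service registered this way is what the `nomadService` template function in the Caddyfile snippet earlier in the thread looks up by name.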