Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for 'LeaseRenewalThreshold' in vault agent #25212

Conversation

kevinschoonover
Copy link
Contributor

The consul-template vault stanza supports 'LeaseRenewalThreshold' parameter

  # The fraction of the lease duration of a non-renewable secret Consul
  # Template will wait for. This is used to calculate the sleep duration for
  # rechecking a Vault secret value. This field is optional and will default to
  # 90% of the lease time.
  lease_renewal_threshold = 0.90

which during my testing wasn't being applied the the internal vault-agent consul-template manager. This PR will pass the LeaseRenewalThreshold to consul-template.

@hghaf099 hghaf099 added the agent label Feb 5, 2024
@hghaf099
Copy link
Contributor

hghaf099 commented Feb 5, 2024

Thanks for your contribution. To better understand the requirement here, would you please elaborate on the usecase that this change supports? Note that the default value of the LeaseRenewalThreshold is being set at the config.Finalize call here. Having that added to the agent config as done here may require to add the related parts to the Vault parser as well so that it could be controllable by the operator. So, I am unclear as to what is the reason/usecase that this would be needed.

@kevinschoonover
Copy link
Contributor Author

Hey @hghaf099 !

I am planning to use this in conjunction with hashicorp/consul-template#1865 (and a follow up PR I will have after its merged) to rotate certificates earlier than 90% of the ttl. I have short lived certificates which last 7 days and the 90% interval means they are rotated with ~11 hours left. This leaves little time for me to alert or fix any issue that happen when a certificate expires.

Let me know if that helps or if I can provide more information.

ClientKey string `hcl:"client_key"`
TLSServerName string `hcl:"tls_server_name"`
Retry *Retry `hcl:"retry"`
LeaseRenewalThreshold float64 `hcl:"lease_renewal_threshold"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this going to be initialized? Please refer to my previous comment for this part. As is, it is never gonna be set. This Vault config is populated by either whatever is in the environment, or passed in the agent config or the command line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is this will be initialized when reading the below configuration and this is the only behavior I need for my use case (but can update the PR to allow it to be specified with the environment variables.

I did update the code to allow LeaseRenewalThreshold which corresponds to the consul-template behavior of using the default value when its nil so if its not specified it will default to .90

vault {
  address     = "https://vault.service.hetzner-hil.consul:8200"
  ca_cert     = "/etc/nomad/tls/nomad.ca.pem"
  client_cert = "/etc/nomad/tls/nomad.cert.pem"
  client_key  = "/etc/nomad/tls/nomad.key.pem"

  # Renew secrets at 60% of their lease duration 
  lease_renewal_threshold = 0.60
}

auto_auth {
  method "cert" {
    config = {
      ca_cert     = "/etc/nomad/tls/nomad.ca.pem"
      client_cert = "/etc/nomad/tls/nomad.cert.pem"
      client_key  = "/etc/nomad/tls/nomad.key.pem"
      reload      = true
    }
  }
}

listener "tcp" {
  address = "127.0.0.1:9202"
  tls_disable = true

  profiling {
    unauthenticated_pprof_access = true
    unauthenticated_in_flight_request_access = true
  }
}

cache {
  use_auto_auth_token = true
}

template_config {
  exit_on_retry_failure = true
}

telemetry {
  prometheus_retention_time = "24h"
  disable_hostname = true
  telemetry {
    unauthenticated_metrics_access = true
  }
}

...
template {
  source          = "/etc/vault-agent.d/nomad.cert.pem.tpl"
  destination     = "/etc/nomad/tls/nomad.dummy.consul-template"
  backup          = true
  # ../configure_vault_agent.yml#L8
  command         = "systemctl restart nomad && systemctl reload nomad-vault-agent"
  command_timeout = "95s"
  left_delimiter  = "<%"
  right_delimiter = "%>"

  perms = "0700"
}

@VioletHynes
Copy link
Contributor

VioletHynes commented Feb 23, 2024

Hey there @kevinschoonover ! Apologies this PR sat for a little bit. I'd love to merge this functionality, with a few caveats that I'd like to see before I approve and merge:

  • This config should be moved to the template_config stanza, since it only affects consul-template's rendering behaviour (this would avoid confusion about cache renewals)
  • I'd like to see at least config test showing that it's set properly
  • It'd be great if you could add docs, too. This would go in template.mdx (near the other template_config stanza bits) and might look something like this:
- `lease_renewal_threshold ` `(float: 0.9)` - The threshold fraction of the lease duration
 of a non-renewable secret that Vault Agent's template engine will wait for before re-rendering
the secret.

I'll try and watch this closely and keep my eye on it as it gets updates to hopefully shepherd it in, but feel free to tag me directly if I'm slow.

@VioletHynes
Copy link
Contributor

I've also kicked off the CI tests/builds/etc for the PR. Give me a shout if you have any questions!

@VioletHynes VioletHynes self-requested a review February 23, 2024 21:01
@kevinschoonover kevinschoonover force-pushed the kevinschoonover/vault-agent-renew-threshold branch from 9d23bb3 to 9ae7467 Compare February 24, 2024 08:03
@kevinschoonover kevinschoonover requested a review from a team as a code owner February 24, 2024 08:03
@kevinschoonover
Copy link
Contributor Author

kevinschoonover commented Feb 24, 2024

I've also kicked off the CI tests/builds/etc for the PR. Give me a shout if you have any questions!

Thanks for the comments @VioletHynes! The latest commit should address what you mentioned above.

@VioletHynes
Copy link
Contributor

Thanks a tonne @kevinschoonover ! I'm going to add a changelog, and give this another look over with the team but I'm hoping to merge this today.

@mladlow mladlow merged commit 19aeaa5 into hashicorp:main Feb 26, 2024
66 of 68 checks passed
@VioletHynes VioletHynes added this to the 1.16.0 milestone Mar 25, 2024
@kevinschoonover kevinschoonover deleted the kevinschoonover/vault-agent-renew-threshold branch April 7, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants