You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Rotating the client certificate/key used in a Consul secret backend for Vault does not seem possible when Vault itself is used to bootstrap the Consul ACL system. The /{mount}/config/access endpoint which is used to set the certificate/key will overwrite all fields, so when we send an updated cert without the bootstrap token, Vault assumes that we want to bootstrap the ACL system of the associated Consul cluster, which fails since Consul says No. We don't have the token because Vault swallows it when it issues the bootstrap command to Consul, and we shouldn't need to know it since Vault is in charge of the process and should have it stored somewhere.
This failure looks like:
vault_pki_secret_backend_cert.consul-dev-client-cert: Destroying... [id=internalca/vault-consul-dev/consul-dev-client.vault.dev.supercompany.com]
vault_pki_secret_backend_cert.consul-dev-client-cert: Destruction complete after 0s
vault_pki_secret_backend_cert.consul-dev-client-cert: Creating...
vault_pki_secret_backend_cert.consul-dev-client-cert: Creation complete after 1s [id=internalca/vault-consul-dev/consul-dev-client.vault.dev.supercompany.com]
vault_consul_secret_backend.consul-dev: Modifying... [id=consul-dev]
╷
│ Error: error configuring Consul configuration for "consul-dev": Error making API request.
│
│ URL: PUT https://vault.dev.supercompany.com:8200/v1/consul-dev/config/access
│ Code: 400. Errors:
│
│ * Token not provided and failed to bootstrap ACLs: Unexpected response code: 403 (Permission denied: rpc error making call: ACL bootstrap no longer allowed (reset index: 12345))
│
│ with vault_consul_secret_backend.consul-dev,
│ on consul-dev.tf line 32, in resource "vault_consul_secret_backend" "consul-dev":
│ 32: resource "vault_consul_secret_backend" "consul-dev" {
│
╵
The above did not actually result in a failure immediately, however. The old certificate was deleted, the new one was generated, but the consul secrets engine was not yet aware of the change. After the certificate actually expired, communication between Consul and Vault stopped. This failure looked like so:
Vault error occurred: Put "https://consul.dev.supercompany.com:8501/v1/acl/token": remote error: tls: bad certificate, on get https://vault.dev.supercompany.com:8200/v1/consul-dev/creds/consul-server
To Reproduce
Steps to reproduce the behavior:
Create docker network
a. docker network create --driver bridge bstok
Launch Vault container
a. docker run -dit --name vault.superveryreallydefinitelybogus.com --network bstok -p 8200:8200 --cap-add=IPC_LOCK -e 'VAULT_DEV_ROOT_TOKEN_ID=myroot' -e 'VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200' hashicorp/vault
Create sample PKI infrastructure using Terraform
a. export VAULT_ADDR=http://localhost:8200
b. export VAULT_TOKEN=myroot
c. mkdir -p bstok/terraform bstok/docker/consul/config/tls
d.
Rotate client cert used with Vault<->Consul
a. terraform plan -replace vault_pki_secret_backend_cert.vault-consul -out=myplan
i. Should see '# vault_consul_secret_backend.consul will be updated in-place' with a new cert and key
Should see '# vault_pki_secret_backend_cert.vault-consul must be replaced'
b. terraform apply myplan
i. boom :(
% terraform apply myplan
vault_pki_secret_backend_cert.vault-consul: Destroying... [id=pki_intermediate/server_role/consul.superveryreallydefinitelybogus.com]
vault_pki_secret_backend_cert.vault-consul: Destruction complete after 0s
vault_pki_secret_backend_role.server_role: Modifying... [id=pki_intermediate/roles/server_role]
vault_pki_secret_backend_role.server_role: Modifications complete after 0s [id=pki_intermediate/roles/server_role]
vault_pki_secret_backend_cert.vault-consul: Creating...
vault_pki_secret_backend_cert.vault-consul: Creation complete after 3s [id=pki_intermediate/server_role/consul.superveryreallydefinitelybogus.com]
vault_consul_secret_backend.consul: Modifying... [id=consul]
╷
│ Error: error configuring Consul configuration for "consul": Error making API request.
│
│ URL: PUT http://localhost:8200/v1/consul/config/access
│ Code: 400. Errors:
│
│ * Token not provided and failed to bootstrap ACLs: Unexpected response code: 403 (Permission denied: ACL bootstrap no longer allowed (reset index: 23))
│
│ with vault_consul_secret_backend.consul,
│ on main.tf line 125, in resource "vault_consul_secret_backend" "consul":
│ 125: resource "vault_consul_secret_backend" "consul" {
Expected behavior
We expected just the new client certificate+key to be set for the secret backend.
Environment:
Vault Server Version: 1.14.1
Vault CLI Version: 1.14.1
Server Operating System/Architecture: Debian 11
Vault server configuration file(s): Can reproduce with bare dev container as shown above
Additional context
I have to believe that we are doing something incorrectly here, this seems like a big oversight?
Probable related report: #9056
The text was updated successfully, but these errors were encountered:
Hi there - I asked our resident Vault/Consul expert and she said that if you're in need of workarounds, this might help. In the meantime I'll bring this to our engineering teams as well. Thanks!
Thanks for the update. We kinda-sorta did something similar to recover. We turned off verify_incoming for the consul server nodes, requested a new global-management token from Vault, logged in to the UI with that token, dug around in the issued tokens until we happened to find the token that was initially issued for bootstrap, put that token at a path in Vault, and then use data.vault_generic_secret to pull that token from Vault for use within the vault_consul_secret_backend. Then we turned verify_incoming back on. However, I do like your idea better so I'll play with it in our lab.
Thanks for bringing this up to the engineering team as well as the Vault/Consul expert. I'll remain subscribed to this issue for future updates.
Describe the bug
Rotating the client certificate/key used in a Consul secret backend for Vault does not seem possible when Vault itself is used to bootstrap the Consul ACL system. The
/{mount}/config/access
endpoint which is used to set the certificate/key will overwrite all fields, so when we send an updated cert without the bootstrap token, Vault assumes that we want to bootstrap the ACL system of the associated Consul cluster, which fails since Consul says No. We don't have the token because Vault swallows it when it issues the bootstrap command to Consul, and we shouldn't need to know it since Vault is in charge of the process and should have it stored somewhere.This failure looks like:
The above did not actually result in a failure immediately, however. The old certificate was deleted, the new one was generated, but the consul secrets engine was not yet aware of the change. After the certificate actually expired, communication between Consul and Vault stopped. This failure looked like so:
To Reproduce
Steps to reproduce the behavior:
Create docker network
a.
docker network create --driver bridge bstok
Launch Vault container
a.
docker run -dit --name vault.superveryreallydefinitelybogus.com --network bstok -p 8200:8200 --cap-add=IPC_LOCK -e 'VAULT_DEV_ROOT_TOKEN_ID=myroot' -e 'VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200' hashicorp/vault
Create sample PKI infrastructure using Terraform
a.
export VAULT_ADDR=http://localhost:8200
b.
export VAULT_TOKEN=myroot
c.
mkdir -p bstok/terraform bstok/docker/consul/config/tls
d.
e.
cd bstok/terraform
f.
terraform init
g.
terraform plan -out=myplan
h.
terraform apply myplan
Request certs for Consul
a.
cd ..
b.
vault write -format=json pki_intermediate/issue/server_role common_name="consul.superveryreallydefinitelybogus.com" uri_sans="server.dc1.consul" > certout.json
c.
cat certout.json | jq -r '.data.private_key' > docker/consul/config/tls/cert.key
d.
cat certout.json | jq -r '.data.ca_chain[0]' > docker/consul/config/tls/ca.crt
e.
cat certout.json | jq -r '.data.certificate' > docker/consul/config/tls/cert.crt
f.
cat docker/consul/config/tls/cert.crt docker/consul/config/tls/ca.crt > docker/consul/config/tls/certcombined.crt
Create small consul config file
a.
Launch consul container
a.
docker run -dit --name consul.superveryreallydefinitelybogus.com --network bstok -p 8300:8300 -p 8501:8501 -p 8600:8600/udp -v $(pwd)/docker/consul:/consul hashicorp/consul:1.15.10 agent -server -bootstrap -ui -client=0.0.0.0
Validate that Consul is running
a.
docker exec -it consul.superveryreallydefinitelybogus.com consul members
Uncomment Vault<->Consul backend creation in TF for Vault
a.
cd terraform
b.
sed -i.commented 's/^##//' main.tf
Create Vault<->Consul backend
a.
terraform plan -out=myplan
b.
terraform apply myplan
i. Should see from TF and Consul logs
a.
terraform plan -replace vault_pki_secret_backend_cert.vault-consul -out=myplan
i. Should see
'# vault_consul_secret_backend.consul will be updated in-place'
with a new cert and keyShould see
'# vault_pki_secret_backend_cert.vault-consul
must be replaced'b.
terraform apply myplan
i. boom :(
Expected behavior
We expected just the new client certificate+key to be set for the secret backend.
Environment:
Vault server configuration file(s): Can reproduce with bare dev container as shown above
Additional context
I have to believe that we are doing something incorrectly here, this seems like a big oversight?
Probable related report: #9056
The text was updated successfully, but these errors were encountered: