
[BUG][Logs Data Platform] dbaas logs input - resource busy #781

Open
pgillet opened this issue Dec 5, 2024 · 6 comments
pgillet commented Dec 5, 2024

Describe the bug

When creating or updating an ovh_dbaas_logs_input resource, Terraform often fails with a "resource busy" error.

If the input already exists, it must first be stopped by hand before retrying.

Terraform Version

1.8.5

OVH Terraform Provider Version

v1.0.0

Affected Resource(s)

  • ovh_dbaas_logs_input

Terraform Configuration Files

data "ovh_dbaas_logs_cluster" "logs_cluster" {
  service_name = "ldp-si-xxxxx"
}

data "ovh_dbaas_logs_input_engine" "logstash" {
  service_name = data.ovh_dbaas_logs_cluster.logs_cluster.service_name
  name         = "logstash"
  version      = "8.x"
}

data "ovh_dbaas_logs_cluster_retention" "P2W" {
  service_name = data.ovh_dbaas_logs_cluster.logs_cluster.service_name
  cluster_id   = data.ovh_dbaas_logs_cluster.logs_cluster.cluster_id
  duration     = "P2W"
}

resource "ovh_dbaas_logs_output_graylog_stream" "stream" {
  service_name               = data.ovh_dbaas_logs_cluster.logs_cluster.service_name
  title                      = "my stream"
  description                = "my graylog stream"
  indexing_enabled           = true
  indexing_max_size          = 500
  indexing_notify_enabled    = true
  pause_indexing_on_max_size = true
  web_socket_enabled         = true
  retention_id               = data.ovh_dbaas_logs_cluster_retention.P2W.retention_id
}

resource "ovh_dbaas_logs_input" "log_input" {
  service_name = ovh_dbaas_logs_output_graylog_stream.stream.service_name
  description  = ovh_dbaas_logs_output_graylog_stream.stream.description
  title        = "log-input"
  engine_id    = data.ovh_dbaas_logs_input_engine.logstash.id
  stream_id    = ovh_dbaas_logs_output_graylog_stream.stream.id

  allowed_networks = ["XX.XX.XX.XX/32", ...]
  exposed_port     = "6514"

  autoscale          = true
  min_scale_instance = 1
  max_scale_instance = 3

  configuration {
    logstash {
      input_section  = <<-EOF
        tcp {
          port => 6514
          type => syslog
          ssl_enable => true
          ssl_verify => false
          ssl_cert => "/etc/ssl/private/server.crt"
          ssl_key => "/etc/ssl/private/server.key"
          ssl_extra_chain_certs => ["/etc/ssl/private/ca.crt"]
        }
      EOF
      filter_section = <<-EOF
        grok {
          match => { "message" => "%%{SYSLOGBASE}" }
        }

        date {
          match => [ "timestamp", "MMM dd HH:mm:ss" ]
          target => "timestamp"
          timezone => "Europe/Paris"
        }
      EOF
    }
  }
}

Terraform Output

Error: error calling Put /dbaas/logs/ldp-si-xxxxx/input/xxx/end:
│ 	 "OVHcloud API error (status code 403): Client::Forbidden::Busy: \"An operation is already running for this service: xxx, xxx. Please retry later.\" (X-OVH-Query-Id: EU.ext-xxx)"
amstuta (Contributor) commented Dec 5, 2024

Hello @pgillet, thanks for opening this issue.

I see that your apply is failing on the call PUT /dbaas/logs/ldp-si-xxxxx/input/xxx/end, which is only made when destroying the resource, so I guess you ran a plan that replaces the resource ovh_dbaas_logs_input.log_input?

Could you share the modification you are making to this resource, so that I can try to reproduce it?

pgillet (Author) commented Dec 6, 2024

@amstuta OK, I have narrowed it down!

As a test, I re-applied my Terraform configuration, knowing that the actual resources already exist and are in sync with my Terraform state. Yet terraform plan tells me that changes will be applied:

Terraform will perform the following actions:
  # ovh_dbaas_logs_input.log_input will be updated in-place
  ~ resource "ovh_dbaas_logs_input" "log_input" {
      ~ allowed_networks    = [
          - "XX.XX.XX.XX/32",
          - "YY.YY.YY.YY/32",
            "ZZ.ZZ.ZZ.ZZ/32",
          + "XX.XX.XX.XX/32",
            "ZZ.ZZ.ZZ.ZZ/32",
          + "YY.YY.YY.YY/32",
        ]
        id                  = "xxx"
        # (18 unchanged attributes hidden)
        # (1 unchanged block hidden)
    }

As you can see, the list of CIDRs in the allowed_networks argument is not ordered the same way before and after, so Terraform considers that the resource must be updated.
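One user-side mitigation for this kind of spurious diff is a sketch along these lines, using Terraform's built-in sort() to make the configured order deterministic (this only removes the diff under the assumption that the order stored in state is also lexicographic; the CIDR values are placeholders):

```hcl
resource "ovh_dbaas_logs_input" "log_input" {
  # ... other arguments as in the original configuration ...

  # sort() fixes the configured order once and for all, so it can no
  # longer differ between runs. If the provider records the networks
  # in an arbitrary order in state, this will not help and a
  # provider-side fix is needed instead.
  allowed_networks = sort(["ZZ.ZZ.ZZ.ZZ/32", "XX.XX.XX.XX/32", "YY.YY.YY.YY/32"])
}
```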

And the terraform apply fails:

╷
│ Error: Error calling Put /dbaas/logs/ldp-si-xxxxx/input/xxx:
│ 	 "OVHcloud API error (status code 403): Client::Forbidden::Busy: \"An operation is already running for this service: yyy, zzz. Please retry later.\" (X-OVH-Query-Id: EU.ext-ttt)"
│ 
│   with ovh_dbaas_logs_input.log_input,
│   on logs.tf line 13, in resource "ovh_dbaas_logs_input" "log_input":
│   13: resource "ovh_dbaas_logs_input" "log_input" {
│ 
╵

IMHO, this last error happens because the actual Logstash instance is already up and running.

pgillet (Author) commented Dec 6, 2024

So beware of other list arguments in the resource whose order may vary.

amstuta (Contributor) commented Dec 6, 2024

OK, so if I understand correctly, we have two issues here:

  • allowed_networks has type List and should be a Set instead, to avoid triggering an update when only the order changes. (And, as you say, the same fix may apply to the resource's other list arguments.)
  • When an update is triggered, we should stop the input before updating it, to avoid the error An operation is already running for this service.
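Until a provider-side change like the first point lands, one blunt user-side workaround (a sketch; note the caveat that it also masks genuine edits to the list) is to tell Terraform to ignore diffs on that attribute:

```hcl
resource "ovh_dbaas_logs_input" "log_input" {
  # ... arguments as in the original configuration ...

  lifecycle {
    # Suppress the spurious reorder diff on allowed_networks.
    # Caveat: real changes to the list are ignored too and must
    # then be applied out of band (e.g. via the OVH console/API).
    ignore_changes = [allowed_networks]
  }
}
```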

amstuta (Contributor) commented Dec 6, 2024

we should stop the input before updating it

@pgillet I just confirmed that this isn't necessary, so it is not the root cause. Do you know if there is any other operation running on the same LDP service at the same time? If you could provide us with the complete log when you encounter the error (with the request ID and the operation ID), it would help us understand what is happening concurrently.

pgillet (Author) commented Dec 6, 2024

I have other running log inputs (also Logstash 8.x), each with its own log streams.
