Description
Bug description 🐞
We are doing a POC for Terrakube to evaluate whether it is a good option for our use case. We installed version 2.25.0 using the Helm chart, and we are using AWS dynamic credentials. Everything works fine and we are able to run some sample jobs. However, some jobs randomly get stuck in the queue stage without any error in the UI, and there is no option to terminate the job from the UI either. I see the error below in the API pod for that job run, so it looks like an issue with the API call. We have to go to the database and manually mark the job as failed; only then can we proceed with the next jobs. This happens about once every 10-15 job runs, and I am not able to pinpoint the reason.
[threadPoolTaskExecutor-1] INFO org.terrakube.executor.service.workspace.SetupWorkspaceImpl - Generating AWS dynamic credentials files inside the workspace execution
[threadPoolTaskExecutor-1] INFO org.terrakube.executor.service.workspace.SetupWorkspaceImpl - Writing AWS credentials to /home/cnb/.terraform-spring-boot/executor/d9b58bd3-f3fc-4056-a026-1163297e80a8/03a581cb-9f80-46c0-ba77-48c27abaa0bb/terrakube_config_dynamic_credentials_aws.txt
[threadPoolTaskExecutor-1] INFO org.terrakube.executor.service.status.UpdateJobStatusImpl - Step list is not empty...
[threadPoolTaskExecutor-1] ERROR org.springframework.aop.interceptor.SimpleAsyncUncaughtExceptionHandler - Unexpected exception occurred invoking async method: public void org.terrakube.executor.service.executor.ExecutorJobImpl.createJob(org.terrakube.executor.service.mode.TerraformJob)
feign.FeignException$FeignClientException: [423 ] during [PATCH] to [http://terrakube-api-service:8080/api/v1/organization/d9b58bd3-f3fc-4056-a026-1163297e80a8/job/60] [TerrakubeClient#updateJob(JobRequest,String,String)]: [{"errors":[{"detail":"ERROR: null value in column "job_id" of relation "step" violates not-null constraint\n Detail: Failing row contains (436c9b25-5317-4cc7-af78-0b54a2dc5480, 150, null, pending, null, Approve Plan from Terraform CLI, null)."}]}]
at feign.FeignException.clientErrorStatus(FeignException.java:244)
at feign.FeignException.errorStatus(FeignException.java:203)
at feign.FeignException.errorStatus(FeignException.java:194)
at feign.codec.ErrorDecoder$Default.decode(ErrorDecoder.java:103)
at feign.InvocationContext.decodeError(InvocationContext.java:126)
at feign.InvocationContext.proceed(InvocationContext.java:72)
at feign.ResponseHandler.handleResponse(ResponseHandler.java:63)
at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:114)
at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:70)
at feign.ReflectiveFeign$FeignInvocationHandler.invoke(ReflectiveFeign.java:99)
at jdk.proxy2/jdk.proxy2.$Proxy93.updateJob(Unknown Source)
at org.terrakube.executor.service.status.UpdateJobStatusImpl.setRunningStatus(UpdateJobStatusImpl.java:54)
at org.terrakube.executor.service.executor.ExecutorJobImpl.createJob(ExecutorJobImpl.java:47)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.base/java.lang.reflect.Method.invoke(Unknown Source)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:355)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:113)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Steps to reproduce
Run the Terraform configuration below with the CLI multiple times.
terraform {
  backend "remote" {
    organization = "simple"
    hostname     = "xxxxx"
    workspaces {
      name = "xxxx"
    }
  }
}

provider "aws" {
  assume_role {
    role_arn = "arn:aws:iam::xxxxxx:role/terrakube-cross-account"
  }
}

# provider "aws" {
#   alias = "dev"
#   assume_role {
#     role_arn = "arn:aws:iam::xxxxxx:role/terrakube-cross-account"
#   }
# }

resource "aws_s3_bucket" "example" {
  bucket = "my-tf-xxxx-awerqerqwcc-xxxxxx"
  #provider = aws.dev
  tags = {
    Name        = "My bucket"
    Environment = "Dev"
  }
}
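Since the problem only shows up once every 10-15 runs, repeating the run in a loop may make it easier to reproduce. The following is only a rough sketch, not something taken from our setup: the function name `repeat_terraform_runs`, the default command `terraform plan -input=false`, and the 15-minute timeout are all assumptions for illustration. It assumes `terraform` is on the PATH and the working directory has already been initialized against the remote backend. A run that exceeds the timeout is recorded as `None`, which is how a stuck job would likely show up from the CLI side.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def repeat_terraform_runs(
    runs: int,
    command: Sequence[str] = ("terraform", "plan", "-input=false"),  # assumed command
    workdir: str = ".",
    timeout_seconds: Optional[float] = 900,  # assumed limit for a "stuck" run
    runner: Callable = subprocess.run,  # injectable so the loop can be tested
) -> List[Optional[int]]:
    """Run the given command `runs` times.

    Returns one entry per run: the process exit code, or None if the
    run did not finish within `timeout_seconds`.
    """
    results: List[Optional[int]] = []
    for attempt in range(1, runs + 1):
        try:
            completed = runner(list(command), cwd=workdir, timeout=timeout_seconds)
            results.append(completed.returncode)
            print(f"run {attempt}: exit code {completed.returncode}")
        except subprocess.TimeoutExpired:
            # The CLI never returned; the job may be stuck in the queue.
            results.append(None)
            print(f"run {attempt}: timed out after {timeout_seconds} seconds")
    return results
```

For example, `repeat_terraform_runs(20)` would attempt 20 runs in the current directory; any `None` in the returned list marks a run worth checking against the executor logs.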
Expected behavior
The job should either run or fail with an error, but instead it gets stuck in the queue stage.
Example repository
No response
Anything else?
No response