ECK TestFleet* is failing #7790

Open

Description

https://buildkite.com/elastic/cloud-on-k8s-operator-nightly/builds/530:

I had a closer look at TestFleetMode/Fleet_in_same_namespace_as_Agent:

  • 2024-04-29T23:03:21.220427957Z - TestFleetMode/Fleet_in_same_namespace_as_Agent/ES_data_should_pass_validations starts
  • 2024-04-29T23:18:21.222389355Z - Failure with {Status:404 Error:{CausedBy:{Reason: Type:} Reason:no such index [logs-elastic_agent-default]
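The two timestamps above are almost exactly 15 minutes apart, which suggests the step ran until a timeout expired rather than failing early. That reading is an inference from the timestamps, not something the build output states. A minimal sketch of the arithmetic, where the helper name `parse_rfc3339_nanos` is hypothetical:

```python
from datetime import datetime, timezone

def parse_rfc3339_nanos(ts: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp, truncating nanoseconds to microseconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    micros = (frac + "000000")[:6]  # datetime only stores microseconds
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)

# Timestamps copied from the two bullet points above.
step_start = parse_rfc3339_nanos("2024-04-29T23:03:21.220427957Z")
step_failure = parse_rfc3339_nanos("2024-04-29T23:18:21.222389355Z")

elapsed = step_failure - step_start
print(f"step ran for {elapsed.total_seconds():.3f} seconds")
```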

Agents

All 3 Agents end with the same final log line (only the timestamp and the agent ID differ):

{
    "log.level": "error",
    "@timestamp": "2024-04-29T23:10:26.530Z",
    "log.origin": {
        "file.name": "fleet/fleet_gateway.go",
        "file.line": 206
    },
    "message": "Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post \"https://test-agent-fleet-fs-w6xh-agent-http.e2e-fo9kn-mercury.svc:8220/api/fleet/agents/e94192f6-77db-431e-929e-98a11f267d6b/checkin?\": dial tcp 10.56.235.142:8220: connect: connection refused",
    "ecs.version": "1.6.0"
}
{
    "log.level": "error",
    "@timestamp": "2024-04-29T23:15:23.323Z",
    "log.origin": {
        "file.name": "fleet/fleet_gateway.go",
        "file.line": 206
    },
    "message": "Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post \"https://test-agent-fleet-fs-w6xh-agent-http.e2e-fo9kn-mercury.svc:8220/api/fleet/agents/bc83885c-c297-44a9-95db-b5471fba15e8/checkin?\": dial tcp 10.56.235.142:8220: connect: connection refused",
    "ecs.version": "1.6.0"
}
{
    "log.level": "error",
    "@timestamp": "2024-04-29T23:16:30.150Z",
    "log.origin": {
        "file.name": "fleet/fleet_gateway.go",
        "file.line": 206
    },
    "message": "Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post \"https://test-agent-fleet-fs-w6xh-agent-http.e2e-fo9kn-mercury.svc:8220/api/fleet/agents/4509833f-ba40-41d1-b164-fd29009b01ab/checkin?\": dial tcp 10.56.235.142:8220: connect: connection refused",
    "ecs.version": "1.6.0"
}
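All three messages fail at the TCP level against the same address. "connection refused" typically means nothing was listening on that port at that moment, which would be consistent with Fleet Server being down or restarting. A small sketch that extracts the dial target from the error tails above; the regex and the function name `dial_failures` are illustrative assumptions, not part of ECK or Elastic Agent:

```python
import re

# Error tails copied from the three Agent log messages, keyed by agent ID.
agent_errors = {
    "e94192f6-77db-431e-929e-98a11f267d6b": "dial tcp 10.56.235.142:8220: connect: connection refused",
    "bc83885c-c297-44a9-95db-b5471fba15e8": "dial tcp 10.56.235.142:8220: connect: connection refused",
    "4509833f-ba40-41d1-b164-fd29009b01ab": "dial tcp 10.56.235.142:8220: connect: connection refused",
}

DIAL_ERROR = re.compile(r"dial tcp (?P<addr>[\d.]+:\d+): connect: (?P<reason>.+)$")

def dial_failures(errors):
    """Return the distinct (address, reason) pairs found in the error strings."""
    found = set()
    for text in errors.values():
        match = DIAL_ERROR.search(text)
        if match:
            found.add((match["addr"], match["reason"]))
    return found

failures = dial_failures(agent_errors)
print(failures)
```

A single pair in the result means every Agent is blocked on the same endpoint, so the Agents themselves are unlikely to be the problem.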

Fleet Server

Fleet Server seems to restart quite often (5 restarts); its latest start is about 13 minutes after the Pod was created:

{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "creationTimestamp": "2024-04-29T23:03:16Z",
        "name": "test-agent-fleet-fs-w6xh-agent-6bdfbbdc7f-m5dpk",
        "namespace": "e2e-fo9kn-mercury"
    },
    "status": {
        "containerStatuses": [
            {
                "image": "docker.elastic.co/beats/elastic-agent:8.1.3",
                "lastState": {
                    "terminated": {
                        "exitCode": 1,
                        "finishedAt": "2024-04-29T23:15:23Z",
                        "reason": "Error",
                        "startedAt": "2024-04-29T23:13:16Z"
                    }
                },
                "name": "agent",
                "ready": true,
                "restartCount": 5, ### WHY ?
                "started": true,
                "state": {
                    "running": {
                        "startedAt": "2024-04-29T23:16:53Z" ## ~13 minutes after the Pod has been created
                    }
                }
            }
        ],
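To make the "~13 minutes" annotation concrete, the timestamps in the Pod status can be subtracted directly. This sketch uses only values from the excerpt above; the variable names are illustrative:

```python
from datetime import datetime

def k8s_ts(value: str) -> datetime:
    """Parse a Kubernetes-style UTC timestamp with second precision."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")

pod_created = k8s_ts("2024-04-29T23:03:16Z")
previous_started = k8s_ts("2024-04-29T23:13:16Z")
previous_finished = k8s_ts("2024-04-29T23:15:23Z")
current_started = k8s_ts("2024-04-29T23:16:53Z")

# Time from Pod creation to the start of the currently running container.
startup_delay = current_started - pod_created
# Lifetime of the previous (terminated, exit code 1) container.
previous_run = previous_finished - previous_started

print(f"startup delay: {startup_delay}, previous run: {previous_run}")
```

The delay comes out at 13 minutes 37 seconds, and the previous container lived for 2 minutes 7 seconds before exiting.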

Looking at the logs of a previous container instance:

Apr 30, 2024 @ 23:10:24.273 Updating certificates in /etc/ssl/certs...
....
Apr 30, 2024 @ 23:10:36.085 2024-04-29T23:10:36Z - message: Application: fleet-server--8.1.3[]: State changed to STARTING: Waiting on policy with Fleet Server integration: eck-fleet-server - type: 'STATE' - sub_type: 'STARTING'
Apr 30, 2024 @ 23:10:36.876 Fleet Server - Waiting on policy with Fleet Server integration: eck-fleet-server
Apr 30, 2024 @ 23:12:29.794 Shutting down Elastic Agent and sending last events...
Apr 30, 2024 @ 23:12:29.794 waiting for installer of pipeline 'default' to finish
Apr 30, 2024 @ 23:12:29.795 Signaling application to stop because of shutdown: fleet-server--8.1.3
Apr 30, 2024 @ 23:12:31.297 2024-04-29T23:12:31Z - message: Application: fleet-server--8.1.3[]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
Apr 30, 2024 @ 23:12:31.297 Shutting down completed.
Apr 30, 2024 @ 23:12:31.301 Error: fleet-server failed: context canceled
Apr 30, 2024 @ 23:12:31.304 Error: enrollment failed: exit status 1
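The log viewer's date prefix (Apr 30) differs from the date embedded in the messages (2024-04-29), so only time-of-day differences are used here. The sketch below measures how long this container instance waited on the policy and how long it ran overall; the variable names are illustrative:

```python
from datetime import datetime

VIEWER_FORMAT = "%b %d, %Y @ %H:%M:%S.%f"

def viewer_ts(value: str) -> datetime:
    """Parse a timestamp in the log viewer's display format."""
    return datetime.strptime(value, VIEWER_FORMAT)

first_log_line = viewer_ts("Apr 30, 2024 @ 23:10:24.273")
waiting_on_policy = viewer_ts("Apr 30, 2024 @ 23:10:36.876")
shutdown_begins = viewer_ts("Apr 30, 2024 @ 23:12:29.794")
enrollment_failed = viewer_ts("Apr 30, 2024 @ 23:12:31.304")

time_waiting = shutdown_begins - waiting_on_policy
run_length = enrollment_failed - first_log_line

print(f"waited on policy: {time_waiting}, total run: {run_length}")
```

The instance ran for about 127 seconds, the same length as the terminated run recorded in the Pod status (23:13:16 to 23:15:23). Two runs of near-identical length may be worth checking against any timeout or probe settings, although the logs alone do not establish the cause.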

I think the question is: what is the root cause of the "fleet-server failed: context canceled" error?

Kibana

On the Kibana side, the container was OOMKilled once, but it seems to have been ready well before the Agent, as expected:

{
    "containerID": "containerd://2abe4cde65635722a5fdc84cf3502b33c3567fbffdea8d58f2e8f4079a49e56b",
    "image": "docker.elastic.co/kibana/kibana:8.1.3",
    "imageID": "docker.elastic.co/kibana/kibana@sha256:1d7cd5fa140df3307f181830b096562c9f2fc565c94f6f9330aa2313ecb7595c",
    "lastState": {
        "terminated": {
            "containerID": "containerd://e14eb76e157c85b18a44d60ea21ffdf0fb475ee54a9e0bedf6843e2381ed4ed7",
            "exitCode": 137,
            "finishedAt": "2024-04-29T23:00:39Z",
            "reason": "OOMKilled",
            "startedAt": "2024-04-29T23:00:00Z"
        }
    },
    "name": "kibana",
    "ready": true,
    "restartCount": 1,
    "started": true,
    "state": {
        "running": {
            "startedAt": "2024-04-29T23:00:39Z"
        }
    }
}
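Exit code 137 matches the "OOMKilled" reason: by convention, an exit code above 128 means the process was terminated by signal number (code - 128), and 137 - 128 = 9 is SIGKILL. A short sketch of that decoding, assuming a POSIX system where Python's `signal` module defines SIGKILL; the function name `describe_exit_code` is hypothetical:

```python
import signal

def describe_exit_code(code: int) -> str:
    """Describe an exit code using the 128+N convention for fatal signals."""
    if code > 128:
        return f"killed by {signal.Signals(code - 128).name}"
    return f"exited with status {code}"

print(describe_exit_code(137))  # Kibana's previous container
print(describe_exit_code(1))    # Fleet Server's previous container
```

The Fleet Server container's exit code 1, by contrast, is an ordinary error exit rather than a kill, which fits the "enrollment failed: exit status 1" line in its logs.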

In the Kibana logs we have a few 401s from 23:02:39.664+00:00 to 23:02:45.162+00:00, which might be related to the operator trying to call https://test-agent-fleet-qj5h-kb-http.e2e-fo9kn-mercury.svc:5601/api/fleet/setup (the last 401 in the operator logs is at 23:02:45.193Z). Kibana seems to be OK otherwise.

[2024-04-29T23:01:00.615+00:00][INFO ][status] Kibana is now degraded
[2024-04-29T23:01:03.360+00:00][INFO ][status] Kibana is now available (was degraded)
[2024-04-29T23:01:03.388+00:00][INFO ][plugins.reporting.store] Creating ILM policy for managing reporting indices: kibana-reporting
[2024-04-29T23:01:03.543+00:00][INFO ][plugins.fleet] Encountered non fatal errors during Fleet setup
[2024-04-29T23:01:03.544+00:00][INFO ][plugins.fleet] {"name":"Error","message":"Saved object [epm-packages-assets/a01d2162-e711-5ee9-88c5-62781411ac1e] not found"}
[2024-04-29T23:01:03.544+00:00][INFO ][plugins.fleet] {"name":"Error","message":"Saved object [epm-packages-assets/a01d2162-e711-5ee9-88c5-62781411ac1e] not found"}
[2024-04-29T23:01:03.544+00:00][INFO ][plugins.fleet] Fleet setup completed
[2024-04-29T23:01:03.552+00:00][INFO ][plugins.securitySolution] Dependent plugin setup complete - Starting ManifestTask
[2024-04-29T23:02:39.664+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:39.749+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:39.916+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:39.967+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:40.019+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:40.070+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:40.150+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:40.525+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:41.214+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:42.547+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:45.162+00:00][INFO ][plugins.security.authentication] Authentication attempt failed: {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [e2e-fo9kn-mercury-test-agent-fleet-fs-w6xh-agent-kb-user] for REST request [/_security/_authenticate]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
[2024-04-29T23:02:50.589+00:00][INFO ][plugins.fleet] Beginning fleet setup
[2024-04-29T23:03:16.505+00:00][INFO ][plugins.fleet] Fleet setup completed
[2024-04-29T23:03:17.486+00:00][INFO ][plugins.fleet] Beginning fleet setup
[2024-04-29T23:03:17.645+00:00][INFO ][plugins.fleet] Fleet setup completed
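The 401s are confined to a short window, and the gaps between the last few attempts keep growing, which looks like a client retrying with backoff until the user becomes valid. That interpretation is an assumption; the sketch below only measures the timestamps copied from the log lines above:

```python
from datetime import datetime

# Time of day of each "Authentication attempt failed" line, in log order.
failed_auth_times = [
    "23:02:39.664", "23:02:39.749", "23:02:39.916", "23:02:39.967",
    "23:02:40.019", "23:02:40.070", "23:02:40.150", "23:02:40.525",
    "23:02:41.214", "23:02:42.547", "23:02:45.162",
]

parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in failed_auth_times]

# Total span of the failures and the gap between consecutive attempts.
window = (parsed[-1] - parsed[0]).total_seconds()
gaps = [round((b - a).total_seconds(), 3) for a, b in zip(parsed, parsed[1:])]

print(f"{len(parsed)} failures within {window:.3f}s")
print("gaps between attempts:", gaps)
```

All failures stop before "Beginning fleet setup" at 23:02:50.589, and both subsequent Fleet setups complete, so the 401s look transient.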
