[perf] MCAD takes a very long time to delete a large number of AppWrappers #477

Open
@kpouget

Description

As part of the MCAD load test, I created 1000 AppWrappers that do not fit into the cluster (they request a large amount of CPU).
Once every one of these AppWrappers is in one of these states: [Queueing, HeadOfLine, Pending, Failed], the main test ends and the cleanup starts.
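
For readers who want to reproduce the end-of-test condition, here is a minimal sketch of the check described above. It is not the load test's actual code: the function name `all_settled` and the idea of passing in a plain list of state strings are assumptions for illustration, and how those strings are read from the AppWrapper status is left out on purpose.

```python
from collections import Counter

# States listed in the issue as marking the end of the main test phase.
SETTLED_STATES = {"Queueing", "HeadOfLine", "Pending", "Failed"}


def all_settled(states, expected_count):
    """Return (done, histogram) for a list of observed AppWrapper states.

    `states` is a list of state strings, one per AppWrapper (hypothetical
    input; the caller decides how to read them from the cluster).
    `done` is True only when all expected AppWrappers were observed and
    each of them is in one of SETTLED_STATES.
    """
    histogram = Counter(states)
    done = (
        len(states) == expected_count
        and all(state in SETTLED_STATES for state in states)
    )
    return done, histogram
```

The histogram is returned alongside the boolean so that a test log can show how many AppWrappers sit in each state when the main phase ends.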

All the AppWrappers are deleted with oc delete AppWrappers --all -n <namespace>.
The timing of this call is shown in blue in the figure below.

Once this call returns, I create a canary AppWrapper, and wait for it to be executed.
This step is shown in red in the figure below.
The Ansible logs of this command confirm that most of the 23 minutes is spent before the .status.controllerfirsttimestamp even gets filled.
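
The two measurements above (duration of the delete call, then time until the canary's .status.controllerfirsttimestamp is set) can be sketched roughly as follows. This is an illustrative approximation, not the Ansible code used in the test: the helper names, the canary manifest path, the canary name, and the polling interval are all assumptions. Only the `oc delete AppWrappers --all -n <namespace>` call and the status field name come from the description.

```python
import subprocess
import time


def run_oc(args):
    """Run `oc` with the given arguments and return its stripped stdout."""
    result = subprocess.run(
        ["oc", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def wait_for_field(read_field, timeout_s, interval_s=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll read_field() until it returns a non-empty value.

    Returns (value, seconds_waited); raises TimeoutError after timeout_s.
    """
    start = clock()
    while True:
        value = read_field()
        if value:
            return value, clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError("field still empty after %.0f s" % timeout_s)
        sleep(interval_s)


def measure_cleanup(namespace, canary_manifest, canary_name,
                    timeout_s=3600.0, run=run_oc, sleep=time.sleep):
    """Time the bulk delete, then time how long the canary stays unseen.

    canary_manifest and canary_name are hypothetical inputs: a YAML file
    defining one small AppWrapper, and that AppWrapper's name.
    """
    # Step 1: delete every AppWrapper in the namespace and time the call.
    start = time.monotonic()
    run(["delete", "AppWrappers", "--all", "-n", namespace])
    delete_s = time.monotonic() - start

    # Step 2: create the canary AppWrapper.
    run(["apply", "-f", canary_manifest, "-n", namespace])

    # Step 3: poll until the controller fills in its first timestamp.
    def read_first_timestamp():
        return run([
            "get", "AppWrapper", canary_name, "-n", namespace,
            "-o", "jsonpath={.status.controllerfirsttimestamp}",
        ])

    first_timestamp, wait_s = wait_for_field(
        read_first_timestamp, timeout_s, sleep=sleep
    )
    return {
        "delete_s": delete_s,
        "canary_wait_s": wait_s,
        "first_timestamp": first_timestamp,
    }
```

The `run` and `sleep` parameters are injectable so the logic can be exercised without a cluster. Against a real cluster, a call such as `measure_cleanup("my-namespace", "canary.yaml", "canary")` would return both durations in seconds.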

image

All the details of the scale test are at this address (files here). Mind that there was a typo in the code (the wrong file was read as part of the visualizer parsing), which makes the cleanup phase appear to be 5 minutes long (this was the test length :D).

This other plot (from this test) confirms that none of the 1000 AppWrappers created in the first 5 minutes of the test are discovered in the first 25 minutes of the test:
image