feat: split osd core test into ci groups #4784

@SuZhou-Joe SuZhou-Joe commented Jun 17, 2024

Description

OSD core contains more than 50 specs, with hundreds of tests that run sequentially. Because many side effects carry over between test cases, the core Dashboards tests often fail or are flaky, and the failures are hard to troubleshoot. To keep the test environment more stable and CI more reproducible, this PR splits the sanity test cases of core Dashboards into CI groups.

For this to work, opensearch-project/opensearch-dashboards-functional-test#1411 has to be merged first.

Issues Resolved

List any issues this PR will resolve, e.g. Closes [...].

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: SuZhou-Joe <suzhou@amazon.com>
@peterzhuamazon
Member

Converting this back to draft because it would not work as is.
Our Jenkinsfile checks whether the component exists
by comparing the component name with the names in the build manifest.

https://github.com/opensearch-project/opensearch-build/blob/main/jenkins/opensearch-dashboards/integ-test.jenkinsfile#L188-L192

We still need more changes if we want to get this in.

Thanks.

@peterzhuamazon
Member

Taking a look at the Jenkinsfile changes.

@@ -5,7 +5,111 @@ ci:
  image:
    name: opensearchstaging/ci-runner:ci-runner-rockylinux8-opensearch-dashboards-integtest-v4
components:
  - name: OpenSearch-Dashboards
  - name: OpenSearch-Dashboards-ci-group-1
Member

@gaiksaya gaiksaya Jun 17, 2024

I think this would throw a "component not found" error because the names do not match.

Collaborator

I think we are fine on the Python side, since it only reads the test manifest and gets the list of components. If we can handle it on the Jenkinsfile side, as mentioned in my comment below, then we are good. Right?

@SuZhou-Joe
Member Author

SuZhou-Joe commented Jun 18, 2024

@peterzhuamazon https://github.com/opensearch-project/opensearch-build/blob/main/jenkins/opensearch-dashboards/integ-test.jenkinsfile#L188-L192 I do not have enough context on the code in the build repo, but are these lines necessary? What will happen if we remove them?

@rishabh6788
Collaborator

This ensures that the component for which the integration test is being run is actually present in the distribution artifact used to spin up the OSD process. At present, this helps when a component is passed in the COMPONENT_NAME field but is not present in the build manifest of the distribution artifact selected for that run.

When nothing is passed, this list is the same as the component list in the build manifest.
In your case, I believe you can add an OR condition: if the component name matches the OpenSearch-Dashboards-ci-group pattern, continue, since it corresponds to OSD, and OSD will always be present in a valid build manifest.

Please correct me if I'm wrong @peterzhuamazon
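
The suggested OR condition can be sketched roughly as follows. This is only an illustration in Python, not the Jenkinsfile code (which is Groovy); the function name `is_testable`, the regular expression, and the sample component names are assumptions made for the example.

```python
import re

# Hypothetical pattern for the grouped test components named in this PR.
CI_GROUP_PATTERN = re.compile(r"^OpenSearch-Dashboards-ci-group-\d+$")


def is_testable(component_name: str, build_components: set) -> bool:
    """Return True if the component may be integration-tested against this build."""
    # Existing behaviour: the name must appear in the build manifest.
    if component_name in build_components:
        return True
    # Suggested OR condition: a ci-group entry stands in for OpenSearch-Dashboards,
    # so it is accepted whenever OpenSearch-Dashboards itself is in the build.
    is_ci_group = CI_GROUP_PATTERN.match(component_name) is not None
    return is_ci_group and "OpenSearch-Dashboards" in build_components


# Sample build manifest contents (illustrative names only).
build_components = {"OpenSearch-Dashboards", "securityDashboards"}
print(is_testable("OpenSearch-Dashboards-ci-group-1", build_components))  # True
print(is_testable("notInBuild", build_components))  # False
```

Anchoring the pattern with `^` and `$` keeps an unrelated component whose name merely contains "ci-group" from slipping past the check.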

Signed-off-by: SuZhou-Joe <suzhou@amazon.com>
@SuZhou-Joe
Member Author

@peterzhuamazon I updated the Jenkinsfile to get past the check. What do you think of the change?

        logging.json: false
        data.search.aggs.shardDelay.enabled: true
        csp.warnLegacyBrowsers: false
  - name: OpenSearch-Dashboards-ci-group-3
Member

Hey @SuZhou-Joe, how can I tell which tests or components are part of each group? For example, for OpenSearch-Dashboards-ci-group-3, which comes from https://github.com/opensearch-project/opensearch-dashboards-functional-test/pull/1411/files, how do I know which tests or components belong to it?

Member Author

Member

@prudhvigodithi prudhvigodithi Jun 18, 2024

I would like this change to be documented somewhere (maybe in https://github.com/opensearch-project/opensearch-build/wiki/Testing-the-Distribution).
Thanks

@peterzhuamazon
Member

@peterzhuamazon I updated the Jenkinsfile to get past the check. What do you think of the change?

This will still have issues because the library uses the name to find components when calling test.sh.

Please wait a bit while we sort this out.

We are very close to the release date, so let's not rush to get this change in.

Thanks.

@rishabh6788
Collaborator

I dived a little deeper into the code, and it is not as straightforward as it looks. Simply adding an OpenSearch-Dashboards-ci-group-* component to the test manifest will not work.
This is because the passed component is verified against the build manifest here. If the component exists in the build manifest, the flow goes through, but since OpenSearch-Dashboards-ci-group-* is not a valid component, the code will always throw an exception here.
Even if we try to fool it by adding a condition like invalid = [item for item in focus if item not in self and 'ci-group' not in item], it will error out here.

This is not a straightforward problem to solve, and we may need to revisit how ci-groups are handled in the test manifest, maybe by adding a new key to the schema.
Thoughts, @gaiksaya @peterzhuamazon?
CC: @getsaurabh02
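
To illustrate why relaxing the validation alone is not enough, here is a minimal Python sketch. It assumes, based on the comments in this thread, that components are later looked up by name; the `Components` class and its `select` method are hypothetical stand-ins, not the actual opensearch-build code.

```python
class Components(dict):
    """Hypothetical stand-in for a build manifest's component map (name -> metadata)."""

    def select(self, focus=None):
        focus = focus or []
        # Relaxed validation, as floated in the discussion: ignore ci-group names.
        invalid = [item for item in focus if item not in self and "ci-group" not in item]
        if invalid:
            raise ValueError("Unknown component(s): " + ", ".join(invalid))
        names = focus or list(self)
        # Assumed second step: the lookup is still keyed by component name,
        # so a ci-group name that passed validation fails here instead.
        return [self[name] for name in names]


build = Components({"OpenSearch-Dashboards": {}})
try:
    build.select(["OpenSearch-Dashboards-ci-group-1"])
except KeyError as err:
    print(f"validation passed, but the lookup failed for {err}")
```

In this sketch the exception simply moves from the validation step to the lookup step, which is the pattern described above: every place that resolves a component by name would need to understand ci-groups.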

@peterzhuamazon
Member

Agreed with @rishabh6788; we will need to implement this properly, without rushing a quick change here.
Let's discuss this topic further.

Thanks.

@gaiksaya
Member

Yes, that is what I meant by my comment here: #4784 (comment)
One approach I can think of is adding ci_group to the test manifest schema as an optional key. That way you can add OpenSearch-Dashboards multiple times with a different ci_group each time, or just add a range. But yes, it needs to be thought through and is not a straightforward implementation.
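
One possible shape for the optional-key idea, sketched in Python under stated assumptions: the `ci_group` key does not exist in the current schema, and `ManifestComponent`, `parse_components`, and `validate` are hypothetical names used only for illustration. The point of the sketch is that the component name stays a real build-manifest name, so name-based validation keeps working unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ManifestComponent:
    name: str                       # must match a component in the build manifest
    ci_group: Optional[int] = None  # hypothetical optional key from the proposal


def parse_components(entries):
    """Build component objects from test manifest entries (plain dicts here)."""
    return [ManifestComponent(e["name"], e.get("ci_group")) for e in entries]


def validate(components, build_components):
    """Name-based check: ci_group plays no part in it."""
    missing = sorted({c.name for c in components} - set(build_components))
    if missing:
        raise ValueError("Not in build manifest: " + ", ".join(missing))


# The same component listed once per group, plus an ungrouped component.
entries = [
    {"name": "OpenSearch-Dashboards", "ci_group": 1},
    {"name": "OpenSearch-Dashboards", "ci_group": 2},
    {"name": "securityDashboards"},
]
components = parse_components(entries)
validate(components, {"OpenSearch-Dashboards", "securityDashboards"})
for c in components:
    print(c.name if c.ci_group is None else f"{c.name} (ci_group {c.ci_group})")
```

A range form (for example, a start and end group number) could be expanded into the same per-group objects at parse time, but that detail is left open here, as it is in the discussion.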

@ashwin-pc
Member

@rishabh6788 @gaiksaya what are the next steps here?

@peterzhuamazon
Member

peterzhuamazon commented Jun 22, 2024

Thanks, @SuZhou-Joe.

Closing this in favor of this PR:
