Skip to content

fix: handling infinite loop #812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 19, 2024
Merged

Conversation

harshilgajera-crest
Copy link
Contributor

There was a bug in PSA where if in splunk_ingest_data first worker errors out or throws exception then it does not write to a file, now other workers would be waiting indefinitely for the file to be present, So the execution would continue in infinitly.

To handle this scenario, we can handle the exception thrown by the first worker and make sure that in any case it writes to a file, so that other workers don't wait indefinitly

@harshilgajera-crest harshilgajera-crest marked this pull request as ready for review April 18, 2024 07:22
@harshilgajera-crest harshilgajera-crest requested a review from a team as a code owner April 18, 2024 07:22
@mkolasinski-splunk
Copy link
Contributor

@harshilgajera-crest can you switch base to develop, so we first get beta release and run some testing in CI?

@harshilgajera-crest harshilgajera-crest changed the base branch from main to develop April 18, 2024 12:22
echo $SPLUNK_VERSION
docker compose -f "docker-compose-ci.yml" build
SPLUNK_PASSWORD=Chang3d! docker compose -f docker-compose-ci.yml up -d splunk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not actually removing this,
This line SPLUNK_PASSWORD=Chang3d! docker compose -f docker-compose-ci.yml up --abort-on-container-exit
would do the same thing, it would start all the services, earlier we did in individually which seems to be not required.


# Here we are not interested in the failures or errors,
# we are basically checking that we get results and test execution does not get stuck
assert result.parseoutcomes().get("passed") > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to verify that the exception is raised by one of the workers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because that is handled internally via xdist. But everything would run on either of the workers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about checking the logs if those are available somewhere?

My point is - if you don't make a fix - will this test pass anyways?

Copy link
Contributor Author

@harshilgajera-crest harshilgajera-crest Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artemrys
Okay, So I actually tested it on main branch without making the fix and test kept on running.
Let me make the same in github actions, I will provide a link here for the actions run which should go on for 6hrs.

Here is the link with test change in main branch without fix: https://github.com/splunk/pytest-splunk-addon/actions/runs/8753677154/job/24023897918

This will go on forever till runner timeout as it doesnot have fix.

@harshilgajera-crest harshilgajera-crest changed the base branch from develop to main April 18, 2024 12:41
@harshilgajera-crest harshilgajera-crest changed the base branch from main to develop April 19, 2024 09:18
export SPLUNK_APP_PACKAGE=./tests/e2e/addons/TA_fiction
export SPLUNK_ADDON=TA_fiction
export SPLUNK_APP_ID=TA_fiction
export SPLUNK_APP_PACKAGE=./tests/e2e/addons/TA_fiction_indextime
Copy link
Contributor

@mkolasinski-splunk mkolasinski-splunk Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshilgajera-crest
What is the reason of this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we were installing TA_fiction addon because our splunk docker file requires SPLUNK_APP_PACKAGE param. It was not necessary for the test as it just searched a query in _internal, but for the test that I added it requires this TA_fiction_indextime addon as we are ingesting its data(sc4s data).
Hence I updated to TA_fiction_indextime instead of TA_fiction.

@kdoroszko-splunk kdoroszko-splunk merged commit d3111a6 into develop Apr 19, 2024
@kdoroszko-splunk kdoroszko-splunk deleted the test/fix_infinite_loop branch April 19, 2024 14:43
@github-actions github-actions bot locked and limited conversation to collaborators Apr 19, 2024
@srv-rr-github-token
Copy link
Contributor

🎉 This PR is included in version 5.2.6-beta.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@srv-rr-github-token
Copy link
Contributor

🎉 This PR is included in version 5.2.6 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants