[wip] Add logs for kubernetes events #6793

Closed
wants to merge 10 commits into from

sk593 (Contributor) commented Nov 16, 2023

Description

Adds logs to track pod failures during scheduled functional test runs.

Type of change

  • This pull request fixes a bug in Radius and has an approved issue (issue link required).
  • This pull request adds or changes features of Radius and has an approved issue (issue link required).
  • This pull request is a minor refactor, code cleanup, test improvement, or other maintenance task and doesn't change the functionality of Radius (issue link optional).

Fixes: #issue_number

Auto-generated summary

🤖[deprecated] Generated by Copilot at d314bb0

Summary

🐽🛠

Enhance functional-test workflow to capture pod logs and events on failure. Add id to publish-tf-recipes step and two new steps to .github/workflows/functional-test.yaml.

When the recipes fail to publish
We unleash the wrath of pod logs
We capture the events of doom
And upload them to the cloud of gloom

Walkthrough

  • Add an id to the recipe publishing step to enable conditional execution of subsequent steps (link)

sk593 requested review from a team as code owners on November 16, 2023 22:21
sk593 force-pushed the tf-recipe-logs branch 5 times, most recently from 7383a6f to ffbdb32, on November 16, 2023 22:34
@lakshmimsft
Contributor

/ok-to-test

github-actions bot commented Nov 16, 2023

Radius functional test overview

🔍 Go to test action run

Name        Value
Repository  sk593/radius
Commit ref  ffbdb32
Unique ID   80ae6d82b7
Image tag   pr-80ae6d82b7

Tools in the current test run:
  • gotestsum: 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.11.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location: ghcr.io/radius-project/dev/test/functional/shared/recipes/<name>:pr-80ae6d82b7
  • Terraform recipe location: http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-80ae6d82b7
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-80ae6d82b7
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-80ae6d82b7
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting shared functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting samples functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting ucp functional tests...
✅ ucp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ samples functional tests succeeded
✅ msgrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ daprrp functional tests succeeded
✅ shared functional tests succeeded

@lakshmimsft
Contributor

/ok-to-test

github-actions bot commented Nov 16, 2023

Radius functional test overview

🔍 Go to test action run

Name        Value
Repository  sk593/radius
Commit ref  c3cf9f6
Unique ID   2255ddb402
Image tag   pr-2255ddb402

Tools in the current test run:
  • gotestsum: 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.11.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location: ghcr.io/radius-project/dev/test/functional/shared/recipes/<name>:pr-2255ddb402
  • Terraform recipe location: http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-2255ddb402
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-2255ddb402
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-2255ddb402
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting msgrp functional tests...
⌛ Starting ucp functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting samples functional tests...
⌛ Starting shared functional tests...
✅ ucp functional tests succeeded
✅ msgrp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ shared functional tests succeeded

@@ -559,6 +560,30 @@ jobs:
--subscription ${{ secrets.INTEGRATION_TEST_SUBSCRIPTION_ID }} \
--name ${{ env.AZURE_TEST_RESOURCE_GROUP }} \
--yes --verbose
# TODO add once tested: if: failure() && steps.publish-tf-recipes.outcome == 'failure'
Contributor

Would it be bad if we just always included these logs?

Contributor Author

No, it wouldn't. We wanted to track recipe upload failures specifically, but we could just track the events whenever they're uploaded.

label="app.kubernetes.io/name=tf-module-server"
pod_names=($(kubectl get pods -l $label -n $namespace -o jsonpath='{.items[*].metadata.name}'))
for pod_name in "${pod_names[@]}"; do
kubectl logs $pod_name -n $namespace > recipes/pod-logs/${pod_name}.txt
Contributor

We can add a log line here, for example: echo "Pod ${pod_name} logs saved to recipes/pod-logs/"

Contributor Author

There's a line below that logs when all of them are uploaded, but I'll update it to this.
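
For reference, a minimal sketch of what the per-pod log line suggested above could look like. This is not the code from the PR: `save_pod_logs` and its three arguments are hypothetical names introduced here for illustration, and it assumes `kubectl` is on the PATH and already pointed at the test cluster. The `kubectl get pods` query is the same one used in the diff.

```shell
# Sketch only: save_pod_logs is a hypothetical helper, not code from this PR.
# Usage: save_pod_logs <namespace> <label-selector> <output-dir>
save_pod_logs() {
  local namespace="$1" label="$2" out_dir="$3"
  local pod_names pod_name

  # Make sure the output directory exists before redirecting into it.
  mkdir -p "$out_dir"

  # Same jsonpath query as in the diff: prints space-separated pod names.
  pod_names="$(kubectl get pods -l "$label" -n "$namespace" \
    -o jsonpath='{.items[*].metadata.name}')"

  # Left unquoted on purpose so the names split on whitespace.
  for pod_name in $pod_names; do
    kubectl logs "$pod_name" -n "$namespace" > "${out_dir}/${pod_name}.txt"
    # One confirmation line per pod, as suggested in the review.
    echo "Pod ${pod_name} logs saved to ${out_dir}/"
  done
}
```

It could be called as `save_pod_logs "$namespace" "app.kubernetes.io/name=tf-module-server" recipes/pod-logs`. Quoting the variables keeps the commands from breaking if a value is empty or contains spaces.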

done
echo "Pod logs saved to recipes/pod-logs/"
# Get kubernetes events and save to file
kubectl get events -n $namespace > recipes/pod-logs/events.txt
Contributor

And maybe another log after this line to state that the events are saved?
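
One possible shape for that extra log line, as a sketch under assumptions rather than the PR's actual change: `save_events` and its arguments are hypothetical names, and `kubectl` is assumed to be configured for the test cluster. It only prints the confirmation when `kubectl get events` succeeds, so the message cannot claim the events were saved when the command failed.

```shell
# Sketch only: save_events is a hypothetical helper, not code from this PR.
# Usage: save_events <namespace> <output-file>
save_events() {
  local namespace="$1" out_file="$2"

  # Create the parent directory in case no pod logs were written earlier.
  mkdir -p "$(dirname "$out_file")"

  if kubectl get events -n "$namespace" > "$out_file"; then
    echo "Kubernetes events saved to ${out_file}"
  else
    # Report the failure on stderr and let the caller decide what to do.
    echo "Failed to collect events for namespace ${namespace}" >&2
    return 1
  fi
}
```

For example: `save_events "$namespace" recipes/pod-logs/events.txt`.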

sk593 force-pushed the tf-recipe-logs branch 3 times, most recently from 616f0fa to d0423f5, on November 28, 2023 21:35
sk593 closed this on Jan 16, 2024