-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose feature flags for collectors, diagnosers, exporters #52
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Thank you David, I have added some basic suggestions/comments, configurability sounds good, it will have some impact on existing 2 PR. Mainly the comments is for code movement and possibly adding more comments.
Most important will be local test to make sure this works from the image. (Checklist covers that point)
Lets make sure we did these checks from checklist to make sure you have covered existing behaviours are: (before we close this PR)
✅ Make sure you locally run CI.
✅ Tip to locally test changes via image reside here: https://github.com/Azure/aks-periscope#programming-guide
* Above will be a very important scenario to test, and feel free to reach out to us for anything.
✅ If you find any new functionality could be easily extracted to UT (unit test) then feel enabled to do so.
✅ Once we have done all the above, we should have enough confidence to open PR for review.
(At least these are my initial thoughts) I will add more in case I forget anything.
Really appreciate it. Other folks will kindly add more. 🙏 cc: @JunSun17 + @arnaud-tincelin
- support multiple exporters - refactor main() to better encapsulate logic in sub-funcs - add multierror package, currently only used when collector.export errors; to prevent one failing exporter preventing the others from running - removed zip and export mode flag for now
Hi @Tatsinnit this is done and ready to merge IMO - sorry for rushing you but would like to get this into master ASAP so our partners in OSM and Arc team can start using some of the new functionality. Manually tested against my single node test AKS cluster which has gitops and some other stuff on it. In case you want to do manual testing yourself, build is pushed to my public acr here: SAS Url for the outputs (earlier run 03-06 is from the new build, later run 03-24 is from current master): https://dakydddevauseast2.blob.core.windows.net/dakydd-test-eastus-dns-d0daedb9?sv=2019-12-12&st=2021-04-07T03%3A14%3A00Z&se=2022-05-08T03%3A14%3A00Z&sr=c&sp=rl&sig=3x%2FL0r%2FCXCf7YMunfiKTSIk5GlTasG9GAc0LLRI10i0%3D Logs from the periscope pod are as attached, "dakydd" is the new build (can tell from slightly different log lines at end): |
Hiya, @davidkydd , I can take a look in few hours, as far as you have made sure to go through this check list #52 (review) (And things look good, I would take a quick peek and if nothing bad we can definitely go to master)
I will take a look and approve this. 🙏 thank you! |
Hiya, @davidkyd, thanks also I see/noticed you have made 19 files aka some more changes until which was in last review so please wait for quick eyes on this. cc: @JunSun17 , @arnaud-tincelin and just in case if @Porges for eyes. Thanks. |
@Tatsinnit yes I have been through the checklist, changes are mostly just refactoring. Yes I have manually tested by deploying to a test cluster - outputs of the test run vs. master are in the comment I posted above. Confident there are no regressions. |
also note merging to master won't actually deploy or do anything other than update the code |
Cool, @davidkydd sounds good, I will take a good look in few hours, thanks, cheers for making sure and testing it. Feel free to add anything else in description of this PR, which will help me to understand any specific “exporter” change. 🙏 |
Sounds good :) - ping me later on teams when you are around and I can try answer any outstanding Qs. Exporter change is to support multiple exporters for a collector, simply iterates through them and calls export on each as it does currently for the single exporter, collecting any errors in a multierror as it iterates and then returning them. Also lets exporters be configured (enabled / disabled) via the yaml config. Otherwise its just refactoring of the main func. Should also mention I tested the version of the yaml which will end up in master runs fine - this includes the new configMaps but targets the current v0.3 version of the containerImage. Bit backwards as strictly we should push the new containerImage first and include the new image tag in the same commit as the other changes to the config, but this way makes things a bit easier for partners. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 looks good!! Thanks. it looks close to approve but just noticed 2 things, which need your eyes: I've spend last hour and half to identify the change and test in various way, but seems like with image: dakyddacr.azurecr.io/periscope/featureflag:v4
.
I manually tested and also tested from local file hooked to vscode.
Since the impact of this will be multiple tools which consumes this I have added all the test what I have done below. like vscode and cli.
2 things are:
-
New changes in Yaml will have delete failure or config-maps: Delete -f *.yaml - in vscode we use the namesake delete so its safe, and in cli we use that as well, I wonder do we have any other tool where this will enable exitCode == non-zero.
-
For some reason, when I clear the namespace using
kubectl delete -f aks-periscope
I get stuck every time toTeminating
state and I had to do manual termination. (by using the kubectl proxy with json data)
Flagging it to be sure that you don't see this behaviour at your end. screenshot below for the error.
Delete of Kubectl will fail with error of new config-maps as well:
Only at one occasion I was able to get this error but rest of the time it worked so I believe it must be the azure-sdk or storage.
2021/05/07 08:00:39 Collector: containerlogs, export data
2021/05/07 08:00:40 Collector: containerlogs, export data failed: 1 error occurred:
* Failed to create blob with storage error: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /app/vendor/github.com/Azure/azure-storage-blob-go/azblob/zc_storage_error.go:42
Possible Solution: (to move forward)
- If you are happy/agree/ready for urgent revert in case we found any depending tool failure then we can
squash merge
this as long as you are happy.
Otherwise, I would recommend exploring the above 2 errors and make sure they will not hinder any tool etc. Open for ideas.
Thanks heaps. 🙏
Many thanks @Tatsinnit - yeah I admit the errors from additional configmaps on delete are a bit ugly - vs code runs "delete ns aks-periscope" and CLI has --ignore-not-found set so shouldn't show up in either - perhaps older versions might not handle though..? Haven't seen either the storage error or delete hanging in Terminating you mention but they do give me pause - suggest holding off for now then until we can try repro. Thanks again bud! |
Anytime @davidkydd, no worries and sounds good: pause it is. 👍 thanks. Sounds worth to just double check in another machine for sure. In case you have same issue, I did this manual work to terminate the namespace :
thanks 🙏 |
# This is the 1st commit message: Revert "introduce arcmode" This reverts commit 5f4fed4. # This is the commit message #2: remove secrets # This is the commit message Azure#3: add print statement # This is the commit message Azure#4: update print statement # This is the commit message Azure#5: committed # This is the commit message Azure#6: committed # This is the commit message Azure#7: committed # This is the commit message Azure#8: committed # This is the commit message Azure#9: remove print statements # This is the commit message Azure#10: add helm collector # This is the commit message Azure#11: change helm command # This is the commit message Azure#12: add helm 3 installation # This is the commit message Azure#13: add curl command installation # This is the commit message Azure#14: change helm command # This is the commit message Azure#15: remove helm history # This is the commit message Azure#16: debug helm history # This is the commit message Azure#17: add repo command # This is the commit message Azure#18: change stable repo name # This is the commit message Azure#19: add write to file # This is the commit message Azure#20: add kured # This is the commit message Azure#21: change # This is the commit message Azure#22: changes # This is the commit message Azure#23: add default namespace # This is the commit message Azure#24: change # This is the commit message Azure#25: add integration test # This is the commit message Azure#26: changes # This is the commit message Azure#27: add helm test # This is the commit message Azure#28: change print statement to error # This is the commit message Azure#29: change # This is the commit message Azure#30: more changes # This is the commit message Azure#31: add go installation # This is the commit message Azure#32: fix unit test # This is the commit message Azure#33: iptables to Helm # This is the commit message Azure#34: add custom resource collector # This is the commit message Azure#35: add new exporter, diagnoser, collector # This is the commit message Azure#36: comment unused variables # This is the commit message Azure#37: debug exporter # This is the commit message Azure#38: filenames # This is the commit message Azure#39: test zip function # This is the commit message Azure#40: list files # This is the commit message Azure#41: fmt to log # This is the commit message Azure#42: delete lines # This is the commit message Azure#43: changed # This is the commit message Azure#44: get current directory # This is the commit message Azure#45: remove some print statements # This is the commit message Azure#46: test zip # This is the commit message Azure#47: changes # This is the commit message Azure#48: add windir check # This is the commit message Azure#49: minor fix # This is the commit message Azure#50: get hostname # This is the commit message Azure#51: add expose in dockerfile # This is the commit message Azure#52: add exec collector # This is the commit message Azure#53: mitigate exit code 126 # This is the commit message Azure#54: change curl url from example.com to dp endpoint # This is the commit message Azure#55: changes # This is the commit message Azure#56: uncomment exec # This is the commit message Azure#57: add new diagnoser # This is the commit message Azure#58: debugging # This is the commit message Azure#59: debug # This is the commit message Azure#60: debugging # This is the commit message Azure#61: remove print statements # This is the commit message Azure#62: remove print # This is the commit message Azure#63: add back crd print statement # This is the commit message Azure#64: change # This is the commit message Azure#65: change # This is the commit message Azure#66: update dataPoint name # This is the commit message Azure#67: modify forloop # This is the commit message Azure#68: add filename to datapoint # This is the commit message Azure#69: add back log prints # This is the commit message Azure#70: test # This is the commit message Azure#71: add fields to diagnostic signal # This is the commit message Azure#72: add config content to diagnoser # This is the commit message Azure#73: change format from yaml to json # This is the commit message Azure#74: add parameters for kubeobject config map # This is the commit message Azure#75: Revert "introduce arcmode" This reverts commit 5f4fed4. # This is the commit message Azure#76: fix helm collector style # This is the commit message Azure#77: revert changes that test arc customizations # This is the commit message Azure#78: fix merge conflicts # This is the commit message Azure#79: fix merge conflicts # This is the commit message Azure#80: Revert "Add v0.3 acr image for Private cluster fix. (Azure#22)" This reverts commit 49dd302. # This is the commit message Azure#81: fix merge conflicts # This is the commit message Azure#82: fix merge conflicts # This is the commit message Azure#83: add print statement # This is the commit message Azure#84: update print statement # This is the commit message Azure#85: committed # This is the commit message Azure#86: committed # This is the commit message Azure#87: committed # This is the commit message Azure#88: committed # This is the commit message Azure#89: remove print statements # This is the commit message Azure#90: fix merge conflicts # This is the commit message Azure#91: fix merge conflicts # This is the commit message Azure#92: fix merge conflicts # This is the commit message Azure#93: add repo command # This is the commit message Azure#94: change stable repo name # This is the commit message Azure#95: add write to file # This is the commit message Azure#96: add kured # This is the commit message Azure#97: change # This is the commit message Azure#98: changes # This is the commit message Azure#99: add default namespace # This is the commit message Azure#100: change # This is the commit message Azure#101: add integration test # This is the commit message Azure#102: changes # This is the commit message Azure#103: add helm test # This is the commit message Azure#104: change print statement to error # This is the commit message Azure#105: change # This is the commit message Azure#106: more changes # This is the commit message Azure#107: add go installation # This is the commit message Azure#108: fix unit test # This is the commit message Azure#109: add custom resource collector # This is the commit message Azure#110: fix merge conflicts # This is the commit message Azure#111: comment unused variables # This is the commit message Azure#112: debug exporter # This is the commit message Azure#113: filenames # This is the commit message Azure#114: test zip function # This is the commit message Azure#115: list files # This is the commit message Azure#116: fmt to log # This is the commit message Azure#117: delete lines # This is the commit message Azure#118: changed # This is the commit message Azure#119: get current directory # This is the commit message Azure#120: remove some print statements # This is the commit message Azure#121: test zip # This is the commit message Azure#122: changes # This is the commit message Azure#123: add windir check # This is the commit message Azure#124: minor fix # This is the commit message Azure#125: get hostname # This is the commit message Azure#126: add expose in dockerfile # This is the commit message Azure#127: add exec collector
…p level deployment yaml - to avoid changing the one used by downstream tools (vscode / cli)
(cherry picked from commit 910ed97)
@@ -0,0 +1,192 @@ | |||
apiVersion: v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 We really need to think more in depth about this please. This should be thought well with specific concrete scenarios and some kind of design doc as to advantages and disadvantages. We are duplicating deployment files and this is kind of a design thought, we/I need to think more and not making this part of a refactor. This should be more broader discussed before we go ahead and duplicate deployment files for small use-cases.
Pros and Cons, with clear advantage will give much better direction, because all these deployment files will still need to handled from single code logical flow, so under the hood all this will go through same logic flow, hence 1 deployment file like it is makes more sense.
📓 Rest looks good I liked the very initial refactor. we can try scoping the configMap
issue which we can see, and if there is a way not to get this delete error. I mean if its just for delete yaml
and we make it sure that its 100% safe operation, we could move forward. But creating separate yaml
will introduce bigger design challenge for tools which consume this.
Possibly exploring “ --ignore-not-found” to be added to the delete command for yaml, possibly update appendix.md. Surely making good test to check rest all scenarios work, and going back to the old one deployment file seems like good solution, if it’s same image I could do some more test tonight. Thanks 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems there are many things in this PR : features, helm chart, interfaces modifications. Could we split this PR or at lease re-arrange the commits so it is easier to read?
@@ -0,0 +1,8 @@ | |||
apiVersion: v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather use kustomize instead of a helm release. We can discuss this but I tend to avoid helm releases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for confusion - should have cut a new WIP branch and left this PR branch as is before pulling in the helm changes. Consider this PR on hold until we can discuss
@@ -0,0 +1,192 @@ | |||
apiVersion: v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the difference with charts/azure-k8s-periscope/templates/azure-k8s-periscope.yaml ? Why 2 templates ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should have been removed from the PR - the chart template would have been replacing it
This allows selection of which collectors and diagnosers are enabled via changes to the deployment yaml only.