-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement anonymizer's main program #2621
Conversation
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "Validation step is not working because the difference in the expected and actual json"? Which validation?
Hi @yurishkuro Thank you so much for the review. I will fix them. By saying "validation step is not working", I meant I am unable to load the anonymised json from the json tab in the search panel due to the difference in the produced json. It keeps rotating while loading the json from the file. After manually modifying the json to the expected.json (https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c) it works as expected. |
I assume you're talking about the UI. Is there an error in the Javascript console? |
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Hi @yurishkuro I have updated the PR based on your feedbacks. Those were great suggestions. Hopefully I didn't miss any of those. It's ready for review again.
There is no error in javascript console. No error logs in the all-in-one docker image too. Thank you once again for the help. |
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Hi @yurishkuro Thank you for the review again. Those definitely made the PR lot better. I have tried my best to fix those and it would be great to have your review again. Regards. |
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Codecov Report
@@ Coverage Diff @@
## master #2621 +/- ##
==========================================
- Coverage 95.07% 95.06% -0.01%
==========================================
Files 209 210 +1
Lines 9368 9410 +42
==========================================
+ Hits 8907 8946 +39
- Misses 385 388 +3
Partials 76 76
Continue to review full report at Codecov.
|
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
…nto anonymizer-2 Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Hi @yurishkuro Thank you for the feedbacks. I tried my best to update the PR based on my understanding. May I ask for your review again please. Regards. |
|
||
// Close closes the captured and anonymized files. | ||
func (w *Writer) Close() { | ||
w.capturedFile.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're missing things like w.capturedFile.WriteString("\n]\n")
. See L132 - that code should be moved into Close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how did you test it? are you able to load the resulting fie in the UI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yurishkuro
This is the output from a sample run: https://gist.github.com/Ashmita152/68904c2ffe2dc40bf17984ffc981cace
It doesn't load in the UI since the json isn't as expected I guess, no errors in js console: https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c
I am unable to figure out the reason about this inconsistency but my guess is that UI accepts a complete trace json while we are uploading one particular span json.
I have a question: why do we write span by span while writing to file, shouldn't we be first constructing a trace out of those spans and then write to file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I see the confusion. The original purpose of the Writer was to use it while tailing a long stream of spans (in Kafka) to share them with researchers. It does not have the notion of a trace because spans are not ordered/grouped by trace ID in a stream. We had another internal binary that used these packages as a library.
For the purpose of this ticket, we know that we're dealing with a single trace. However, the Writer does not have the logic to write that trace in exactly the UI format, so you may need to add that. You can download a sample trace from the UI (via JSON button in the trace view) to see where the formatting is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that clears my doubt. Sure I will work on implementing that in writer. Do we need to keep the functionality backward compatible ie. allowing current tailing stream of spans too ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, please keep backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused about how to implement the logic to write the trace in the UI format. To me, it seems like we can't use WriteSpan function. I am thinking to do:
- Implement a WriteTrace function in writer.go
- Change QueryTrace in query.go which is now returning []model.Span. I am thinking to modify it to return model.Trace.
- In main.go instead of calling writer.writeSpan, call writer.writerTrace.
I would like to get your opinion before start implementing this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, the writer is not exactly compatible with saving a single trace. Let's merge this PR, since it achieved the basic anonymization of a trace, just doesn't output UI-compatible file, which can be added separately.
TBH, our internal program worked differently, in two stages - first we'd run it on a stream extracting lots of spans into an anonymized file. Then we had a separate command in the binary that would take a trace ID, scan the previously save file, pick the spans for the given trace ID, and output just that trace in the UI format. It would be weird to do it the same way here because the first run is already taking a trace ID as input (unlike our case where input was undifferentiated stream of spans from many traces). However, we could still implement it the same way, just invoke the second step automatically on the file already generated by your current code.
Let me do it as a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
* Bump opentelemetry-collector to v0.14.0 (jaegertracing#2617) * bump opentelemetry-collector to v0.14.0 Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * initialize traceid and spanid explicitly Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * fix comments and empty parent span check Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * fasten convert traceID/spanID Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * fix convertTraceID + fix tests Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * Update CodeQL to latest best practices (jaegertracing#2615) This will parallelize your analysis and speed things up a bunch. Signed-off-by: jhutchings1 <jhutchings1@users.noreply.github.com> Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de> * Fix flaky TestReload (jaegertracing#2624) Signed-off-by: albertteoh <albert.teoh@logz.io> * Update x/text to v0.3.4 (jaegertracing#2625) Signed-off-by: Gary Brown <gary@brownuk.com> * Bump to latest UI for snapshot builds (jaegertracing#2626) Signed-off-by: Yuri Shkuro <github@ysh.us> * Implement anonymizer's main program (jaegertracing#2621) * Preparing release 1.21.0 (jaegertracing#2630) * updated changelog Signed-off-by: Joe Elliott <number101010@gmail.com> * Added ui changelog Signed-off-by: Joe Elliott <number101010@gmail.com> * Fixed UI changelog to point to 1.12.0 Signed-off-by: Joe Elliott <number101010@gmail.com> * Updated jaeger-ui to v1.12.0 Signed-off-by: Joe Elliott <number101010@gmail.com> * Resolving concerns Signed-off-by: Joe Elliott <number101010@gmail.com> * [anonymizer] Save trace in UI format (jaegertracing#2629) * Use fossa-contrib/fossa-action instead (jaegertracing#2571) * Use fossa-contrib/fossa-action instead Signed-off-by: Sora Morimoto <sora@morimoto.io> * Make step name clearer Signed-off-by: Sora Morimoto <sora@morimoto.io> Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com> * Update Makefile and Dockerfile for anonymizer (jaegertracing#2632) Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com> * Fix listen IP in unit test (jaegertracing#2636) Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com> * Bump opentelemetry to v0.15.0 (jaegertracing#2634) * Bump opentelemetry to v0.15.0 Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * add default value instead of nil value for jaegerreceiver config Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> * make lint Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu> Co-authored-by: Kositsyn Pavel <vemmy124@gmail.com> Co-authored-by: Justin Hutchings <jhutchings1@users.noreply.github.com> Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de> Co-authored-by: Albert <26584478+albertteoh@users.noreply.github.com> Co-authored-by: Gary Brown <gary.pi4tech@gmail.com> Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com> Co-authored-by: Ashmita <ashmita.bohara152@gmail.com> Co-authored-by: Joe Elliott <number101010@gmail.com> Co-authored-by: Sora Morimoto <sora@morimoto.io> Co-authored-by: ZouYu <zouy.fnst@cn.fujitsu.com> Co-authored-by: Kositsyn Pavel <kositsyn.pa@phystech.edu>
Which problem is this PR solving?
Implement anonymizer's main program, per #2556
Short description of the changes
Validation
Validation step is not working because the difference in the expected and actual json.
https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c
I am unable to figure out why this difference is happening and what should be the right way to fix.