-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add responseless dns requests to dns_data table. #613
Add responseless dns requests to dns_data table. #613
Conversation
Can one of the admins verify this patch? |
Changed Stichframes() to include the DNS requests in dns_data table which could not be matched with a response. Signed-off-by: Noman_ali_bajwa <nmnbajwa@gmail.com>
ok to test |
Changed Bazel.build file to include dependacny for stitcher_test.
Hey folks, how can i see the jenkins build log. The details hyperlink takes me to google sigin to continue to Pixie labs which says "Access blocked: Pixie Labs can only be used within its organization" |
We're working on improving our process so that the build logs are public. But for now, I DM'd you the logs. |
BLOCKED Hey folks, Jenkins build is failing with this error. Any pointer on this one ?
|
@Noman-Ali-Bajwa I think this is because the Jenkins build expects the diff to be pre-linted. We should figure out how to get the linters to run on your local environment. |
Signed-off-by: Noman Ali Bajwa <nmnbajwa@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: noman-xg The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: noman-xg The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Noman-Ali-Bajwa,
Thanks so much for this preview PR! So glad to have your contribution.
This is a great start. Leaving a few comments for discussion below.
// CustomStitchFrames is a re-implementation of the existing StitchFrames with an added | ||
// ability to record those DNS requests as well for which we could not match any response. | ||
// Flag include_respless_dns_requests must be set to include such DNS requests. | ||
RecordsWithErrorCount<Record> CustomStitchFrames(std::deque<Frame>* req_frames, | ||
std::deque<Frame>* resp_frames) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The high degree of similarity of this function with PxStitchFrames
suggests we should merge them and pass in an boolean argument that selects the behavior. Otherwise there's too much duplication, which could be a problem if there's some bug fix we need to make in the future.
Also, I know you asked about this on Slack. Thanks for putting this preview out so I could get a better grasp on how much duplication there actually ends up being :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge the two and push the changes soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in a3e9243
// After the external loop's lifecycle comes to an end we end up with the request deque | ||
// having only those request frames which have not been consumed yet i.e. consumed = false | ||
// so essentially these are the requests which could not be matched with any response frame. | ||
// Hence we iterate over this request deque, add a default response to it, make a record and | ||
// append those records at the end of the entries vector. | ||
auto it = req_frames->begin(); | ||
while (it != req_frames->end()) { | ||
if (!(*it).consumed) { | ||
Frame default_resp_frame; | ||
default_resp_frame.timestamp_ns = 99; | ||
StatusOr<Record> record_status = ProcessReqRespPair(*it, default_resp_frame); | ||
entries.push_back(record_status.ConsumeValueOrDie()); | ||
} | ||
it++; | ||
} | ||
return {entries, error_count}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the new part. A couple cases that we'll probably have to handle:
-
What if a request can't be matched with a response because the response is still in flight? The parser could run at any time, so we we have to be ready to handle that case. We might need to build in some assumptions about how delayed a response can be.
-
What's the rationale on
default_resp_frame.timestamp_ns = 99;
? We'll have to be careful with that because it could affect latency calculations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
What should we consider as a timeout for the dns request. I think we can handle that using timestamp_ns.
-
I have a workaround in mind to remove this. Will change in next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For (1), there's no set number, but nslookup uses a default of 2 seconds if I'm not mistaken. So that could be a starting point. Whatever we choose, it should definitely be a FLAG so we can change it later if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeout added in a3e9243
Made changes to set a timeout on DNS requests. Signed-off-by: Noman Ali Bajwa <nmnbajwa@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Noman-Ali-Bajwa for making the updates. Here are a few more on the latest version.
// append those records at the end of the entries vector. | ||
auto it = req_frames->begin(); | ||
while (it != req_frames->end()) { | ||
if (!(*it).consumed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: if (!it->consumed)
auto current_time = std::chrono::duration_cast<std::chrono::milliseconds>( | ||
std::chrono::system_clock::now().time_since_epoch()) | ||
.count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting the clock can become expensive when inside a loop. Can you get the current_time a single time outside the loop and use the fixed value inside the loop. While not strictly as accurate, it should be more than sufficient for this use case.
auto current_time = std::chrono::duration_cast<std::chrono::milliseconds>( | ||
std::chrono::system_clock::now().time_since_epoch()) | ||
.count(); | ||
auto elapsed_seconds = current_time - (*it).timestamp_ns; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: it->timestsamp_ns
|
||
if (elapsed_seconds > FLAGS_dns_request_timeout_threshold_milliseconds) { | ||
Frame default_resp_frame; | ||
default_resp_frame.timestamp_ns = 99; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still thinking about this one. At minimum, we could use the time of the timeout as the "response" time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get it out of the way today.
Changed dummy response_frame timestamp to be increment of request timestamp. Signed-off-by: Noman Ali Bajwa <nmnbajwa@gmail.com>
…y timeout threshold. Signed-off-by: Noman Ali Bajwa <nmnbajwa@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. One additional small comment. Thanks @Noman-Ali-Bajwa!
DEFINE_uint64(dns_request_timeout_threshold_milliseconds, 2000, | ||
"Number of seconds to wait for the in-flight response of a dns request."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add, in the description, that this flag depends on include_respless_dns_requests
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in ef1516a
Also, feel free to take off the |
Signed-off-by: Noman Ali Bajwa <nmnbajwa@gmail.com>
This pull request is related to #418 which relates to including those DNS requests into the dns_data table for which a matching response was not found.
Testing Done