dedup dns children by (rdtype, child) not parent host #3126
Open
liquidsec wants to merge 4 commits into
emit_dns_children was keyed on (parent_host, rdtype, child_host), so the same out-of-scope nameserver was re-emitted once per in-scope parent that referenced it. With concentrated providers like Cloudflare or MarkMonitor this multiplied NS/SOA emissions by 1000x+ and flooded downstream module queues.
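To make the multiplier concrete, here is a minimal, self-contained sketch that counts emissions under both key shapes. It is not BBOT's actual code: the `count_emissions` helper and the sample hostnames are hypothetical, and only the two key shapes come from the description above.

```python
def count_emissions(records, key_fn):
    """Count how many child events survive dedup under the given key function."""
    seen = set()
    emitted = 0
    for parent_host, rdtype, child_host in records:
        key = key_fn(parent_host, rdtype, child_host)
        if key not in seen:
            seen.add(key)
            emitted += 1
    return emitted


# Hypothetical data: 1,000 in-scope parents that all delegate to one nameserver.
records = [(f"site{i}.example", "NS", "ns1.provider.example") for i in range(1000)]

# Old key: the parent is part of the key, so every parent re-emits the child.
old_count = count_emissions(records, lambda parent, rdtype, child: (parent, rdtype, child))

# New key: the parent is left out, so the child is emitted once.
new_count = count_emissions(records, lambda parent, rdtype, child: (rdtype, child))

print(old_count, new_count)  # 1000 1
```

With the parent in the key, the number of emissions scales with the number of parents; without it, the count scales only with the number of distinct (rdtype, child) pairs.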
both tests relied on the (parent, rdtype, child) emit_dns_children dedup to produce duplicate DNS_NAME/IP_ADDRESS edges across parents. with the new (rdtype, child) dedup these collapse to a single emission.
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             dev   #3126   +/-   ##
=====================================
+ Coverage     90%     90%     +1%
=====================================
  Files        441     441
  Lines      38743   38757     +14
=====================================
+ Hits       34663   34677     +14
  Misses      4080    4080
Summary
DNSResolve.emit_dns_children deduped its outgoing DNS_NAME events with the key (parent_host, rdtype, child_host). Including the parent host in the key meant the same out-of-scope NS/SOA/MX/CNAME value was re-emitted once per in-scope parent that referenced it.

In a recent scan of ~100 corporate domains, this produced:

beth.ns.cloudflare.com alone was emitted 1,518 times. With Cloudflare and MarkMonitor concentrating many zones onto the same NS pair, the multiplier on consolidated providers is severe.

The 16K duplicate distance-1 events all flowed into every scan module that watches DNS_NAME (dnsbrute, dnscommonsrv, wayback, hunterio, sslcert, excavate, …). Each module had to pull them off its incoming queue, hash them for dedup, run scope and filter_event, and then drop them, serializing queue throughput and slowing scans.

Fix
Drop event.host from the dedup hash. The same (rdtype, child) pair across different parents is the same child event semantically.

Adds TestDNSResolveSharedNameserverDedup to catch regressions: three in-scope domains share an NS/SOA pair; the test asserts each shared nameserver hostname is emitted exactly once across all parents.
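The shape of that regression check can be sketched without BBOT's test harness. Everything below is a hypothetical stand-in: `emit_children` mimics the new dedup with a plain set, and the domain and nameserver names are made up. The real test class runs against the project's module test framework, which is not reproduced here.

```python
from collections import Counter


def emit_children(children_by_parent):
    """Yield each (rdtype, child) pair once, no matter how many parents reference it."""
    seen = set()
    for _parent, children in children_by_parent.items():
        for rdtype, child in children:
            key = (rdtype, child)  # parent host deliberately left out of the key
            if key in seen:
                continue
            seen.add(key)
            yield rdtype, child


# Three hypothetical in-scope domains delegating to the same nameserver pair.
shared_ns = [("NS", "ns1.provider.example"), ("NS", "ns2.provider.example")]
children_by_parent = {
    "a.example": shared_ns,
    "b.example": shared_ns,
    "c.example": shared_ns,
}

emitted = list(emit_children(children_by_parent))
per_host = Counter(child for _rdtype, child in emitted)

# Each shared nameserver hostname should appear exactly once across all parents.
assert per_host == {"ns1.provider.example": 1, "ns2.provider.example": 1}
```

If the parent were added back into `key`, each hostname would be counted three times and the final assertion would fail, which is the regression the test is meant to catch.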