Skip to content
This repository was archived by the owner on Aug 12, 2024. It is now read-only.

Use LookupSession for async lookups#47

Merged
fdfzcq merged 2 commits intomasterfrom
async-lookups-backwards-compatible
Sep 20, 2021
Merged

Use LookupSession for async lookups#47
fdfzcq merged 2 commits intomasterfrom
async-lookups-backwards-compatible

Conversation

@fdfzcq
Copy link
Collaborator

@fdfzcq fdfzcq commented Sep 6, 2021

Original PR by @ibauersachs : #46
This aims to mitigate performance issue caused by lookups using common forkjoin pool as described in dnsjava/dnsjava#211 .
In dnsjava version 3.4.2-SNAPSHOT @ibauersachs has added support for using custom executors in resolvers dnsjava/dnsjava@dbda661
Changes:

  • Add asynchronous methods using LookupSession
  • Allow specifying custom executor in DnsSrvResolver builder
  • Add performance test (but marked Ignored as the test itself is flaky and timing dependent)

Next steps from Spotify side are:

  • Update services/libraries to use async methods with custom executors.
  • Remove deprecated methods and remove LookupFactory class.

@ibauersachs
Copy link
Contributor

I assume this is still WIP?

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 7, 2021

@ibauersachs yes, we are adding a performance test case and CachingLookupFactory and related classes should be removed. But we can wait with deprecation of LookupFactory and moving tests to junit5 for now.

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 7, 2021

@ibauersachs When testing LookupSession we found out dnsjava/dnsjava#212 do affect us. We use CNAME records internally to redirect traffic and saw the following error while looking up those records:

Resolving: _spotify-concat._grpc.services.gew1.spotify.net.
_spotify-palindrome._grpc.services.gae2.spotify.net. ... failed!
java.util.concurrent.ExecutionException: com.spotify.dns.DnsException: Lookup of '_spotify-palindrome._grpc.services.gae2.spotify.net.' failed: Multiple CNAME RRs not allowed, see RFC1034 3.6.2 
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
	at com.spotify.dns/com.spotify.dns.DnsLookupPerformanceTest.resolve(DnsLookupPerformanceTest.java:87)
	at com.spotify.dns/com.spotify.dns.DnsLookupPerformanceTest.lambda$runTest$0(DnsLookupPerformanceTest.java:58)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.spotify.dns.DnsException: Lookup of '_spotify-palindrome._grpc.services.gae2.spotify.net.' failed: Multiple CNAME RRs not allowed, see RFC1034 3.6.2 

@fdfzcq fdfzcq force-pushed the async-lookups-backwards-compatible branch from 8caac86 to 7cb9151 Compare September 7, 2021 11:37
@ibauersachs
Copy link
Contributor

Please use the 3.4.2-SNAPSHOT, it's fixed there. The sendAsync(Message, Executor) overloads are also there since yesterday evening.

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 7, 2021

@ibauersachs great, I'll try that, thank you!

@ibauersachs
Copy link
Contributor

Note that the performance/deadlock test initially posted in dnsjava/dnsjava#211 isn't reliably failing. My dev-machine has an 8-core CPU and I had to lower the timeout to << 5s to reproduce the problem. If you use Executors.newFixedThreadPool(10) on a powerful machine to run the tests, you might not get the results you expect.

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 7, 2021

@ibauersachs yes that's a good point, I'm thinking that at this point we might not need DnsLookupPerformanceTest.java

@ibauersachs
Copy link
Contributor

It might still be useful to keep that test, but maybe add @Disabled("Needs network access and is timing dependent"). This will make sure the code gets refactored as the library evolves and you can still execute it on demand and with the limitations pointed out.

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 7, 2021

@ibauersachs Thank you! We will look into using ResolverConfig, currently we have 200+ services and libraries relying on the synchronous method and we will need to update them first.

@fdfzcq fdfzcq requested a review from a team September 7, 2021 13:23
@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 8, 2021

@ibauersachs thank you for reviewing!

@fdfzcq fdfzcq requested a review from a team September 8, 2021 09:58
@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 8, 2021

@ibauersachs do you plan to make a release of 3.4.2 in dnsjava any time soon? otherwise we could go with the snapshot version.

@ibauersachs
Copy link
Contributor

I don't have a date in mind yet, but most likely not before next week.

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 13, 2021

@ibauersachs Is it possible for you to make a release early this week? :) We think we can merge this PR but would like to avoid using the snapshot version.

@ibauersachs
Copy link
Contributor

@fdfzcq Sorry, no. There's still a very related issue open in the DoH resolver that needs fixing (locking the initial request and limiting the number of parallel requests blocks the async threadpool). I hope I can finish it this weekend.

Did you do enough tests with the snapshot in the meantime so we can be confident the changes really work as expected and you don't need an immediate 3.4.3 follow-up?

@fdfzcq
Copy link
Collaborator Author

fdfzcq commented Sep 17, 2021

@ibauersachs Thank you for the information, and no worries we can wait for the next release. We've tested resolving using custom executor on a prod service that had issues and can confirm it works for us.

Copy link
Contributor

@erikjoh erikjoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some final nit-picks

ibauersachs and others added 2 commits September 20, 2021 10:38
Add new methods while deprecating old ones to keep backwards compatibility.
Add custom executor to DnsSrvResolvers.java
Use dnsjava version 3.4.2
@fdfzcq fdfzcq force-pushed the async-lookups-backwards-compatible branch from 0003102 to 7558328 Compare September 20, 2021 08:38
@fdfzcq fdfzcq merged commit a9a4a8c into master Sep 20, 2021
@fdfzcq fdfzcq deleted the async-lookups-backwards-compatible branch September 20, 2021 08:40
This was referenced Sep 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants