Gson synchronized map replaced with concurrent hash map #806
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g. "I signed it!").

CLAs look good, thanks!
…ackward compatibility
What happens if you extract a TypeAdapter and keep a reference to it? There's no reason to go through that cache when you can just keep the TypeAdapter you need in a field.
As I said before, I can't: I'm serializing complex objects which contain maps/collections of other objects, and a custom type adapter is registered for every such object (~50 adapters in total), so I need this cache. Otherwise I'd have to rewrite the whole Gson class myself to use these adapters directly. There's no reason to keep an obvious performance bottleneck in Gson either :)
It's a mistake to call into the Gson instance multiple times if you can avoid it. What's your top-level object? You want to look up the adapter for that exactly once, which will recursively fetch all of its dependent adapters. In other words, rather than removing contention while reading from the map, I think your best fix is to not read from the map at all.
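The pattern suggested here (look the adapter up once and keep it in a field) might be sketched as follows. `Event` and its fields are hypothetical stand-ins for the complex payload type under discussion; `Gson.getAdapter` and `TypeToken.get` are the real Gson APIs:

```java
import com.google.gson.Gson;
import com.google.gson.TypeAdapter;
import com.google.gson.reflect.TypeToken;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Map;

public class AdapterReuse {
    // Hypothetical payload type, used only for illustration.
    static class Event {
        String name;
        Map<String, List<Integer>> samples;
    }

    static final Gson GSON = new Gson();

    // Looked up exactly once; this recursively resolves the adapters for
    // Map, List and Integer too, so later calls never touch Gson's
    // internal type-token cache at all.
    static final TypeAdapter<Event> EVENT_ADAPTER =
            GSON.getAdapter(TypeToken.get(Event.class));

    static String encode(Event e) {
        try {
            return EVENT_ADAPTER.toJson(e);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    static Event decode(String json) {
        try {
            return EVENT_ADAPTER.fromJson(json);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Since the lookup happens during class initialization, the per-call cache read disappears from the hot path; `toJson`/`fromJson` on the extracted adapter go straight to the streaming layer.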
I'm trying to keep Gson usage as simple as I can, so I call Gson#fromJson(object, clazz), which should be a very common and obvious use case of Gson. I don't want adapters and other JSON-related classes to leak into my code (at least I want to limit such places).

What you are offering is nothing but a hack that solves the problem at the wrong level: it limits Gson usage and forces me to write more duplicate code (which is already written inside Gson) for reader creation, the lenient check, exception handling, and whatever-else-happens-in-fromJson. In fact, once I have multiple top-level objects in different places, I will end up with my own wrapper holding a Gson instance and a concurrent map of adapters, plus an additional explanation of why I did this instead of a plain Gson#fromJson call. Suggesting that I reimplement part of Gson is not the way such problems should be solved.

My solution simply solves the problem in the right place, without any performance degradation at all (even in the single-threaded case) or incompatible API changes, so everyone benefits: overall Gson performance increases, old code isn't broken, problem solved. In other words, rather than rewriting my code around Gson and letting its model classes propagate into it, I think the best fix is to remove the synchronizedMap usage and replace it with something modern that has the same contract.
I agree, if synchronized map is not performing well, we should look at alternatives. |
@swankjesse What is the advantage of using synchronizedMap over ConcurrentMap? |
As with many optimizations, this one is about tradeoffs. Here we have two options:
And there's actually a third option that makes everyone happy: avoid using the cache altogether. Look up a type adapter instance from the Gson instance once, and avoid all contention. This keeps memory low for mobile and makes servers even faster.
@swankjesse Gson.toJson()/fromJson() work really well for most use cases. Getting an adapter and using it directly is cumbersome, and sometimes doesn't produce correct results (I have had issues in the past where it would serialize nulls incorrectly, or not pretty-print). We should not impose our programming style on others. How real is the memory overhead? Gson creates a lot of maps internally (and anonymous inner classes with specific type adapters for various java.* types), so this one ConcurrentMap can't be all that big. From what I read, ConcurrentHashMaps perform really well. I think we should accept this PR.
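For reference, the two cache flavors being compared expose the same `Map` interface but behave very differently under concurrent reads. A minimal stdlib sketch (the field names here are illustrative, not Gson's actual code):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheFlavors {
    // Collections.synchronizedMap: every get() acquires the wrapper's
    // single monitor, so concurrent readers queue behind one lock.
    static final Map<String, Integer> SYNCHRONIZED =
            Collections.synchronizedMap(new HashMap<>());

    // ConcurrentHashMap: get() is non-blocking, so cache hits from many
    // threads proceed in parallel once the map is warmed up.
    static final Map<String, Integer> CONCURRENT = new ConcurrentHashMap<>();

    // Simulates the warm-up phase after which the cache is read-mostly.
    static void warmUp() {
        for (int i = 0; i < 100; i++) {
            SYNCHRONIZED.put("key" + i, i);
            CONCURRENT.put("key" + i, i);
        }
    }
}
```

Both maps give the same answers; the difference is purely in how readers contend, which is exactly why swapping one for the other is API-compatible.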
Works for me. |
Thanks @qwwdfsad for the pull request and @swankjesse for accommodating it. |
@qwwdfsad Also love the detailed write-up in the PR. I am assuming you ran all the tests and they passed.
Travis CI runs all the tests. You'll see a big red X if they fail. On Sat, Mar 5, 2016, 4:57 PM inder123 notifications@github.com wrote:
@JakeWharton Good point, so yes, not needed.
Thank you! |
Gson synchronized map replaced with concurrent hash map
Internally, the Gson class uses Collections#synchronizedMap as a thread-safe map for typeTokenCache.
In a highly contended environment I'm experiencing unpredictable latency spikes, while the performance of Gson#toJson degrades significantly even for simple objects. In such situations it's typical for an application thread to hang on typeTokenCache#get for 3–15 ms on average. See these benchmark results to observe the problem (results for the patch included).
For more or less generic serialization code it's impossible to use TypeAdapter directly. One solution might be to use thread-local Gson instances, but from a programmer's perspective that makes the code more obfuscated, requires an additional explanation of why it was done, and floods the heap with duplicate objects. Gson is declared thread-safe, so it seems reasonable to expect fine-grained concurrency internally. Moreover, after the initial application warm-up, typeTokenCache becomes a mostly read-only structure (in fact, read-only), so allowing non-blocking reads looks like a perfect fit.
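The thread-local workaround dismissed above would look roughly like the sketch below. `Serializer` is a hypothetical stand-in for a Gson instance, and the counter exists only to make the per-thread duplication visible:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadSerializer {
    static final AtomicInteger INSTANCES = new AtomicInteger();

    // Stand-in for com.google.gson.Gson. One instance per thread means
    // per-thread adapter caches: no cross-thread lock contention, but the
    // cache contents are duplicated on the heap once per thread.
    static class Serializer {
        Serializer() {
            INSTANCES.incrementAndGet();
        }
    }

    // Each thread lazily gets (and keeps) its own instance.
    static final ThreadLocal<Serializer> LOCAL =
            ThreadLocal.withInitial(Serializer::new);
}
```

Every thread that touches `LOCAL` allocates a fresh instance with its own caches, which is exactly the heap-duplication cost the description objects to.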
The provided patch replaces the synchronized map with j.u.c.ConcurrentHashMap. The main contract difference between a synchronized map and CHM is that CHM prohibits null keys, so an explicit code path for null is added.
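A minimal sketch of that idea, assuming a compute-on-miss cache: a ConcurrentHashMap for all real keys, plus an explicit side path for the one null key that CHM rejects. This illustrates the technique; it is not the patch itself:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class NullTolerantCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    // Dedicated slot for the single key ConcurrentHashMap cannot store.
    private volatile V valueForNull;

    public V get(K key, Function<K, V> compute) {
        if (key == null) {              // CHM throws NPE on null keys,
            V v = valueForNull;         // so handle null before touching it
            if (v == null) {
                v = compute.apply(null);
                valueForNull = v;
            }
            return v;
        }
        V v = cache.get(key);           // non-blocking read on the hot path
        if (v == null) {
            v = compute.apply(key);
            cache.put(key, v);          // racy but harmless when compute is
        }                               // deterministic, as a type lookup is
        return v;
    }
}
```

Two threads may race to compute the same entry, but since the computation is deterministic the last write is equivalent to the first, which is the same relaxed guarantee the original synchronized-map cache relied on.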
As you can see in the benchmark results, even with the additional code path for null keys, overall throughput increases even in the single-threaded case due to the absence of explicit synchronization (the JVM optimizes uncontended locks, but some overhead is still present), so there is no risk of performance degradation associated with this patch.