Gson synchronized map replaced with concurrent hash map #806

qwwdfsad · 2016-03-04T21:40:58Z

Internally Gson class uses Collections#synchronizedMap as a thread-safe map for typeTokenCache.
In a highly-contended environment I'm experiencing unpredictable latency spikes,
while the performance of Gson#toJson degrades significantly even for simple objects.
It's typical for an application thread to hang on typeTokenCache#get average 3-15 ms in such situations.
See this benchmark results to observe this problem (results for patch included).

For a more or less generic serialization code it's impossible to use TypeAdapter directly. One of the solutions might be to use thread-local Gson instances, but from programmer's perspective it makes code more obfuscated, requires an additional explanation on why this was done and floods the heap with duplicate objects. Gson is declared as thread-safe, so it seems reasonable to expect fine-grained concurrency level internally. Moreover, after initial application warmup typeTokenCache becomes a mostly read-only structure (in fact, read-only), so allowing non-blocking reads looks like a perfect fit.

Provided patch replaces synchronized map with j.u.c.ConcurrentHashMap. The main difference between synchronized map and CHM is null keys prohibition, so implicit code-path for null is added.
As you can see in benchmark results, even despite of additional code path for null keys, overall throughput increases even in single-threaded case due to absence of explicit synchronization (though JVM optimizes uncontended locks, some overhead is still present), so there is no risks or performance degradation associated with this patch.

googlebot · 2016-03-04T21:41:02Z

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
If you signed the CLA as a corporation, please let us know the company's name.

qwwdfsad · 2016-03-04T21:44:11Z

I signed it!

googlebot · 2016-03-04T21:44:12Z

CLAs look good, thanks!

…ackward compatibility

swankjesse · 2016-03-05T02:50:49Z

What happens if you extract a TypeAdapter and keep a reference to it? There’s not reason to go through that cache when you can just keep the TypeAdapter you need in a field.

qwwdfsad · 2016-03-05T09:01:37Z

As I said before, I can't: I'm serializing complex objects which contain maps/collections of other objects, and for every such object custom type adapter is registered (~50 adapters in total), so I need this cache. Otherwise I should write whole Gson class myself to use this adapters directly. There’s not reason to keep obvious performance bottleneck in Gson either :)

swankjesse · 2016-03-05T13:01:52Z

Its a mistake to call into the Gson instance multiple times if you can avoid it. What's your top-level object? You want to lookup the adapter for that exactly once, which will recursively fetch all of its dependent adapters.

In other words, rather than removing contention while reading from the map I think your best fix is to not read from the map at all.

qwwdfsad · 2016-03-05T14:51:35Z

I'm trying to keep Gson usage as simple as I can, so I call Gson#fromJson(object, clazz), which should be very common and obvious use case of Gson. I don't want adapters and other json-related classes leak to my code (at least I want to limit such places). What you are offering is nothing but hack solving problem on wrong level which limits gson usage and forces me to write more duplicate code (which is already written inside of Gson) for reader creation, lenitent check, exception handling and whatever-else-happens-in-fromJson-method (in fact when I got multiple top-level objects in different places I will end up with my own wrapper with gson and concurrent map of adapters inside) and write additional explanation of why I've done this instead of plain gson#fromJson call. Suggestion to rewrite part of gson implementation is not the way such problems should be solved. And my solution just solves the problem in the right place without any performance degradation at all (even for single-threaded case) or incompatible API changes, so everyone will benefit: performance of overall gson usage increases, old code isn't broken, problem solved.

In other words, rather than rewriting gson and propagating it's model classes to my code I think best fix is to remove synchronizedMap usage and replace it with something modern with the same contract.

inder123 · 2016-03-05T18:32:09Z

I agree, if synchronized map is not performing well, we should look at alternatives.
My question is what do we lose? Does the performance degrade in normal operations?

inder123 · 2016-03-05T18:33:46Z

@swankjesse What is the advantage of using synchronizedMap over ConcurrentMap?
I think we should accept this change.

swankjesse · 2016-03-05T19:09:53Z

As with many optimizations, this one is about tradeoffs. Here we have two options:

The synchronized map has a small memory footprint. It’s best for mobile.
The ConcurrentHashMap has low contention under heavy load. This is best on servers.

And there’s actually a third option that makes everyone happy. Avoid using the cache altogether. Lookup a type adapter instance from the Gson instance once, and avoid all contention. This keeps the memory low for mobile & makes servers even faster than the ConcurrentHashMap.

I think TypeAdapter’s toJson() and fromJson() methods have better behavior than similar methods on Gson; these don’t do weird things for backwards compatibility like force leniency or hide exceptions.

inder123 · 2016-03-05T20:10:38Z

@swankjesse Gson.toJson()/fromJson() work really well for most use-cases. Getting an adapter and using it is cumbersome, and sometimes doesn't produce correct results (I have had issues in the past where it will serialize nulls incorrectly, or not pretty print). We should not impose our programming style on others.

How real is the memory overhead? Gson creates a lot of maps internally (and anonymous inner classes with specific type adapters for various java.* types), this specific ConcurrentMap can't be all that big. From what I read, ConcurrentHashMaps perform really well:
http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
http://stackoverflow.com/questions/510632/whats-the-difference-between-concurrenthashmap-and-collections-synchronizedmap

I think we should accept this PR.

swankjesse · 2016-03-05T21:40:25Z

Works for me.

inder123 · 2016-03-05T21:54:10Z

Thanks @qwwdfsad for the pull request and @swankjesse for accommodating it.
👍

Gson synchronized map replaced with concurrent hash map

inder123 · 2016-03-05T21:57:39Z

@qwwdfsad Also love the detailed write up PR. I am assuming you ran all the tests and they passed.
Ideally, in the PR you should cut-n-paste command-line print to show that. Something like:

Results :

Tests run: 986, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ gson ---
[INFO] Adding existing MANIFEST to archive. Found under: /Users/inder/tmp/gson/gson/target/classes/META-INF/MANIFEST.MF
[INFO] Building jar: /Users/inder/tmp/gson/gson/target/gson-2.6.3-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ gson ---
[INFO] Installing /Users/inder/tmp/gson/gson/target/gson-2.6.3-SNAPSHOT.jar to /Users/inder/.m2/repository/com/google/code/gson/gson/2.6.3-SNAPSHOT/gson-2.6.3-SNAPSHOT.jar
[INFO] Installing /Users/inder/tmp/gson/gson/pom.xml to /Users/inder/.m2/repository/com/google/code/gson/gson/2.6.3-SNAPSHOT/gson-2.6.3-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Gson Parent ........................................ SUCCESS [  1.671 s]
[INFO] Gson ............................................... SUCCESS [  5.048 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS

JakeWharton · 2016-03-05T21:58:54Z

Travis CI runs all the tests. You'll see a big red X if they fail.

On Sat, Mar 5, 2016, 4:57 PM inder123 notifications@github.com wrote:

@qwwdfsad https://github.com/qwwdfsad Also love the detailed write up
PR. I am assuming you ran all the tests and they passed.
Ideally, in the PR you should cut-n-paste command-line print to show that.
Something like:

Results :

Tests run: 986, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ gson ---
[INFO] Adding existing MANIFEST to archive. Found under: /Users/inder/tmp/gson/gson/target/classes/META-INF/MANIFEST.MF
[INFO] Building jar: /Users/inder/tmp/gson/gson/target/gson-2.6.3-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ gson ---
[INFO] Installing /Users/inder/tmp/gson/gson/target/gson-2.6.3-SNAPSHOT.jar to /Users/inder/.m2/repository/com/google/code/gson/gson/2.6.3-SNAPSHOT/gson-2.6.3-SNAPSHOT.jar
[INFO] Installing /Users/inder/tmp/gson/gson/pom.xml to /Users/inder/.m2/repository/com/google/code/gson/gson/2.6.3-SNAPSHOT/gson-2.6.3-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Gson Parent ........................................ SUCCESS [ 1.671 s]
[INFO] Gson ............................................... SUCCESS [ 5.048 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS

—
Reply to this email directly or view it on GitHub
#806 (comment).

inder123 · 2016-03-05T21:59:24Z

@JakeWharton Good point, so yes, not needed.

qwwdfsad · 2016-03-06T12:05:23Z

Thank you!
Any chances there will be 2.6.3 soon?

Gson synchronized map replaced with concurrent hash map

Gson synchronized map replaced with concurrent hash map

c561184

More appropriate usage of null key surrogate in Gson#getAdapter for b…

31dcfa3

…ackward compatibility

inder123 self-assigned this Mar 5, 2016

inder123 added a commit that referenced this pull request Mar 5, 2016

Merge pull request #806 from qwwdfsad/master

a02f575

Gson synchronized map replaced with concurrent hash map

inder123 merged commit a02f575 into google:master Mar 5, 2016

sgbrown mentioned this pull request Mar 10, 2016

respect the lenient flag in Gson.fromJson #804

Closed

sebasjm pushed a commit to sebasjm/gson that referenced this pull request Mar 11, 2018

Merge pull request google#806 from qwwdfsad/master

2461cd8

Gson synchronized map replaced with concurrent hash map

Marcono1234 mentioned this pull request Jul 25, 2022

Gson typeToken cache contentions #570

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gson synchronized map replaced with concurrent hash map #806

Gson synchronized map replaced with concurrent hash map #806

qwwdfsad commented Mar 4, 2016

googlebot commented Mar 4, 2016

qwwdfsad commented Mar 4, 2016

googlebot commented Mar 4, 2016

swankjesse commented Mar 5, 2016

qwwdfsad commented Mar 5, 2016

swankjesse commented Mar 5, 2016

qwwdfsad commented Mar 5, 2016

inder123 commented Mar 5, 2016

inder123 commented Mar 5, 2016

swankjesse commented Mar 5, 2016

inder123 commented Mar 5, 2016

swankjesse commented Mar 5, 2016

inder123 commented Mar 5, 2016

inder123 commented Mar 5, 2016

JakeWharton commented Mar 5, 2016

inder123 commented Mar 5, 2016

qwwdfsad commented Mar 6, 2016

Gson synchronized map replaced with concurrent hash map #806

Gson synchronized map replaced with concurrent hash map #806

Conversation

qwwdfsad commented Mar 4, 2016

googlebot commented Mar 4, 2016

qwwdfsad commented Mar 4, 2016

googlebot commented Mar 4, 2016

swankjesse commented Mar 5, 2016

qwwdfsad commented Mar 5, 2016

swankjesse commented Mar 5, 2016

qwwdfsad commented Mar 5, 2016

inder123 commented Mar 5, 2016

inder123 commented Mar 5, 2016

swankjesse commented Mar 5, 2016

inder123 commented Mar 5, 2016

swankjesse commented Mar 5, 2016

inder123 commented Mar 5, 2016

inder123 commented Mar 5, 2016

JakeWharton commented Mar 5, 2016

inder123 commented Mar 5, 2016

qwwdfsad commented Mar 6, 2016