-
-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further thoughts on performance improvements #418
Comments
Another room for improvement: reference at(const char* const key)
{
static thread_local object_t::key_type k;
k.assign(key);
return at(key); // call the original `at`
} Having said that, modern compilers use SBO-optimized std::string implementations, so the needless malloc/free churn when creating a temporary string occurs only in GCC<5, so this might be not worth the effort. |
I recently (bd0326c, b1441f3, 0f04e42) started with some micro-optimizations using cachegrind to measure the number of executed instructions. This may not be a precise measurement that can be transferred to any architecture, but it helps to get a feeling where time is actually spent. Of course, wall-clock benchmarks are also required. I shall continue focussing on the (Sorry for hijacking this issue, but I did not find a better place to dump these thoughts) |
This is precisely what the ticket was about, so you're not hijacking anything : ) Another place for optimization is unescaping unicode. Currently the code allocates a string for every character (not really a problem with SBO string implementations). |
Moved all dump()-related functions into a class "serializer". This fix includes a lot of performance improvements yielding a 7% speedup for serialization. Details on the individual steps can be found in the commit messages. Individual benchmark numbers: before: dump jeopardy.json 5 374555228 ns/op dump jeopardy.json with indent 5 430953700 ns/op dump numbers/floats.json 5 622938509 ns/op dump numbers/signed_ints.json 20 82177979 ns/op after: dump jeopardy.json 5 335449757 ns/op -11% dump jeopardy.json with indent 5 375467773 ns/op -13% dump numbers/floats.json 5 584611852 ns/op -7% dump numbers/signed_ints.json 20 68289574 ns/op -17%
I tried adding an optimizing overload for string literals, but in both cases this results in ambiguous overload resolution : / // prevent heap-allocating construction of a temporary string
// when accessing by string literal by reusing a thread_local storage.
#if 0
const_reference operator[](const char* key const) const
#else
template<int N>
const_reference operator[](const char (&key)[N]) const
#endif
{
static thread_local typename object_t::key_type s;
s.assign(key);
return (*this)[s];
} |
You must add some enable_if to avoid the array-to -pointer decay, the |
Internally, the parser now writes its result into a JSON value provided as a reference. To be usable, the public interfaces need to be extended.
I think the ideas here are realized by now. |
My analysis of the performance bottlenecks showed that passthrough (non-instantiating) parsing performance is comparable to that of
wc
(i.e. near-perfect), and the major bottleneck is the usual bane of performant programming -malloc
s andfree
s of many small objects. Can this be improved without resorting to custom allocators game? I think so. The performance becomes an issue when there are many json records to parse. In real-world scenarios the records will have the same schema - just the values that are different. Under this assumption, if we have an old json object that is no longer needed, it can be reused to update the values within, rather than let it expire,free
ing the contents, and then allocate everything de-novo in a newjson
. This would entail having a function likejson::parse(istream& istr, json& dest)
that would take an existing record and reuse the existing fields where possible, making allocations only when necessary (longer strings, longer arrays, keys in maps that are absent in the old object, etc).The text was updated successfully, but these errors were encountered: