Skip to content

[ML] Clearing JSON memory allocators #30

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 5, 2018
Merged

[ML] Clearing JSON memory allocators #30

merged 6 commits into from
Apr 5, 2018

Conversation

edsavage
Copy link
Contributor

Use scoped JSON memory allocators to avoid excessive memory accumulation
where-ever the JSON output writer is persisted.

Fixes #26

Use scoped JSON memory allocators to avoid excessive memory accumulation
where-ever the JSON output writer is persisted.

Fixes #26
@@ -430,6 +431,10 @@ const CJsonOutputWriter::TStrVec &CJsonOutputWriter::fieldNames(void) const
bool CJsonOutputWriter::writeRow(const TStrStrUMap &dataRowFields,
const TStrStrUMap &overrideDataRowFields)
{
typedef core::CScopedRapidJsonPoolAllocator<core::CRapidJsonConcurrentLineWriter> TScopedAllocator;
Copy link
Contributor

@tveasey tveasey Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per #32, can we switch to using TScopedAllocator = ... here and throughout.

@tveasey
Copy link
Contributor

tveasey commented Apr 3, 2018

Otherwise, this is LGTM. The two questions I have are:

  1. What changed to cause us to write out so many more documents?
  2. Will this lose any advantages of using the pool allocator? Clearly, as is they are too widely scoped and the base allocator will be allocating a lot of extra blocks, but is this the sweet spot in terms of number of allocators to use?

Replacing typedef declarations with 'using' instead.
@edsavage
Copy link
Contributor Author

edsavage commented Apr 3, 2018

Thanks for the review comments @tveasey.

The extra Json documents in use in CForecastDataSink came in to existence when I refactored the Json output writer and memory allocator.

The scoped memory allocator is cached, it does not actually get destroyed at end of scope. Rather, it simply calls clear on the underlying pool allocator. There is very little overhead compared to a raw pool allocator, which would need an explicit call to clear when done with it.

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for clarifying! LGTM.

@hendrikmuhs
Copy link

Thanks @edsavage!

Let me check this out and test using my test procedure described in #26.

Added comments to CRapidJsonWriterBase to clarify the intent and purpose
of the allocator stack and the interaction with CScopedRapidJsonPoolAllocator
// for the writer at nested scope.
//
// Note that allocators are not destroyed by the pop operation, they persist for the lifetime of the
// writer in a cache for swift retrieval.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use //! comments so that this is included in the Doxygen output.

@@ -100,6 +100,18 @@ class CRapidJsonWriterBase : public JSON_WRITER<OUTPUT_STREAM, SOURCE_ENCODING,
public:
using TRapidJsonWriterBase = JSON_WRITER<OUTPUT_STREAM, SOURCE_ENCODING, TARGET_ENCODING, STACK_ALLOCATOR, WRITE_FLAGS>;

// Instances of this class may very well be long lived, potentially for the lifetime of the application.
// Over the course of that lifetimme resources will accumulate in the underlying rapidjson memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lifetimme |-> lifetime

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hendrikmuhs
Copy link

I tested it and can not reproduce #26 with this fix.

@edsavage edsavage merged commit efa2fb5 into elastic:master Apr 5, 2018
edsavage added a commit that referenced this pull request Apr 5, 2018
Use scoped JSON memory allocators to avoid excessive memory accumulation
where-ever the JSON output writer is persisted.

Fixes #26
@edsavage edsavage deleted the memory_accumulation branch April 5, 2018 10:01
droberts195 pushed a commit that referenced this pull request Apr 23, 2018
Use scoped JSON memory allocators to avoid excessive memory accumulation
where-ever the JSON output writer is persisted.

Fixes #26
droberts195 pushed a commit that referenced this pull request Apr 23, 2018
[ML] Clearing JSON memory allocators

Use scoped JSON memory allocators to avoid excessive memory accumulation
where-ever the JSON output writer is persisted.

Fixes #26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants