Skip to content

Memory allocation research #4509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
Closed

Memory allocation research #4509

wants to merge 11 commits into from

Conversation

codebrain
Copy link
Contributor

The following PR has resulted from an investigation into memory allocation within the client.

System.Buffers.ArrayPool

  • In instances where a byte[] is allocated from the ArrayPool and needs to be resized, it is now correctly returned to the ArrayPool first and then a new buffer with the larger length is rented. This prevents "over-renting" from the pool and pushing the underlying implementation to force Gen2 cleanup at a later date.
  • The ISO8601TimeSpanFormatter was renting from the ArrayPool (default 64Kb buffer length) for a simple day calculation. In this instance it is likely more efficient to use a local byte[]

RecyclableMemoryStreamFactory

The client now uses RecyclableMemoryStreamFactory everywhere in instances where the interface is accepted. The MemoryStreamFactory class is persisted for use by customers in instances where they prefer to new() memory streams (e.g. in cron jobs and short-lived applications).

Pretty Printing

This dead code has been removed.

@russcam russcam self-assigned this Mar 25, 2020
Copy link
Contributor

@russcam russcam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great 👍

Do you have

  • benchmark runs that compare latency and allocations before this change, and after? I think benchmarks with bulk and search should demonstrate
  • some perfview screenshots that show rented and returned counts for an integration test run?

I think both of these would be good to share more widely, and would serve as artifacts that we can come back to later.

@Mpdreamz
Copy link
Member

Several integration tests are failing:

https://dev.azure.com/azure3/elasticsearch-net/_build/results?buildId=3412&view=logs&j=887c6f23-4c3d-5668-6783-9f77056b8369&t=231ee5e6-21f5-558b-5c1b-be75406f6d88&l=1855

Which indicate some offset issues in the buffers that get emitted.

Collapsed output here to preserve details over time # Audit trail of this API call: - [1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.0262053 # OriginalException: Elasticsearch.Net.ElasticsearchClientException: Request failed to execute. Call: Status code 500 from: POST /project/_search?pretty=true&error_trace=true&typed_keys=false. ServerError: Type: json_parse_exception Reason: "Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]" # Request: faultSerializer" : "written" "field":"numberOfCommits","ranges":[{"to":100.0},{"from":100.0,"to":500.0},{"from":500.0}]}}},"query":{"term":{"type":{"value":"project"}}},"size":0} # Response: { "error" : { "root_cause" : [ { "type" : "json_parse_exception", "reason" : "Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]", "stack_trace" : "[Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]]; nested: JsonParseException[Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]];\n\tat org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:639)\n\tat org.elasticsearch.ElasticsearchException.generateFailureXContent(ElasticsearchException.java:567)\n\tat org.elasticsearch.rest.BytesRestResponse.build(BytesRestResponse.java:138)\n\tat org.elasticsearch.rest.BytesRestResponse.(BytesRestResponse.java:96)\n\tat org.elasticsearch.rest.BytesRestResponse.(BytesRestResponse.java:91)\n\tat org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:243)\n\tat org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:337)\n\tat org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:317)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.handleIncomingRequest(AbstractHttpServerTransport.java:367)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.incomingRequest(AbstractHttpServerTransport.java:296)\n\tat org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:66)\n\tat org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:31)\n\tat io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.channelRead(Netty4HttpPipeliningHandler.java:58)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)\n\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\nCaused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]\n\tat com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702)\n\tat com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3528)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3502)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._matchToken(UTF8StreamJsonParser.java:2824)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:855)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:772)\n\tat org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:52)\n\tat org.elasticsearch.search.builder.SearchSourceBuilder.parseXContent(SearchSourceBuilder.java:1050)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.parseSearchRequest(RestSearchAction.java:131)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.lambda$prepareRequest$1(RestSearchAction.java:110)\n\tat org.elasticsearch.rest.RestRequest.withContentOrSourceParamParserOrNull(RestRequest.java:443)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.prepareRequest(RestSearchAction.java:109)\n\tat org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:92)\n\tat org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:240)\n\t... 48 more\n" } ], "type" : "json_parse_exception", "reason" : "Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]", "stack_trace" : "com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'faultSerializer': was expecting 'null', 'true', 'false' or NaN\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3a8582b5; line: 1, column: 17]\n\tat com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702)\n\tat com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3528)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3502)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._matchToken(UTF8StreamJsonParser.java:2824)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:855)\n\tat com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:772)\n\tat org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:52)\n\tat org.elasticsearch.search.builder.SearchSourceBuilder.parseXContent(SearchSourceBuilder.java:1050)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.parseSearchRequest(RestSearchAction.java:131)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.lambda$prepareRequest$1(RestSearchAction.java:110)\n\tat org.elasticsearch.rest.RestRequest.withContentOrSourceParamParserOrNull(RestRequest.java:443)\n\tat org.elasticsearch.rest.action.search.RestSearchAction.prepareRequest(RestSearchAction.java:109)\n\tat org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:92)\n\tat org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:240)\n\tat org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:337)\n\tat org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:317)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.handleIncomingRequest(AbstractHttpServerTransport.java:367)\n\tat org.elasticsearch.http.AbstractHttpServerTransport.incomingRequest(AbstractHttpServerTransport.java:296)\n\tat org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:66)\n\tat org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:31)\n\tat io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.channelRead(Netty4HttpPipeliningHandler.java:58)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)\n\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\n" }, "status" : 500 }

@codebrain
Copy link
Contributor Author

codebrain commented Mar 26, 2020

Perfview collection of System.Buffers.ArrayPool can be undertaken with:

Perfview > Collect > Collect

Complete details as shown, setting the Additional Providers to: *System.Buffers.ArrayPoolEventSource

image

Click Start Collection

@codebrain
Copy link
Contributor Author

A further investigation into the failing test has revealed some more subtle problems.

When you run unit and integration tests it is entirely possible to get error-free runs (hence this PR!) but when re-running you will be presented with odd deserialisation errors.

Looking into those deserialisation errors, it appears as though the BinaryUtil functions are still operating on byte[] buffers that have been allocated from the pool and this is causing problems when they are returned and reused.

This issue then surfaces in the deserialisation path where you see invalid json being presented to the deserialiser. In essence this invalid json is often a partial mix of the correct response from the server, jumbled with json from a prior test result.

Obviously this is not good.

So, I think this leaves us in the following situation, and options to choose from:

Option 1

We cherry-pick the items from this PR that we deem to be safe:

  • ISO8601TimeSpanFormatter change to use local byte[] for day components
  • Removal of dead Pretty Print code
  • Prefer use of RecyclableMemoryStreamFactory.

We can pull these into the mainline and leave the "overrenting" issue in the wild. The downside being that are forcing clean-up of the ArrayPool to the internal implementation, which eventually forces a Gen2 GC.

Option 2

I have updated this PR to now completely remove the System.Buffers.ArrayPool and to allocate byte[] through the new keyword.

As a starting point I have set the buffer length to a small amount (1024 bytes). Providing these buffers don't go over 85,000 in length then they will be Gen0 collected - which is fast. Anything over that length and we are into LOH GC, but it is hard to determine if this is any worse than Option 1 in a long lived application.

Option 3

My least favorite, but we could consider bringing in the System.Buffers.ArrayPool.Shared default implementation and exposing some methods to be able to check if a byte[] buffer is rented or not and branch as appropriate.

Option 4

Attempting to bring in System.Buffers.ArrayPool alongside the UTF8Json is what is causing us problems here. In order to correctly implement the pooled memory it is going to require a fairly substantial refactoring of the codebase to ensure that byte[] operations are "pool aware", much the same way as implemented in Newtonsoft; https://github.com/JamesNK/Newtonsoft.Json/blob/6b9f467e817854532ea31e6c08abe47c53ac8b5c/Src/Newtonsoft.Json/Utilities/JavaScriptUtils.cs#L44-L76

This is a non-trivial refactoring as BinaryUtil is used in hundreds of different places.

Thoughts @russcam @Mpdreamz ?

@codebrain
Copy link
Contributor Author

Option 5

See: #4211

@russcam
Copy link
Contributor

russcam commented Mar 26, 2020

As we spoke about over zoom, my preference would be to do Option 1 now, and further research Option 2 against the current implementation, when it can be prioritized. My assumption is that it is probably better than overrenting, but it would be good to validate this assumption with metrics.

@codebrain
Copy link
Contributor Author

TODO: Cherry pick in c86f255 and determine if unit tests fail from reverting all RecyclableMemoryStreamFactory to MemoryStreamFactory

@russcam
Copy link
Contributor

russcam commented Aug 6, 2020

We discussed this again a couple of weeks ago. At this current time, we've decided not to pull in changes to RecyclableMemoryStreamFactory and byte array resizing because at last investigation, there are still places where deserialization errors occur, implying that there are further places that require investigation for this PR. In light of a desire to move to System.Text.Json, we feel it's better to focus on moving over to that.

Closing.

@russcam russcam closed this Aug 6, 2020
@Mpdreamz Mpdreamz deleted the research/memorypool-rent branch November 9, 2020 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants