Reduce Heap Use during Shard Snapshot #60370

Conversation

@original-brownbear (Contributor) commented Jul 29, 2020

Instances of `BlobStoreIndexShardSnapshots` can be of non-trivial size (because they store some segment files directly as raw bytes). When snapshotting a large number of shards, the previous execution order led to heap use proportional to the number of shards for these objects. With this change, the number of these objects on heap is bounded by the size of the snapshot pool (except in the BwC format path).
This PR writes them to the repository at the earliest possible point in time so that they can be garbage collected.
If shard generations are used, we can safely write them right at the beginning of the shard snapshot. If shard generations are not used, we can only write them at the end of the shard snapshot, after all other blobs have been written; but with this change at least the initial version of `BlobStoreIndexShardSnapshots` read from the repository goes out of scope right away, which should improve this case a little as well.

NOTE: in other spots where we handle collections of these blobs, their size is less of an issue because we always map them to something cheaper (like a list of blob names during deletes) right after loading them. This is the only spot where we could have very long-lived instances of these objects.

Closes #60173
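The reordering the description walks through can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Elasticsearch code: every name except `BlobStoreIndexShardSnapshots` is hypothetical, and the real logic lives in the blob-store repository's shard snapshot path.

```java
import java.util.List;

// Minimal sketch of the reordering described in this PR. Only
// BlobStoreIndexShardSnapshots is a real Elasticsearch name; the
// methods and the class itself are hypothetical stand-ins.
public class ShardSnapshotSketch {

    // Stand-in for the potentially large per-shard metadata object.
    record BlobStoreIndexShardSnapshots(List<String> fileNames) {}

    static String snapshotShard(boolean useShardGenerations) {
        BlobStoreIndexShardSnapshots snapshots =
                new BlobStoreIndexShardSnapshots(List.of("__segment_0", "__segment_1"));
        final String writtenAt;
        if (useShardGenerations) {
            // Shard-generations path: write the index blob up front so the
            // large object can be garbage collected while data blobs upload.
            writeIndexBlob(snapshots);
            writtenAt = "start";
            snapshots = null; // goes out of scope before the uploads
            uploadDataBlobs();
        } else {
            // BwC path: the blob can only be written after all other blobs,
            // so the object stays reachable until the end.
            uploadDataBlobs();
            writeIndexBlob(snapshots);
            writtenAt = "end";
        }
        return writtenAt;
    }

    static void writeIndexBlob(BlobStoreIndexShardSnapshots s) { /* repository write */ }

    static void uploadDataBlobs() { /* segment file uploads */ }

    public static void main(String[] args) {
        System.out.println(snapshotShard(true));   // start
        System.out.println(snapshotShard(false));  // end
    }
}
```

The key point is that in the shard-generations path the large object becomes unreachable before the potentially long data uploads start.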

@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine added the Team:Distributed (Obsolete) label Jul 29, 2020
@ywelsch (Contributor) commented Jul 29, 2020

We wanted to bound the number of concurrent snapshots a data node can take anyway (typically bounded by the max size of the threadpool), so I wonder if that more general change would be sufficient to address this, without introducing more complexity in this part of the code.
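The bound ywelsch describes (limiting concurrent shard snapshots by the threadpool's max size) can be illustrated generically. This is a sketch using a plain `java.util.concurrent` fixed pool, not Elasticsearch's actual snapshot pool: a fixed-size executor naturally caps how many per-shard tasks, and thus how many large in-flight metadata objects, exist at any moment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Generic illustration (not Elasticsearch code): with a fixed pool of
// N threads, at most N shard-snapshot tasks run concurrently, no matter
// how many shards are queued.
public class BoundedSnapshotPool {

    public static int maxObservedConcurrency(int shards, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxRunning = new AtomicInteger();
        for (int i = 0; i < shards; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                maxRunning.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // stand-in for uploading a shard's blobs
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return maxRunning.get();
    }

    public static void main(String[] args) throws Exception {
        // 100 queued shards, pool of 4: concurrency never exceeds 4.
        System.out.println(maxObservedConcurrency(100, 4) <= 4);
    }
}
```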

@original-brownbear (Author)

so I wonder if that more general change would be sufficient to address this, without introducing more complexity in this part of the code.

It would help as well, but this change helps more: even if we bound the number of snapshots, it would still be helpful in reducing memory use on data nodes.
I don't really find this change to add that much complexity. Yes, it changes the order of things, but it also turns multiple `if (shardGenerations)` checks into a single one :)

@ywelsch (Contributor) left a comment

I only see this as a short-term fix, with the long-term fix being to reduce the level of parallel snapshot operations on a node.

@original-brownbear (Author) commented Jul 30, 2020

Thanks Yannick

I only see this as a short-term fix, with the long-term fix being to reduce the level of parallel snapshot operations on a node.

++ I think I have an idea of how to achieve that with few changes as well; will look into it soon.

@original-brownbear merged commit d561686 into elastic:master Jul 30, 2020
@original-brownbear deleted the improve-memory-use-blobstore-repository branch July 30, 2020 07:41
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jul 30, 2020
original-brownbear added a commit that referenced this pull request Jul 30, 2020
@original-brownbear restored the improve-memory-use-blobstore-repository branch August 6, 2020 18:28
Labels
:Distributed Coordination/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs)
>non-issue
Team:Distributed (Obsolete)
v7.10.0
v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

Elastic Snapshots cause out of memory exceptions
4 participants