Skip to content

Redefine section on sizing data nodes #90274

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Sep 30, 2022

Conversation

kingherc
Copy link
Contributor

Now that we have the estimated field mappings heap overhead in nodes stats, we can refer to them in the guide for sizing data nodes appropriately.

Relates to #86639

Now that we have the estimated field mappings heap overhead
in nodes stats, we can refer to them in the guide for sizing
data nodes appropriately.

Relates to elastic#86639
@github-actions
Copy link
Contributor

Documentation preview:

@kingherc kingherc self-assigned this Sep 22, 2022
@kingherc kingherc added >docs General docs changes :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. and removed v8.6.0 labels Sep 22, 2022
@kingherc kingherc marked this pull request as ready for review September 22, 2022 18:42
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine elasticsearchmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:Docs Meta label for docs team labels Sep 22, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments.

I thought we could do the same thing with the Master-eligible nodes should have at least 1GB of heap per 3000 indices section, using GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size*. But now that I look again, this would work out to be much more lenient than the 3k-indices-per-GB limit we have today in most cases. I guess we're missing quite a lot of other overheads in the total_deduplicated_mapping_size stat.

@DaveCTurner
Copy link
Contributor

Ok there's still a bunch of other per-index overhead in 8.5 that should be much reduced in 8.6 so let's not worry too much about that. I would still like us to mention GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size* somewhere, but perhaps it makes more sense in this field-count-recommendation section.

@kingherc
Copy link
Contributor Author

Hi @DaveCTurner , I fixed most stuff, please review.

I would still like us to mention GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size* somewhere, but perhaps it makes more sense in this field-count-recommendation section.

I am not sure where to put this? Is this only for master-eligible nodes? Should I put this in the Master-eligible nodes should have at least 1GB of heap per 3000 indices section? But there's no mention of "deduplicated" mappings there. And I'm not aware of how the deduplicated mappings would translate to heap requirements -- would it be again 1KiB per deduplicated mapping?

@kingherc
Copy link
Contributor Author

Ping for reviewing

@DaveCTurner
Copy link
Contributor

Ah sorry I thought I'd replied, thanks for the ping :)

All nodes hold the cluster state (which contains mappings) so the total_deduplicated_mapping_size represents overhead everywhere. I think the field-count-recommendation section is a reasonable place to mention it.

I'm not aware of how the deduplicated mappings would translate to heap requirements

The total_deduplicated_mapping_size is exactly the heap requirement in bytes, no translation is necessary.

@kingherc kingherc added cloud-deploy Publish cloud docker image for Cloud-First-Testing and removed cloud-deploy Publish cloud docker image for Cloud-First-Testing labels Sep 29, 2022
@kingherc
Copy link
Contributor Author

@DaveCTurner , made an attempt to include the deduplicated fields size in the section. Please review. There was a mention of adding a +0.5GB extra overhead for data nodes -- now I've made it so it applies to all nodes because I think the workload overhead would be for all nodes, right? if this is not correct/wanted, please tell me and I can refine it so that the extra 0.5GB overhead should only be considered for data nodes only.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left some minor wording tweaks.

`mappings.total_count`, `mtc`, `mappingsTotalCount`::
Number of mappings, including <<runtime,runtime>> and <<object,object>> fields.

`mappings.total_estimated_overhead_in_bytes`, `mteoi`, `mappingsTotalEstimatedOverheadInBytes`::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR but why mteoi and not mteo? I think we can fix this without BwC concerns if we do so before 8.5.0 is released.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it. Would adding a version 8.5.0 label to the PR be enough for it to be backported to 8.5.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs the 8.5.0 label but backporting is a separate process. I'll help on the PR itself if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, and added a label for backporting, I hope this will help, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I was expecting a separate PR for that but I see you've done it here. I suspect it doesn't matter much since we want to backport these docs too, but the code change is a little time-critical since it must land before we cut the final 8.5.0 BC (we can fix the docs later). Also it's a different kind of change, really a >non-issue since it relates to an unreleased bug and the labelling starts to get a bit confusing if we combine things like this. Still this is good to go now. Labels look good.

adequate resources for your workload and that your overall sharding strategy
meets all your performance requirements. See also <<single-thread-per-shard>>
and <<each-shard-has-overhead>>.
==== The heap of nodes should suffice for the fields, plus overheads
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
==== The heap of nodes should suffice for the fields, plus overheads
==== Allow enough heap on data nodes for field mappers and overheads

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the first section on deduplicated fields applies to all nodes, whether they are data nodes or not, right? That's why I did not mention "data nodes" in the title. Should I leave it generic or mention "data nodes"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, but left it generic (without "data node").

@DaveCTurner
Copy link
Contributor

👍 that's right, I think that applies to all nodes really.

Co-authored-by: David Turner <david.turner@elastic.co>
kingherc and others added 5 commits September 29, 2022 14:01
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
@kingherc
Copy link
Contributor Author

@DaveCTurner please review once more, and the open conversations above

`mappings.total_count`, `mtc`, `mappingsTotalCount`::
Number of mappings, including <<runtime,runtime>> and <<object,object>> fields.

`mappings.total_estimated_overhead_in_bytes`, `mteoi`, `mappingsTotalEstimatedOverheadInBytes`::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I was expecting a separate PR for that but I see you've done it here. I suspect it doesn't matter much since we want to backport these docs too, but the code change is a little time-critical since it must land before we cut the final 8.5.0 BC (we can fix the docs later). Also it's a different kind of change, really a >non-issue since it relates to an unreleased bug and the labelling starts to get a bit confusing if we combine things like this. Still this is good to go now. Labels look good.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more optional minor rewording suggestions but LGTM. Your point about this applying to non-data nodes is a good one.

kingherc and others added 3 commits September 30, 2022 11:54
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: David Turner <david.turner@elastic.co>
@kingherc kingherc merged commit ad8d064 into elastic:main Sep 30, 2022
@kingherc kingherc deleted the docs/86639-explain-node-mappings branch September 30, 2022 09:37
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.5

kingherc added a commit to kingherc/elasticsearch that referenced this pull request Sep 30, 2022
Now that we have the estimated field mappings heap overhead
in nodes stats, we can refer to them in the guide for sizing
data nodes appropriately.

Relates to elastic#86639
@kingherc
Copy link
Contributor Author

@DaveCTurner can you verify the backport PR which was created is good? I see version 8.5.1 in there rather than 8.5.0. Is that fine or will it still be in 8.5.0? I wonder if any other steps are needed to backport it to 8.5.0 after the FF.

elasticsearchmachine pushed a commit that referenced this pull request Sep 30, 2022
Now that we have the estimated field mappings heap overhead
in nodes stats, we can refer to them in the guide for sizing
data nodes appropriately.

Relates to #86639
@DaveCTurner
Copy link
Contributor

Yes all good we fix up the labels after release to reflect the set of PRs that were actually included.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >docs General docs changes >refactoring Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:Docs Meta label for docs team v8.5.0 v8.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants