Redefine section on sizing data nodes #90274

kingherc · 2022-09-22T18:03:13Z

Now that we have the estimated field mappings heap overhead in nodes stats, we can refer to them in the guide for sizing data nodes appropriately.

Relates to #86639

github-actions · 2022-09-22T18:03:26Z

Documentation preview:

✨ Changed pages

elasticsearchmachine · 2022-09-22T18:43:12Z

Pinging @elastic/es-docs (Team:Docs)

elasticsearchmachine · 2022-09-22T18:43:13Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner

I left a couple of comments.

I thought we could do the same thing with the "Master-eligible nodes should have at least 1GB of heap per 3000 indices" section, using `GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size*`. But now that I look again, in most cases this would work out to be much more lenient than the 3k-indices-per-GB limit we have today. I guess the `total_deduplicated_mapping_size` stat is missing quite a lot of other overheads.

docs/reference/how-to/size-your-shards.asciidoc

DaveCTurner · 2022-09-23T09:08:50Z

Ok, there's still a bunch of other per-index overhead in 8.5 that should be much reduced in 8.6, so let's not worry too much about that. I would still like us to mention `GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size*` somewhere, but perhaps it makes more sense in this field-count-recommendation section.

docs/reference/how-to/size-your-shards.asciidoc

kingherc · 2022-09-23T10:55:18Z

Hi @DaveCTurner, I fixed most of the comments, please review.

> I would still like us to mention `GET /_cluster/stats?human&filter_path=indices.mappings.total_deduplicated_mapping_size*` somewhere, but perhaps it makes more sense in this field-count-recommendation section.

I am not sure where to put this. Is this only for master-eligible nodes? Should I put this in the "Master-eligible nodes should have at least 1GB of heap per 3000 indices" section? But there's no mention of "deduplicated" mappings there. And I'm not aware of how the deduplicated mappings would translate to heap requirements. Would it again be 1KiB per deduplicated mapping?

kingherc · 2022-09-28T08:31:58Z

Ping for reviewing

DaveCTurner · 2022-09-28T09:05:14Z

Ah sorry I thought I'd replied, thanks for the ping :)

All nodes hold the cluster state (which contains mappings) so the total_deduplicated_mapping_size represents overhead everywhere. I think the field-count-recommendation section is a reasonable place to mention it.

> I'm not aware of how the deduplicated mappings would translate to heap requirements

The total_deduplicated_mapping_size is exactly the heap requirement in bytes, no translation is necessary.
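For illustration, a minimal sketch of reading that stat from a cluster stats response. The request path is the one quoted in this thread; the `total_deduplicated_mapping_size_in_bytes` key is an assumption inferred from the trailing `*` in the filter path, and the sample response body and its values are hypothetical.

```python
import json

# Stat name as quoted in the thread; the trailing "*" in the filter path is
# assumed to match both the human-readable and the "_in_bytes" variants.
STAT_PREFIX = "indices.mappings.total_deduplicated_mapping_size"


def cluster_stats_path() -> str:
    """Build the request path mentioned in the review comments."""
    return f"/_cluster/stats?human&filter_path={STAT_PREFIX}*"


def deduplicated_mapping_bytes(response_body: str) -> int:
    """Extract the deduplicated mapping size in bytes from a response body.

    Assumes the response nests the value under indices.mappings, mirroring
    the filter path.
    """
    stats = json.loads(response_body)
    return stats["indices"]["mappings"]["total_deduplicated_mapping_size_in_bytes"]


# Hypothetical response body, for illustration only.
SAMPLE_RESPONSE = json.dumps(
    {
        "indices": {
            "mappings": {
                "total_deduplicated_mapping_size": "1.5kb",
                "total_deduplicated_mapping_size_in_bytes": 1536,
            }
        }
    }
)

print(cluster_stats_path())
print(deduplicated_mapping_bytes(SAMPLE_RESPONSE))
```

Since the value is already a byte count, it can be added directly to any other heap estimate without conversion.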

kingherc · 2022-09-29T10:36:05Z

@DaveCTurner, I made an attempt to include the deduplicated fields size in the section. Please review. There was a mention of adding an extra 0.5GB of overhead for data nodes. I've now made it apply to all nodes, because I think the workload overhead would affect all nodes, right? If this is not correct or wanted, please tell me and I can refine it so that the extra 0.5GB overhead is considered for data nodes only.
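As a rough sketch of the arithmetic discussed here, one reading of the thread is that a node's minimum heap is the sum of the deduplicated mapping size from cluster stats, the estimated field mapper overhead from nodes stats, and the extra 0.5GB. The function name and the example inputs are hypothetical, and treating "0.5GB" as 0.5 GiB is an assumption.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# The extra "0.5GB" workload overhead from the comment above, interpreted
# here as 0.5 GiB (an assumption).
WORKLOAD_OVERHEAD_BYTES = GIB // 2


def estimated_min_heap_bytes(dedup_mapping_bytes: int, mapper_overhead_bytes: int) -> int:
    """Sum the two mapping-related overheads and the fixed workload overhead.

    dedup_mapping_bytes: deduplicated mapping size held in the cluster state.
    mapper_overhead_bytes: estimated field mapper overhead reported per node.
    """
    return dedup_mapping_bytes + mapper_overhead_bytes + WORKLOAD_OVERHEAD_BYTES


# Hypothetical inputs: 50 MiB of deduplicated mappings, 200 MiB of mapper overhead.
heap = estimated_min_heap_bytes(50 * MIB, 200 * MIB)
print(heap // MIB, "MiB")
```

On a node that holds no shards the mapper overhead term would be zero, which leaves only the cluster state mappings and the fixed overhead.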

DaveCTurner

Looks good, I left some minor wording tweaks.

DaveCTurner · 2022-09-29T10:40:48Z

docs/reference/cat/nodes.asciidoc

+`mappings.total_count`, `mtc`, `mappingsTotalCount`::
+Number of mappings, including <<runtime,runtime>> and <<object,object>> fields.
+
+`mappings.total_estimated_overhead_in_bytes`, `mteoi`, `mappingsTotalEstimatedOverheadInBytes`::


Unrelated to this PR but why mteoi and not mteo? I think we can fix this without BwC concerns if we do so before 8.5.0 is released.

I can change it. Would adding a version 8.5.0 label to the PR be enough for it to be backported to 8.5.0?

It needs the 8.5.0 label but backporting is a separate process. I'll help on the PR itself if needed.

Fixed, and added a label for backporting, I hope this will help, right?

Ah I was expecting a separate PR for that but I see you've done it here. I suspect it doesn't matter much since we want to backport these docs too, but the code change is a little time-critical since it must land before we cut the final 8.5.0 BC (we can fix the docs later). Also it's a different kind of change, really a >non-issue since it relates to an unreleased bug and the labelling starts to get a bit confusing if we combine things like this. Still this is good to go now. Labels look good.
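For reference, a minimal sketch of requesting the two new columns from the cat nodes API by their full names, which sidesteps the short alias under discussion. The `h` and `format=json` parameters are standard cat API options; the assumption here is that JSON rows are keyed by the requested column names with string values, and the sample row is hypothetical.

```python
import json

# Full column names taken from the diff hunk above.
COLUMNS = [
    "name",
    "mappings.total_count",
    "mappings.total_estimated_overhead_in_bytes",
]


def cat_nodes_path() -> str:
    """Build a cat nodes request path selecting only the columns of interest."""
    return "/_cat/nodes?format=json&h=" + ",".join(COLUMNS)


def overhead_by_node(response_body: str) -> dict[str, int]:
    """Map each node name to its estimated mapping overhead in bytes.

    Assumes each JSON row is keyed by the requested column names and that
    values arrive as strings.
    """
    rows = json.loads(response_body)
    return {
        row["name"]: int(row["mappings.total_estimated_overhead_in_bytes"])
        for row in rows
    }


# Hypothetical response body, for illustration only.
SAMPLE_RESPONSE = json.dumps(
    [
        {
            "name": "node-1",
            "mappings.total_count": "1200",
            "mappings.total_estimated_overhead_in_bytes": "1228800",
        }
    ]
)

print(cat_nodes_path())
print(overhead_by_node(SAMPLE_RESPONSE))
```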

DaveCTurner · 2022-09-29T10:42:37Z

docs/reference/how-to/size-your-shards.asciidoc

-adequate resources for your workload and that your overall sharding strategy
-meets all your performance requirements. See also <<single-thread-per-shard>>
-and <<each-shard-has-overhead>>.
+==== The heap of nodes should suffice for the fields, plus overheads


Suggested change

-==== The heap of nodes should suffice for the fields, plus overheads
+==== Allow enough heap on data nodes for field mappers and overheads

I thought the first section on deduplicated fields applies to all nodes, whether they are data nodes or not, right? That's why I did not mention "data nodes" in the title. Should I leave it generic or mention "data nodes"?

Fixed, but left it generic (without "data node").

docs/reference/how-to/size-your-shards.asciidoc

DaveCTurner · 2022-09-29T10:51:02Z

👍 that's right, I think that applies to all nodes really.

kingherc · 2022-09-29T11:09:18Z

@DaveCTurner please review once more, and the open conversations above

docs/reference/how-to/size-your-shards.asciidoc

DaveCTurner · 2022-09-30T07:36:03Z

DaveCTurner

A couple more optional minor rewording suggestions but LGTM. Your point about this applying to non-data nodes is a good one.

elasticsearchmachine · 2022-09-30T09:38:51Z

💚 Backport successful

Status	Branch	Result
✅	8.5

kingherc · 2022-09-30T09:53:20Z

@DaveCTurner can you verify the backport PR which was created is good? I see version 8.5.1 in there rather than 8.5.0. Is that fine or will it still be in 8.5.0? I wonder if any other steps are needed to backport it to 8.5.0 after the FF.

DaveCTurner · 2022-09-30T10:27:57Z

Yes, all good. We fix up the labels after release to reflect the set of PRs that were actually included.

Redefine section on sizing data nodes

d8358a4

kingherc self-assigned this Sep 22, 2022

elasticsearchmachine added the v8.6.0 label Sep 22, 2022

kingherc added the >docs and :Distributed Coordination/Cluster Coordination labels and removed the v8.6.0 label Sep 22, 2022

kingherc marked this pull request as ready for review September 22, 2022 18:42

kingherc requested a review from DaveCTurner September 22, 2022 18:42

elasticsearchmachine added the Team:Distributed (Obsolete) and Team:Docs labels Sep 22, 2022

kingherc added the v8.6.0 label Sep 22, 2022

DaveCTurner reviewed Sep 23, 2022

docs/reference/how-to/size-your-shards.asciidoc (outdated)

docs/reference/how-to/size-your-shards.asciidoc (outdated)

DaveCTurner reviewed Sep 23, 2022

docs/reference/how-to/size-your-shards.asciidoc (outdated)

kingherc added 2 commits September 23, 2022 13:54

Fix missing documentation

8a34036

Fix PR comments

072af07

kingherc requested a review from DaveCTurner September 23, 2022 10:55

kingherc added and then removed the cloud-deploy label Sep 29, 2022

Expand to mention cluster state field overheads

840cbef

DaveCTurner reviewed Sep 29, 2022

Update docs/reference/how-to/size-your-shards.asciidoc

84db1e5

Co-authored-by: David Turner <david.turner@elastic.co>

kingherc and others added 5 commits September 29, 2022 14:01

Update docs/reference/how-to/size-your-shards.asciidoc

bac0fc4

Co-authored-by: David Turner <david.turner@elastic.co>

Update docs/reference/how-to/size-your-shards.asciidoc

68230b5

Co-authored-by: David Turner <david.turner@elastic.co>

Update docs/reference/how-to/size-your-shards.asciidoc

a8cd5f6

Co-authored-by: David Turner <david.turner@elastic.co>

PR recommendations

fb39339

Fix levels

e0be892

kingherc added v8.5.0 auto-backport-and-merge labels Sep 29, 2022

Merge branch 'main' into docs/86639-explain-node-mappings

e4d2ee5

kingherc added the >refactoring label Sep 29, 2022

kingherc requested a review from DaveCTurner September 29, 2022 12:49

DaveCTurner reviewed Sep 30, 2022

DaveCTurner approved these changes Sep 30, 2022

kingherc and others added 3 commits September 30, 2022 11:54

Update docs/reference/how-to/size-your-shards.asciidoc

37553ce

Co-authored-by: David Turner <david.turner@elastic.co>

Update docs/reference/how-to/size-your-shards.asciidoc

8ac67c7

Co-authored-by: David Turner <david.turner@elastic.co>

Update docs/reference/how-to/size-your-shards.asciidoc

26e8e43

Co-authored-by: David Turner <david.turner@elastic.co>

kingherc merged commit ad8d064 into elastic:main Sep 30, 2022

kingherc deleted the docs/86639-explain-node-mappings branch September 30, 2022 09:37

kingherc mentioned this pull request Sep 30, 2022

[8.5] Redefine section on sizing data nodes (#90274) #90550

Merged

kingherc mentioned this pull request Oct 3, 2022

Report stats related to new sizing guidance #86639

Closed

3 tasks

DaveCTurner mentioned this pull request Jul 11, 2023

Add shard_stats.total_count column description to /_cat/nodes docs. #97549

Merged
