DOCS: general overview of data tiers and roles #63086

andreidan · 2020-09-30T16:15:01Z

This adds general overview documentation for data tiers
and the data tier specific node roles.

Relates to #60848

elasticmachine · 2020-09-30T16:15:03Z

Pinging @elastic/es-core-features (:Core/Features/Features)

elasticmachine · 2020-09-30T16:15:03Z

Pinging @elastic/es-docs (>docs)

dakrone

Thanks for opening this, Andrei! I left a bunch of comments, and hopefully someone from the docs team can weigh in as well.

docs/reference/index-modules/allocation/data_tier_allocation.asciidoc

docs/reference/index-modules/allocation/filtering.asciidoc

docs/reference/modules/cluster/allocation_filtering.asciidoc

dakrone · 2020-09-30T19:48:06Z

docs/reference/modules/datatiers.asciidoc

+Common data lifecycle management patterns revolve around transitioning the indices
+through multiple collections of nodes with different hardware characteristics in order
+to fulfil evolving CRUD, search, and aggregation needs as the indices age. The concept
+of a tiered hardware architecture is not new in {es}.


(Only my suggestion, not necessarily a requirement)

Suggested change

-Common data lifecycle management patterns revolve around transitioning the indices
-through multiple collections of nodes with different hardware characteristics in order
-to fulfil evolving CRUD, search, and aggregation needs as the indices age. The concept
-of a tiered hardware architecture is not new in {es}.
+Common data lifecycle management patterns revolve around transitioning indices
+through multiple collections of nodes with different hardware characteristics in order
+to fulfil evolving CRUD, search, and aggregation needs as indices age.

I removed the "not new" sentence because I think we could/should explicitly add a section about migrating from attribute-based transitioning to data tier transitioning, perhaps elsewhere or as a blog post.

That's a great point, Lee. I believe the ILM section should advise on how to migrate. That said, I think referencing the existing ILM tiered options here is a nice bridge for that (with links going back and forth between the ILM guide and this page).

I'm happy to drop it, but I find it a nice bridge towards ILM and the tiered options it enables (with and without data tiers).

I've reworded the tiers definition to emphasise that things like replicas should be configured and don't come as guarantees. I also reworded the data retention part a bit so that it reads as a guideline.

Let me know if we should reword/remove more.

docs/reference/modules/datatiers.asciidoc

dakrone · 2020-09-30T19:58:59Z

docs/reference/modules/datatiers.asciidoc

+is retained for months and the indices have zero replicas as they are backed by a searchable
+snapshot.


I definitely think this sentence should not be here, as it makes it sound like all of this happens automatically when data is moved to the cold tier.

docs/reference/modules/datatiers.asciidoc

docs/reference/modules/node.asciidoc

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

andreidan · 2020-10-02T09:09:22Z

@elasticmachine update branch

debadair

Left several comments & suggestions. Let me know if you have questions or want to discuss.

debadair · 2020-10-02T21:01:24Z

docs/reference/ilm/actions/ilm-migrate.asciidoc

+Updates the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
+index setting in order to migrate the index to the <<modules-tiers, data tier>> corresponding
+to the current phase.


Suggested change

-Updates the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
-index setting in order to migrate the index to the <<modules-tiers, data tier>> corresponding
-to the current phase.
+Moves the index to the <<modules-tiers, data tier>> that corresponds
+to the current phase by updating the <<tier-preference-allocation-filter, `index.routing.allocation.include._tier_preference`>>
+index setting.
+{ilm-init} automatically injects the migrate action in the warm and cold
+phases if no allocation options are specified with the <<ilm-allocate, allocate>> action. If you specify an allocate action that only modifies the number of index
+replicas, {ilm-init} reduces the number of replicas before migrating the index.
+To prevent automatic migration without specifying allocation options,
+you can explicitly include the migrate action and set the enabled option to `false`.
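
To make the last paragraph of the suggestion concrete, here is a minimal sketch of a policy body that disables the migrate action in the warm phase while still changing the replica count. The `min_age` value and the replica count are illustrative assumptions, not values from this PR.

```python
import json

# Hypothetical policy body: the phase timing and replica count are
# illustrative and not taken from this PR.
policy = {
    "policy": {
        "phases": {
            "warm": {
                "min_age": "7d",
                "actions": {
                    # Explicitly opt out of the automatically injected migrate action.
                    "migrate": {"enabled": False},
                    # An allocate action that only modifies the number of replicas.
                    "allocate": {"number_of_replicas": 1},
                },
            }
        }
    }
}

# Serialized body, as it could be sent in a put lifecycle policy request.
body = json.dumps(policy, indent=2)
print(body)
```

With `enabled` set to `false`, the index stays where it is when it enters the warm phase, and only the replica count changes.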

docs/reference/ilm/actions/ilm-migrate.asciidoc

debadair · 2020-10-02T23:59:55Z

docs/reference/modules/node.asciidoc

+Content data nodes accommodate user-created content. They enable operations like CRUD,
+search and aggregations.


I think we need a better definition of content node. Defining it in terms of "user-created content" could be interpreted as actual user-generated content, not content like a product catalog. I was trying to define it in terms of "collections of things" vs. a stream of data. Maybe something like "Content data nodes store indices that contain collections of things such as a catalog of products. The value of the data in a content node remains relatively constant, and the performance requirements aren't tied to the age of the data."

I think introducing more abstract terms could complicate things further here. I believe the product catalog would usually be manually introduced into the system (i.e. user-created) as opposed to being machine-generated (like logs and metrics).

I wonder if it would be clearer to describe "content" through examples rather than through its origin?

e.g. Content data nodes store the documents that back/support application, website, and enterprise search. The value of the data in a content node remains relatively constant, and the performance requirements aren't tied to the age of the data.

docs/reference/modules/node.asciidoc

debadair · 2020-10-03T00:03:44Z

docs/reference/modules/node.asciidoc

+Warm data nodes hold indices after they are no longer being written to, but still being
+queried, usually at a lower frequency than it was in the hot tier. Lower performant
+hardware can usually be used in this tier.


Suggested change

-Warm data nodes hold indices after they are no longer being written to, but still being
-queried, usually at a lower frequency than it was in the hot tier. Lower performant
-hardware can usually be used in this tier.
+Warm data nodes store indices that are no longer being regularly updated, but are still being
+queried. Query volume is usually lower than it was while the index was in the hot tier. Less performant
+hardware can usually be used for nodes in this tier.
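
For readers less familiar with the tier-specific roles under discussion, a small sketch of how a node could be assigned to a tier. It assumes the role names `data_content`, `data_hot`, `data_warm`, and `data_cold` and the `node.roles` setting in `elasticsearch.yml`. The helper name is hypothetical.

```python
# Data tier roles assumed for this sketch; the helper below is hypothetical.
TIER_ROLES = ("data_content", "data_hot", "data_warm", "data_cold")


def node_roles_line(*roles: str) -> str:
    """Render a node.roles entry for elasticsearch.yml."""
    unknown = [role for role in roles if role not in TIER_ROLES]
    if unknown:
        raise ValueError(f"not a data tier role: {unknown}")
    return "node.roles: [ " + ", ".join(roles) + " ]"


# A node dedicated to the warm tier.
print(node_roles_line("data_warm"))
```

A node can carry more than one tier role, for example `node_roles_line("data_hot", "data_content")`.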

docs/reference/modules/node.asciidoc

debadair · 2020-10-03T00:08:39Z

docs/reference/setup.asciidoc

@@ -79,6 +79,8 @@ include::settings/monitoring-settings.asciidoc[]

 include::modules/node.asciidoc[]

+include::modules/datatiers.asciidoc[]


Per previous comment, I think we want this info at the top level.

Co-authored-by: debadair <debadair@elastic.co>

dakrone

This looks much better, thanks for working on this! I left a bunch of comments still, but they are really minor. Deb should take another look before merging also.

docs/reference/datatiers.asciidoc

docs/reference/ilm/actions/ilm-migrate.asciidoc

docs/reference/modules/node.asciidoc

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

andreidan · 2020-10-06T11:00:12Z

Thanks for the review @dakrone

This adds general overview documentation for data tiers, the data tiers specific node roles, and their application in ILM. Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: debadair <debadair@elastic.co> (cherry picked from commit d588cab) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>


DOCS: general overview of data tiers and roles

2659c13

andreidan added >docs General docs changes :Core/Features/Features v8.0.0 v7.10.0 labels Sep 30, 2020

elasticmachine added Team:Data Management Meta label for data/management team Team:Docs Meta label for docs team labels Sep 30, 2020

andreidan added 2 commits September 30, 2020 17:38

Mention the data role

8dd4a37

Add _tier attribute filtering documentation

17eb22e

andreidan added the WIP label Sep 30, 2020

andreidan mentioned this pull request Sep 30, 2020

Formalize the concept of data tiers in Elasticsearch #60848

Closed

18 tasks

andreidan added 3 commits September 30, 2020 18:20

Document index level data tier routing

856da8b

Cluster allocation attribute correction

6c56db1

Add more information regarding index allocation

3a6107a

andreidan removed the WIP label Sep 30, 2020

andreidan requested review from debadair and dakrone September 30, 2020 17:55

dakrone requested changes Sep 30, 2020

andreidan and others added 4 commits October 1, 2020 11:07

Apply a batch of suggestions

19411ee

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

Reword

adce708

Remove tradeoffs mentions

b2c5cc5

Reword new index allocation to tiers

b12ad0d

elasticmachine and others added 3 commits October 2, 2020 05:09

Merge branch 'master' into doc-data-tiers

102d200

Document the migrate ILM action.

618afbd

Reword tiers

91d8dd0

andreidan requested a review from dakrone October 2, 2020 13:36

debadair suggested changes Oct 3, 2020

andreidan and others added 6 commits October 5, 2020 09:27

Apply suggestions from code review

d26b283

Co-authored-by: debadair <debadair@elastic.co>

Rename modules-tiers to data-tiers

835203a

Co-authored-by: debadair <debadair@elastic.co>

Data tiers is top level section

5f64d87

Migrate action reword

9ddb586

Reword data_warm and node roles

27e09c5

Reword _tier allocation spec

323c070

andreidan requested a review from debadair October 5, 2020 12:48

"values" word

8b68bfb

dakrone approved these changes Oct 5, 2020

Apply suggestions from code review

79d1472

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

andreidan added v7.11.0 v7.10.0 and removed v7.10.0 labels Oct 7, 2020

andreidan merged commit d588cab into elastic:master Oct 7, 2020

andreidan mentioned this pull request Oct 7, 2020

[7.x] DOCS: general overview of data tiers and roles (#63086) #63421

Merged

andreidan mentioned this pull request Oct 7, 2020

[7.10] DOCS: general overview of data tiers and roles (#63086) #63422

Merged

andreidan added the backport pending label Oct 7, 2020

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
