|  | 
|  | 1 | +--- | 
|  | 2 | +layout: default | 
|  | 3 | +title: Star Tree index | 
|  | 4 | +parent: Improving search performance | 
|  | 5 | +nav_order: 54 | 
|  | 6 | +--- | 
|  | 7 | + | 
|  | 8 | +# Star Tree index | 
|  | 9 | + | 
|  | 10 | +This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or to leave feedback, join the discussion on the [OpenSearch forum](https://forum.opensearch.org/). | 
|  | 11 | +{: .warning} | 
|  | 12 | + | 
|  | 13 | +A Star Tree index is a multi-field index that improves the performance of aggregations. | 
|  | 14 | + | 
|  | 15 | +OpenSearch will use the star-tree index to optimize aggregations based on the input query and star-tree configuration. No changes are required in the query syntax or requests. | 
|  | 16 | + | 
|  | 17 | +## Star Tree index structure | 
|  | 18 | + | 
|  | 19 | +<img src="{{site.url}}{{site.baseurl}}/images/star-tree-index.png" alt="A Star Tree index containing two dimensions and two metrics" width="700"> | 
|  | 20 | + | 
|  | 21 | +As shown in the preceding figure, the Star Tree index structure consists of two main parts: the Star Tree itself and the sorted, aggregated star-tree documents, which are backed by doc-values indices. | 
|  | 22 | + | 
|  | 23 | +Each node in the Star Tree points to a range of star-tree documents. | 
|  | 24 | +A node is further split into child nodes based on the `maxLeafDocs` configuration. | 
|  | 25 | +The number of documents a leaf node points to is less than or equal to `maxLeafDocs`. This ensures that at most `maxLeafDocs` documents are traversed to compute an aggregated value, which provides predictable latencies. | 
|  | 26 | + | 
|  | 27 | +Special nodes called star nodes (`*`) help skip non-competitive nodes and fetch aggregated documents wherever applicable at query time. | 
|  | 28 | + | 
|  | 29 | +The figure contains three examples of Star Tree traversal at query time: | 
|  | 30 | +- Computing the average request size with a terms query where `port` equals 8443 and `status` equals 200. (Support for the terms query will be added in an upcoming release.) | 
|  | 31 | +- Computing the count of requests with a term query where `status` equals 200. The query traverses through the `*` node of the `port` dimension because `port` is not part of the query. | 
|  | 32 | +- Computing the average request size with a term query where `port` equals 5600. The query traverses through the `*` node of the `status` dimension because `status` is not part of the query. | 
|  | 33 | +<br/>The second and third examples use star nodes. | 
|  | 34 | + | 
|  | 35 | + | 
|  | 36 | +## When to use Star Tree index | 
|  | 37 | +You can use a Star Tree index to perform faster aggregations with a constant upper bound on query latency. | 
|  | 38 | +- A Star Tree index natively supports multi-field aggregations. | 
|  | 39 | +- A Star Tree index is created in real time as part of regular indexing, so the data in the Star Tree is always up to date with the live data. | 
|  | 40 | +- A Star Tree index consolidates the data, which makes it storage efficient. This results in efficient paging, and search queries use only a fraction of the I/O. | 
|  | 41 | + | 
|  | 42 | +## Considerations | 
|  | 43 | +- A Star Tree index should ideally be used with append-only indices because updates and deletes are not accounted for in the Star Tree index. | 
|  | 44 | +- A Star Tree index is used for aggregation queries only if the query input is a subset of the dimensions and metrics in the Star Tree configuration. | 
|  | 45 | +- Once a star-tree index is enabled for an index, you currently cannot disable it. To remove the star-tree index, you must reindex without the star-tree mapping. | 
|  | 46 | +    - Changing the Star Tree configuration also requires a reindex operation. | 
|  | 47 | +- [Multi-values/array values]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/#arrays) are not supported. | 
|  | 48 | +- Only [limited queries and aggregations](#supported-query-and-aggregations) are supported. Support for more is planned for future releases. | 
|  | 49 | +- The cardinality of the dimensions should not be very high (for example, `_id` fields). Otherwise, the index leads to storage explosion and higher query latencies. | 
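|  |  | + | 
|  |  | +The reindex step mentioned in the preceding list can be sketched with the Reindex API. This is a minimal sketch, not a prescribed procedure: it assumes a source index named `logs` and a hypothetical destination index named `logs_no_star_tree` that you have already created without the star-tree mapping. | 
|  |  | + | 
|  |  | +```json | 
|  |  | +POST _reindex | 
|  |  | +{ | 
|  |  | +  "source": { | 
|  |  | +    "index": "logs" | 
|  |  | +  }, | 
|  |  | +  "dest": { | 
|  |  | +    "index": "logs_no_star_tree" | 
|  |  | +  } | 
|  |  | +} | 
|  |  | +``` | 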
|  | 50 | + | 
|  | 51 | +## Enabling Star Tree index | 
|  | 52 | +- Set the feature flag `opensearch.experimental.feature.composite_index.star_tree.enabled` to `true`. For more information about enabling and disabling feature flags, see [Enabling experimental features]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/). | 
|  | 53 | +- Set the `indices.composite_index.star_tree.enabled` setting to `true`. For instructions on how to configure OpenSearch, see [configuring settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#static-settings). | 
|  | 54 | +- Set the `index.composite_index` index setting to `true` during index creation. | 
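|  |  | + | 
|  |  | +As an illustration of the second step, the following request sets `indices.composite_index.star_tree.enabled` through the cluster settings API. This is a sketch that assumes the setting can be updated dynamically in your version. If the request is rejected, configure the setting in `opensearch.yml` as described in the linked settings documentation. | 
|  |  | + | 
|  |  | +```json | 
|  |  | +PUT _cluster/settings | 
|  |  | +{ | 
|  |  | +  "persistent": { | 
|  |  | +    "indices.composite_index.star_tree.enabled": true | 
|  |  | +  } | 
|  |  | +} | 
|  |  | +``` | 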
|  | 55 | + | 
|  | 56 | +## Examples | 
|  | 57 | + | 
|  | 58 | +The following examples show how to use a star-tree index. | 
|  | 59 | + | 
|  | 60 | +### Defining Star Tree index in mappings | 
|  | 61 | + | 
|  | 62 | +Define the star-tree configuration in the index mappings when creating an index. <br/> | 
|  | 63 | +To create a star-tree index that precomputes aggregations for the `request_size` and `latency` fields for all combinations of values in the `port` and `status` fields indexed in the `logs` index, configure the following mapping: | 
|  | 64 | + | 
|  | 65 | +```json | 
|  | 66 | +PUT logs | 
|  | 67 | +{ | 
|  | 68 | +  "settings": { | 
|  | 69 | +    "index.number_of_shards": 1, | 
|  | 70 | +    "index.number_of_replicas": 0, | 
|  | 71 | +    "index.composite_index": true | 
|  | 72 | +  }, | 
|  | 73 | +  "mappings": { | 
|  | 74 | +    "composite": { | 
|  | 75 | +      "startree1": { | 
|  | 76 | +        "type": "star_tree", | 
|  | 77 | +        "config": { | 
|  | 78 | +          "ordered_dimensions": [ | 
|  | 79 | +            { | 
|  | 80 | +              "name": "status" | 
|  | 81 | +            }, | 
|  | 82 | +            { | 
|  | 83 | +              "name": "port" | 
|  | 84 | +            } | 
|  | 85 | +          ], | 
|  | 86 | +          "metrics": [ | 
|  | 87 | +            { | 
|  | 88 | +              "name": "request_size", | 
|  | 89 | +              "stats": [ | 
|  | 90 | +                "sum", | 
|  | 91 | +                "value_count", | 
|  | 92 | +                "min", | 
|  | 93 | +                "max" | 
|  | 94 | +              ] | 
|  |  | +            }, | 
|  |  | +            { | 
|  | 95 | +              "name": "latency", | 
|  | 96 | +              "stats": [ | 
|  | 97 | +                "sum", | 
|  | 98 | +                "value_count", | 
|  | 99 | +                "min", | 
|  | 100 | +                "max" | 
|  | 101 | +              ] | 
|  | 102 | +            } | 
|  | 103 | +          ] | 
|  | 104 | +        } | 
|  | 105 | +      } | 
|  | 106 | +    }, | 
|  | 107 | +    "properties": { | 
|  | 108 | +      "status": { | 
|  | 109 | +        "type": "integer" | 
|  | 110 | +      }, | 
|  | 111 | +      "port": { | 
|  | 112 | +        "type": "integer" | 
|  | 113 | +      }, | 
|  | 114 | +      "request_size": { | 
|  | 115 | +        "type": "integer" | 
|  | 116 | +      }, | 
|  | 117 | +      "latency": { | 
|  | 118 | +        "type": "scaled_float", | 
|  | 119 | +        "scaling_factor": 10 | 
|  | 120 | +      } | 
|  | 121 | +    } | 
|  | 122 | +  } | 
|  | 123 | +} | 
|  | 124 | +``` | 
|  | 125 | + | 
|  | 126 | +For detailed information about the Star Tree index mapping and parameters, see [Star Tree field type]({{site.url}}{{site.baseurl}}/field-types/star-tree/). | 
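|  |  | + | 
|  |  | +Because the star-tree index is built as part of regular indexing, documents are indexed as usual. The following sketch indexes one document into the `logs` index using the fields from the preceding mapping. The field values are hypothetical and chosen only for illustration. | 
|  |  | + | 
|  |  | +```json | 
|  |  | +POST logs/_doc | 
|  |  | +{ | 
|  |  | +  "status": 200, | 
|  |  | +  "port": 8443, | 
|  |  | +  "request_size": 1024, | 
|  |  | +  "latency": 12.5 | 
|  |  | +} | 
|  |  | +``` | 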
|  | 127 | + | 
|  | 128 | +## Supported query and aggregations | 
|  | 129 | + | 
|  | 130 | +A Star Tree index can be used to optimize aggregations for a selected set of queries. Support for more queries is planned for upcoming releases. | 
|  | 131 | + | 
|  | 132 | +### Supported queries | 
|  | 133 | +Ensure that the Star Tree index mapping meets the following requirement: | 
|  | 134 | +- The fields used in the query must be listed in `ordered_dimensions` in the star-tree configuration. | 
|  | 135 | + | 
|  | 136 | +The following queries are supported when supported aggregations are specified: <br/> | 
|  | 137 | + | 
|  | 138 | +- [Term query](https://opensearch.org/docs/latest/query-dsl/term/term/) | 
|  | 139 | +- [Match all docs query](https://opensearch.org/docs/latest/query-dsl/match-all/) | 
|  | 140 | + | 
|  | 141 | +### Supported aggregations | 
|  | 142 | +Ensure that the Star Tree index mapping meets the following requirements: | 
|  | 143 | +- The fields used in the aggregation must be listed in `metrics` in the star-tree configuration. | 
|  | 144 | +- The metric aggregation type must be part of the `stats` parameter. | 
|  | 145 | + | 
|  | 146 | +The following metric aggregations are supported: | 
|  | 147 | +- [Sum](https://opensearch.org/docs/latest/aggregations/metric/sum/) | 
|  | 148 | +- [Minimum](https://opensearch.org/docs/latest/aggregations/metric/minimum/) | 
|  | 149 | +- [Maximum](https://opensearch.org/docs/latest/aggregations/metric/maximum/) | 
|  | 150 | +- [Value count](https://opensearch.org/docs/latest/aggregations/metric/value-count/) | 
|  | 151 | +- [Average](https://opensearch.org/docs/latest/aggregations/metric/average/) | 
|  | 152 | + | 
|  | 153 | +### Examples | 
|  | 154 | +To get the sum of `request_size` for all error logs with `status=500`, using the [example mapping](#defining-star-tree-index-in-mappings), run the following query: | 
|  | 155 | +```json | 
|  | 156 | +POST /logs/_search | 
|  | 157 | +{ | 
|  | 158 | +  "query": { | 
|  | 159 | +    "term": { | 
|  | 160 | +      "status": "500" | 
|  | 161 | +    } | 
|  | 162 | +  }, | 
|  | 163 | +  "aggs": { | 
|  | 164 | +    "sum_request_size": { | 
|  | 165 | +      "sum": { | 
|  | 166 | +        "field": "request_size" | 
|  | 167 | +      } | 
|  | 168 | +    } | 
|  | 169 | +  } | 
|  | 170 | +} | 
|  | 171 | +``` | 
|  | 172 | + | 
|  | 173 | +OpenSearch optimizes this query automatically by using the star-tree index. | 
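|  |  | + | 
|  |  | +As a further sketch using the same example mapping, the following request combines a match all query with a `max` aggregation on `latency`. Both the query type and the aggregation appear in the supported lists, and `max` is part of the `stats` configured for `latency`. The aggregation name `max_latency` is arbitrary, and `"size": 0` only suppresses the document hits so that the response contains just the aggregation result. | 
|  |  | + | 
|  |  | +```json | 
|  |  | +POST /logs/_search | 
|  |  | +{ | 
|  |  | +  "size": 0, | 
|  |  | +  "query": { | 
|  |  | +    "match_all": {} | 
|  |  | +  }, | 
|  |  | +  "aggs": { | 
|  |  | +    "max_latency": { | 
|  |  | +      "max": { | 
|  |  | +        "field": "latency" | 
|  |  | +      } | 
|  |  | +    } | 
|  |  | +  } | 
|  |  | +} | 
|  |  | +``` | 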
|  | 174 | + | 
|  | 175 | +To run queries without using the star-tree index, set the `indices.composite_index.star_tree.enabled` setting to `false`. | 