Cache vectors usage stats #74974
Today `VectorsUsageTransportAction` is pretty heavyweight since it must decompress and read the mappings for every index in the cluster. In particular Metricbeat hits this action every 10s by default, and it runs on the elected master, which causes nontrivial load in an otherwise quiet cluster. This commit introduces a cache for the usage stats, keyed by index, avoiding recomputing these stats in the common case that the mapping hasn't changed.
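To illustrate the idea of a per-index cache, here is a minimal Python sketch (the PR itself targets Elasticsearch's Java code, and this is not the PR's implementation). The names `UsageStatsCache`, `VectorStats` and `count_vector_fields` are hypothetical, and using a mapping version number to detect changes is an assumption made for the example; the description above only says the cache is keyed by index and reused while the mapping is unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class VectorStats:
    dense_fields: int
    sparse_fields: int
    dense_dims_total: int


def count_vector_fields(mapping: dict) -> VectorStats:
    """Walk a (already parsed) mapping and count vector fields.

    Stands in for the expensive "decompress and read the mappings" step.
    """
    dense = sparse = dims = 0
    stack = [mapping]
    while stack:
        node = stack.pop()
        for field in node.get("properties", {}).values():
            field_type = field.get("type")
            if field_type == "dense_vector":
                dense += 1
                dims += field.get("dims", 0)
            elif field_type == "sparse_vector":
                sparse += 1
            # Object fields may nest further properties.
            stack.append(field)
    return VectorStats(dense, sparse, dims)


class UsageStatsCache:
    """Cache of per-index stats, recomputed only when the mapping changes."""

    def __init__(self, compute: Callable[[dict], VectorStats]):
        self._compute = compute
        # index name -> (mapping version the stats were computed for, stats)
        self._entries: Dict[str, Tuple[int, VectorStats]] = {}
        self.computations = 0  # instrumentation for the example only

    def get(self, index: str, mapping_version: int, mapping: dict) -> VectorStats:
        cached = self._entries.get(index)
        if cached is not None and cached[0] == mapping_version:
            return cached[1]  # common case: mapping unchanged
        stats = self._compute(mapping)
        self.computations += 1
        self._entries[index] = (mapping_version, stats)
        return stats

    def retain(self, live_indices: Iterable[str]) -> None:
        """Drop entries for indices that no longer exist."""
        live = set(live_indices)
        for name in list(self._entries):
            if name not in live:
                del self._entries[name]
```

With a cache like this, a poller that asks for stats every 10s only pays the cost of reading a mapping once per mapping change, at the price of holding one small entry per index in memory.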
Pinging @elastic/es-search (Team:Search) |
@DaveCTurner Thanks very much for your PR, and sorry that this part was a bottleneck for metrics. But I think a better approach should be to completely remove xpack vectors usage stats, as we already report mappings stats of all fields in Confirming with @giladgal that it is ok to move from the current:
GET _xpack/usage
...
"vectors" : {
"available" : true,
"enabled" : true,
"dense_vector_fields_count" : 3,
"sparse_vector_fields_count" : 0,
"dense_vector_dims_avg_count" : 3
}
to cluster stats API:
GET _cluster/stats
...
{
"name" : "dense_vector",
"count" : 3,
"index_count" : 2
}
which should be consistent with how we report mapping stats for other fields. @DaveCTurner Sorry for the trouble! |
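As a rough sketch of how a consumer could read the `dense_vector` entry from a cluster stats response instead of `_xpack/usage`: the helper `field_type_stats` is hypothetical, and the `indices.mappings.field_types` path is an assumption about where entries like the one above sit in the response. The example works on an already parsed response and does not contact a cluster.

```python
from typing import Optional


def field_type_stats(cluster_stats: dict, type_name: str) -> Optional[dict]:
    """Return the per-field-type entry for `type_name`, or None if absent."""
    # Assumed location of the per-type mapping stats in the response body.
    field_types = (
        cluster_stats.get("indices", {})
        .get("mappings", {})
        .get("field_types", [])
    )
    for entry in field_types:
        if entry.get("name") == type_name:
            return entry
    return None


# Abbreviated, hand-written response for illustration only.
sample_response = {
    "indices": {
        "mappings": {
            "field_types": [
                {"name": "dense_vector", "count": 3, "index_count": 2},
                {"name": "keyword", "count": 10, "index_count": 4},
            ]
        }
    }
}

dense = field_type_stats(sample_response, "dense_vector")
```

Note that an entry of this shape carries field and index counts only, so it has no direct equivalent of `dense_vector_dims_avg_count`.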
Not at all, doing no work is always better than doing less work 😁 Can we remove these stats in 7.x tho? It looks like a breaking change to me. It also seems to lose the |
We have clarified in the past that usage stats updates are not considered breaking, so it is safe to remove those in 7.x. About missing |
That's fine by me. Thanks. |
We have already decided not to have xpack usage for field mappers (see elastic#53076), as mapping stats for all fields are already tracked in cluster stats. Moreover, computing xpack usage for vector fields is quite an expensive operation (see elastic#74974). This removes the xpack actions for vector fields.
I've created a PR to remove xpack vector usage. |
Great, thanks Mayya. Closing this in favour of #75017 |
We have already decided not to have xpack usage for field mappers (see elastic#53076), as mapping stats for all fields are already tracked in cluster stats. Moreover, computing xpack usage for vector fields is quite an expensive operation (see elastic#74974). This removes the xpack actions for vector fields. Backport for elastic#75017