-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor state_* metricsets to share response from endpoint #25640
Conversation
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Pinging @elastic/integrations (Team:Integrations) |
@exekias, @jsoriano I would love your feedback on this. @kaiyan-sheng might be interested into this too since sth similar could be applied on AWS related stuff too. I still need to verify if this change will have impact on Agent's deployments too since I'm not sure if Agent spawns metricsets directly per data_stream. If that's the case then this change will only have impact for Beats' deployments. |
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
Trends 🧪💚 Flaky test reportTests succeeded. Expand to view the summary
Test stats 🧪
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments.
This approach could work, I wonder if it will make things too complex? I wonder if it would make sense to put everything together in the same metricset, did you consider that option?
if m.prometheus == nil { | ||
m.prometheus = prometheus | ||
} | ||
go m.runStateMetricsFetcher(period) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a race condition here. If this function is called by two metricsets at the same time you could end up with several fetchers running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what if different metricsets have a different period?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what if different metricsets have a different period?
Period is defined at the module level, so all metricsets will have the same period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both cases should be handled by https://github.com/elastic/beats/pull/25640/files#diff-9933f42aec81125910c9923f83fa89c9a76918aac6e3bd5a68c430aeeab91084R98-R106.
Taking into consideration that period is set on Module's level then I think that different period's check is redundant. Metricsets will not share the input if they are configured in different Modules like:
- module: kubernetes
metricsets:
- state_pod
- module: kubernetes
metricsets:
- state_container
So the period adjustment at https://github.com/elastic/beats/pull/25640/files#diff-9933f42aec81125910c9923f83fa89c9a76918aac6e3bd5a68c430aeeab91084R100 could be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- module: kubernetes metricsets: - state_pod - module: kubernetes metricsets: - state_container
Take into account that this configuration will instantiate two different modules, and they won't reuse the shared families. If we want/need to supports configs like these ones, this approach won't work. To reuse the families we would need one of:
- A global cache, indexed by the endpoints used.
- Merge everything in a single metricset, but this has other challenges as mentioned in Refactor state_* metricsets to share response from endpoint #25640 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you investigate what Agent is doing with a config like that? I think doing this at the module level is good enough if the Agent handles it correctly, and removes the complexity of indexing by endpoint, metrics path and other parameters (including the security part).
It had been discussed and rejected in the past like at #12938 and recently we also end up with keeping as is and coming with this improvement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some suggestions that could simplify things.
if m.prometheus == nil { | ||
m.prometheus = prometheus | ||
} | ||
go m.runStateMetricsFetcher(period) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what if different metricsets have a different period?
Period is defined at the module level, so all metricsets will have the same period.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for beats, we will have to check if this solves the issue in Agent (#25640 (comment)). Added only a small suggestion.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Tested that with Agent and figured out that metircsets are spawned under different Module objects so I changed the implementation to use global cache on Module's level shared across all Module's instances. Did some manual tests and it seems from the logs that context is reused. I will test it more. Agent's standalone config:
Some sample Debug logs:
|
defer m.lock.Unlock() | ||
|
||
now := time.Now() | ||
hash := fmt.Sprintf("%s%s", m.BaseModule.Config().Period, m.BaseModule.Config().Hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a common function to generate the hash, this hash is being calculated also in the function returned by ModuleBuilder()
.
And/or consider doing the initialization of m.fCache[hash]
here so it is not needed to calculate the hash when initializing the module.
m.lock.Lock() | ||
defer m.lock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lock is in the module, but the cache is shared between all modules. The cache should have its own lock, and be the same for all modules/metricsets using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch thanks!
fCache.lastFetchTimestamp = now | ||
fCache.setter = ms | ||
} else { | ||
m.logger.Warn("REUSE families for ms: ", ms, ". Last setter was ", fCache.setter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap I will remove them completely.
if ms != fCache.setter { | ||
m.logger.Warn("DIFF[ms!=cacheSetter]: ", ms, " != ", fCache.setter) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only to report that the metricset getting the families is different to the metricset that stored it? Not sure if needed, in any case log it at the debug level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be removed.
} | ||
|
||
// New create a new instance of the MetricSet | ||
// Part of new is also setting up the configuration by processing additional | ||
// Part of newF is also setting up the configuration by processing additional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo?
// Part of newF is also setting up the configuration by processing additional | |
// Part of new is also setting up the configuration by processing additional |
sharedFamiliesCache := make(cacheMap) | ||
return func(base mb.BaseModule) (mb.Module, error) { | ||
hash := fmt.Sprintf("%s%s", base.Config().Period, base.Config().Hosts) | ||
sharedFamiliesCache[hash] = &familiesCache{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These entries will be never removed, this can be a leak if metricbeat is used to monitor clusters dynamically created. I guess this is only a corner case, we can leave this by now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave a comment about this in the code so as to have a good pointer if an issue arise in the future. One thing we could do (on top of my head suggestion follows) to tackle this could be to have a method on module level to figure out what entries to remove, which method will be called from Metricset's Close().
defer m.lock.Unlock() | ||
|
||
now := time.Now() | ||
hash := fmt.Sprintf("%s%s", m.BaseModule.Config().Period, m.BaseModule.Config().Hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the period needs to be part of the hash key, it is ok if metricsets with the same hosts but different period share the families.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I will remove it.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
This pull request is now in conflicts. Could you fix it? 🙏
|
Signed-off-by: chrismark <chrismarkou92@gmail.com>
// NOTE: These entries will be never removed, this can be a leak if | ||
// metricbeat is used to monitor clusters dynamically created. | ||
// (https://github.com/elastic/beats/pull/25640#discussion_r633395213) | ||
sharedFamiliesCache[hash] = &familiesCache{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This map is being written every time a module is created. As it is now, I see two possible problems:
- There can be race conditions (and panics) if several metricsets are created at the same time (not sure if possible), or if a metricset calls
GetSharedFamilies
while other metricset with the same hosts is being created (I guess this can happen with bad luck and/or with a lowmetricbeat.max_start_delay
). - If a metricset is created after another one has already filled the cache, the cache will be reset, not a big problem, but could be easily solved by checking if the cache entry exists.
I think reads and writes on this map should be also thread safe. And ideally we should check if there is some entry in the cache for a given key before overwriting it here.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is looking good. Only some nitpicking comments added.
fCache := m.kubeStateMetricsCache.cacheMap[hash] | ||
if _, ok := m.kubeStateMetricsCache.cacheMap[hash]; !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fCache := m.kubeStateMetricsCache.cacheMap[hash] | |
if _, ok := m.kubeStateMetricsCache.cacheMap[hash]; !ok { | |
fCache, ok := m.kubeStateMetricsCache.cacheMap[hash] | |
if !ok { |
|
||
fCache := m.kubeStateMetricsCache.cacheMap[hash] | ||
if _, ok := m.kubeStateMetricsCache.cacheMap[hash]; !ok { | ||
return nil, fmt.Errorf("Could not get kube_state_metrics cache entry for %s ", hash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, fmt.Errorf("Could not get kube_state_metrics cache entry for %s ", hash) | |
return nil, fmt.Errorf("could not get kube_state_metrics cache entry for %s ", hash) |
} | ||
|
||
func (m *module) GetSharedFamilies(prometheus p.Prometheus) ([]*dto.MetricFamily, error) { | ||
now := time.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be done after getting the lock? If not, if a call to GetFamilies
takes more than the period, all waiting metricsets will request the families again instead of reusing the ones just received, and the waiting metricset that end up requesting the families again will store an "old" timestamp.
} | ||
|
||
func generateCacheHash(host []string) string { | ||
return fmt.Sprintf("%s", host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using something like https://github.com/mitchellh/hashstructure for hashing.
cacheMap: make(map[string]*familiesCache), | ||
} | ||
return func(base mb.BaseModule) (mb.Module, error) { | ||
hash := generateCacheHash(base.Config().Hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle the hash is always going to be the same during the life of this module. Wdyt about storing it in module{}
so it doesn't need to be recalculated every time? Actually, for the same reason, the module could keep a reference to the cache entry directly.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
What does this PR do?
This PR changes how
kubernetes
module handle state_* metricsets which share same target endpoint. The idea originates from https://github.com/elastic/beats/blob/master/x-pack/metricbeat/module/cloudfoundry/cloudfoundry.goNote: At this point the PR stands more like a PoC with changes only applied atstate_container
andstate_pod
metricsets.If we agree with this solution (and make sure that it would be applied with Agent too) we can extend it to the rest ofstate_*
metricsets as well as to metricsets fetch from kubelet's endpoint (node
,pod
,container
,volume
,system
)In upcoming PR we will apply similar strategy for metricsets fetching from kubelet's endpoint (
node
,pod
,container
,volume
,system
)Why is it important?
To improve the performance of the module by avoid fetching same content multiple times.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
Verified with different module's config blocks. See step 4 of testing notes below.
How to test this PR locally
kubectl -n kube-system port-forward svc/kube-state-metrics 8081:8080
tcpdump -i any -s 0 'tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420'
Verify with tcpdump's output that only one request takes place no matter how many modules/metricsets are enabled.
Verify with tcpdump's output that only one request takes place no matter how many modules/metricsets are enabled.
Related issues