Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MM-55225] add simple scraper logic #4

Closed
wants to merge 4 commits into from
Closed

[MM-55225] add simple scraper logic #4

wants to merge 4 commits into from

Conversation

isacikgoz
Copy link
Member

Summary

Add the scraper logic, for now it just scrapes but does not writes to the tsdb, that will be taken care of with the https://mattermost.atlassian.net/browse/MM-55226

Ticket Link

https://mattermost.atlassian.net/browse/MM-55225

@isacikgoz isacikgoz added the 2: Dev Review Requires review by a core committer label Dec 14, 2023
server/const.go Outdated
sampleMutator func(labels.Labels) labels.Labels //nolint:unused
sampleMutator func(labels.Labels) labels.Labels //nolint:unused
errBodySizeLimit = errors.New("body size limit exceeded")
UserAgent = fmt.Sprintf("%s/%s", PluginName, ScraperVersion)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to a constant?

Also, it looks slightly weird to see vars in a file named const.go 😛

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, sure I'll re-organize no worries.

server/plugin.go Outdated
@@ -113,13 +120,42 @@ func (p *Plugin) OnActivate() error {
return mutateSampleLabels(l, target, *p.configuration.HonorTimestamps, []*relabel.Config{})
}

ticker := time.Tick(time.Duration(scrapeInterval) * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend to use time.Ticker instead and explicitly stop it when no longer needed.

server/plugin.go Outdated
for {
select {
case <-p.closeChan:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to signal until this statement has hit. When you run close(ch), it does not guarantee that the receiving end will have received the event. So we need something similar to the websocket closing mechanism. Either another channel, or a sync.WaitGroup.

So on the caller side:

close(ch)
<-closed

callee side

case <-p.closeChan:
  closed <- struct{}{}
  return

Or the same can be done via a WaitGroup as well. 0/5.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can ditch this if we use time.Ticker and close that while exiting.

server/plugin.go Outdated
case <-p.closeChan:
return
case <-ticker:
cfg, err := p.configuration.Clone()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can potentially be costly. Every Clone is a json Marshal/Unmarshal. Wondering if we can ensure to not modify the config in the scrapeFn? It seems like we are simply reading some config fields in there.

server/plugin.go Outdated
_ = level.Error(logger).Log("msg", "configuration could not be cloned", "err", err)
break
}
buf := new(bytes.Buffer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest to move the creation outside the loop. And then reset the buffer (buf.Reset) for every scrape. This way we reduce GC.

server/plugin.go Outdated
@@ -91,10 +97,11 @@ func (p *Plugin) OnActivate() error {
if host == "" {
host = "localhost"
}
targetAddres := host + ":" + port
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

server/scrape.go Outdated
Comment on lines 40 to 42
if *conf.BodySizeLimitBytes <= 0 {
conf.BodySizeLimitBytes = model.NewInt64(math.MaxInt64)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not a likely case TBH.

server/scrape.go Outdated
Comment on lines 48 to 50
if n >= *conf.BodySizeLimitBytes {
return "", errBodySizeLimit
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this check when the LimitReader guarantees this? Seems like we are testing the standard library here.

Copy link
Member Author

@isacikgoz isacikgoz Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Removing. But LimitReader won't return an error, it would just read the limit we have set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does return an io.EOF error: https://pkg.go.dev/io#LimitReader. It's just that it isn't very indicative of what happened. I'd say let's start with this for now. We can move to counting the bytes read in case there's a need.

server/scrape.go Outdated
return "", err
}

n, err := io.Copy(w, io.LimitReader(gzipr, *conf.BodySizeLimitBytes))
Copy link
Member

@agnivade agnivade Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a cleaner way would be to detect the encoding, and then wrap the reader in a gzip reader in an if condition, and then do the io.Copy as the common code outside the if condition. Looks like that would reduce a lot of the repetition here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes much sense, updated accordingly.

@isacikgoz isacikgoz removed the request for review from agarciamontoro December 20, 2023 16:33
@isacikgoz isacikgoz mentioned this pull request Dec 20, 2023
@isacikgoz isacikgoz closed this Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2: Dev Review Requires review by a core committer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants