
Promtail WAL support: Implement writer side #8267

Merged: 43 commits merged into main from pablo/wal-part-1 on Feb 6, 2023
Conversation

@thepalbi (Contributor) commented on Jan 24, 2023

What this PR does / why we need it:

This PR is the first in a series that will implement WAL support in Promtail. The main objective of this PR is to implement the writer side of the WAL.
Note that enabling the WAL with the implementation as-is will cause scraped entries to be written to the WAL, but not remote-written. Therefore, it should not be used until the whole implementation is completed.

Apart from this, this PR includes a new primitive that will be extended in the following iterations: client.Manager. This new component will be responsible for instantiating the correct remote-write client and for orchestrating the flow from the channel that scraping targets write to, to the clients themselves. It will also handle the graceful shutdown of the bespoke clients.
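A minimal sketch of that orchestration role might look as follows. All names here (Entry, remoteWriteClient, NewManager, memClient) are invented for illustration; they are not the actual client.Manager API:

```go
package main

import "sync"

// Entry stands in for Promtail's api.Entry (simplified for illustration).
type Entry struct{ Line string }

// remoteWriteClient is the minimal surface a remote-write client exposes.
type remoteWriteClient interface {
	Chan() chan<- Entry
	Stop()
}

// Manager sits between the channel scraping targets write to and the
// remote-write client, and owns the client's graceful shutdown.
type Manager struct {
	client remoteWriteClient
	in     chan Entry
	wg     sync.WaitGroup
}

func NewManager(c remoteWriteClient) *Manager {
	m := &Manager{client: c, in: make(chan Entry)}
	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		for e := range m.in {
			m.client.Chan() <- e // forward every scraped entry
		}
	}()
	return m
}

// Chan is the single channel scraping targets write to.
func (m *Manager) Chan() chan<- Entry { return m.in }

// Stop closes the intake, lets queued entries flush, then stops the client.
func (m *Manager) Stop() {
	close(m.in)
	m.wg.Wait()
	m.client.Stop()
}

// memClient is a toy in-memory client used only to exercise the sketch.
type memClient struct {
	ch   chan Entry
	got  []Entry
	done chan struct{}
}

func newMemClient() *memClient {
	c := &memClient{ch: make(chan Entry), done: make(chan struct{})}
	go func() {
		for e := range c.ch {
			c.got = append(c.got, e)
		}
		close(c.done)
	}()
	return c
}

func (c *memClient) Chan() chan<- Entry { return c.ch }
func (c *memClient) Stop()              { close(c.ch); <-c.done }
```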

Which issue(s) this PR fixes:
Part of #8197

Special notes for your reviewer:

  • Since there's a licensing conflict between primitives under the ingester package (AGPL), dependencies have been copied over under promtail, which is Apache.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@grafanabot (Collaborator)

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Comment on lines 170 to 182
var entriesSink api.EntryHandler = p.client

// If WAL is enabled, instantiate the WAL itself, its writer, and use that as the entries sink for all scraping targets
if cfg.WAL.Enabled {
	pWAL, err := wal.New(cfg.WAL, p.logger, p.reg)
	if err != nil {
		return fmt.Errorf("error starting WAL: %w", err)
	}
	p.walWriter = wal.NewWriter(pWAL, p.logger, p.client)
	entriesSink = p.walWriter
}

tms, err := targets.NewTargetManagers(p, p.reg, p.logger, cfg.PositionsConfig, entriesSink, cfg.ScrapeConfig, &cfg.TargetConfig)
Contributor

I think it probably makes more sense to have promtail have a field that is an entry handler, and during the loading of the config here we create an entry handler that wraps both the list of clients and the walWriter, rather than doing the propagation of the channel data within the wal writer.

Contributor Author

Yeah, same; I wasn't that happy with how it looked adding the fanout inside the writer. I moved it to promtail.go and added a new slim component that does the fanout.

Contributor Author

In this commit the fanout is moved out of the Writer and done in promtail.go via an EntryHandlerFanouter, a component that takes n entry handlers and exposes a single one, performing the fanout. It could be re-used when the client.Manager needs to implement the same functionality as the current multi-client.
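The fanout idea can be sketched as below, assuming a simplified EntryHandler interface; the real interface in Promtail's api package and the actual FanoutEntryHandler differ, so treat the names here as illustrative:

```go
package main

import "sync"

// Entry and EntryHandler are simplified stand-ins for Promtail's api types.
type Entry struct{ Line string }

type EntryHandler interface {
	Chan() chan<- Entry
	Stop()
}

// fanoutEntryHandler takes n entry handlers and exposes a single one,
// copying every entry it receives to each wrapped handler.
type fanoutEntryHandler struct {
	handlers []EntryHandler
	in       chan Entry
	wg       sync.WaitGroup
}

func NewFanoutEntryHandler(handlers ...EntryHandler) *fanoutEntryHandler {
	f := &fanoutEntryHandler{handlers: handlers, in: make(chan Entry)}
	f.wg.Add(1)
	go func() {
		defer f.wg.Done()
		for e := range f.in {
			for _, h := range f.handlers {
				h.Chan() <- e // fan out to every handler
			}
		}
	}()
	return f
}

func (f *fanoutEntryHandler) Chan() chan<- Entry { return f.in }

// Stop drains queued entries, then stops every wrapped handler.
func (f *fanoutEntryHandler) Stop() {
	close(f.in)
	f.wg.Wait()
	for _, h := range f.handlers {
		h.Stop()
	}
}

// sink is a trivial handler used only to exercise the fanout.
type sink struct {
	ch   chan Entry
	got  []Entry
	done chan struct{}
}

func newSink() *sink {
	s := &sink{ch: make(chan Entry), done: make(chan struct{})}
	go func() {
		for e := range s.ch {
			s.got = append(s.got, e)
		}
		close(s.done)
	}()
	return s
}

func (s *sink) Chan() chan<- Entry { return s.ch }
func (s *sink) Stop()              { close(s.ch); <-s.done }
```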

@thepalbi (Contributor Author) commented:
@cstyan, the latest changes should be from this commit onwards. I tried to remove all unnecessary changes after the rebase and sorted all imports to make the review process easier. To give a summary of what I changed since last time:

  • Do the fanout to the WAL writer and client manager, as mentioned here
  • Implemented a simple old-WAL-segments cleaner; this is the starting commit, but you could just look for it under pkg/promtail/wal/writer.go
  • Cleanups

I implemented the old-segments cleanup in this PR so that, alongside the writer, a very simple cleanup system allows a WAL-enabled Promtail to run without running the host out of disk.

@cstyan (Contributor) left a comment

almost there, I think; we just need a nice way to shut down the fanout handler and ensure data that was queued in its channel is sent

Comment on lines 174 to 188
// these are the sinks where all scraped log entries should get to
var entryHandlers = []api.EntryHandler{p.client}

// If WAL is enabled, instantiate the WAL itself, its writer, and use that as the entries sink for all scraping targets
if cfg.WAL.Enabled {
	p.walWriter, err = wal.NewWriter(cfg.WAL, p.logger, p.reg)
	if err != nil {
		return fmt.Errorf("failed to create wal writer: %w", err)
	}
	entryHandlers = append(entryHandlers, p.walWriter)
}

p.entriesFanout = utils.NewFanoutEntryHandler(timeoutUntilFanoutHardStop, entryHandlers...)

tms, err := targets.NewTargetManagers(p, p.reg, p.logger, cfg.PositionsConfig, p.entriesFanout, cfg.ScrapeConfig, &cfg.TargetConfig)
Contributor

last comment, I think; it feels to me like the creation of the WAL writer should be above, within the previous else if cfg.WAL.Enabled, and we could reassign the fanout writer to p.client

if the usage of the actual Client type for p.client makes that difficult, let's just find a way to combine these two if statements so that we don't make our job harder when refactoring. We could make p.walWriter a pointer, and check whether the pointer is nil when deciding whether to create the entries fanout?

Contributor Author

Yeah, I also wasn't fond of that double if now that you mention it. I moved the code a bit so that we have a single if branch related to WAL-enabled behaviour. I think we can keep the fanout even if the WAL is disabled, since it adds the graceful shutdown, which is not there otherwise.

Also, we should keep a Client-implementing value in p.client, since it's part of the interface the main Promtail object exposes. In the future, this will be just the client.Manager.

Let me know wdyt!

@cstyan (Contributor) left a comment

LGTM, thanks for all your hard work here @thepalbi

I'll wait until Monday morning before merging, just in case anyone else has been meaning to take a look at this.

@cstyan cstyan merged commit b5ed39c into main Feb 6, 2023
@cstyan cstyan deleted the pablo/wal-part-1 branch February 6, 2023 23:27