Local file storage extension #3087
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##             main    #3087      +/-   ##
==========================================
+ Coverage   91.76%   91.80%   +0.03%
==========================================
  Files         486      491       +5
  Lines       23514    23616     +102
==========================================
+ Hits        21578    21681     +103
  Misses       1431     1431
+ Partials      505      504       -1
```

Flags with carried forward coverage won't be shown.
Continue to review the full report at Codecov.
Force-pushed from 8cc4533 to dafd279
Please rebase.
I like how minimalistic this turned out to be.
Can we add simple perf tests for the storage and publish the results in the PR?
The PR includes one additional testbed scenario that utilizes the new extension. It shows nearly no impact on performance. However, this usage of the extension by the receiver is relatively light, so it may not be representative.
I've added a benchmark test directly to the client, and it unfortunately appears to perform quite poorly. There is a fairly obvious improvement, which I suggest we include in a separate PR: Stanza used a local in-memory cache for this.
Force-pushed from 531b722 to ef08ded
This is surprisingly slow, a few orders of magnitude slower than I would expect, and it is not good enough for our other needs (on-disk buffers for exporter queues). On an SSD, even if it fsyncs every time, I would not expect 38ms per operation. Do you know which of the 3 operations (Get/Set/Delete) is slow?
I should have mentioned that the benchmark was performing all 3 operations per iteration, but obviously this is still very slow. I've split them into separate benchmarks, and it appears that each performs similarly.
I will try out the short-term local cache idea and report back.
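For reference, here is a minimal sketch of what separate per-operation benchmarks might look like, written directly against bbolt (the store being discussed) rather than the extension's client API. Names such as `openTestDB` and the bucket name are illustrative, and the actual benchmarks in this PR may differ.

```go
package filestorage

import (
	"path/filepath"
	"testing"
	"time"

	bolt "go.etcd.io/bbolt"
)

var bucketName = []byte("default")

// openTestDB opens a fresh bolt database in a temporary directory and
// creates the bucket used by the benchmarks below.
func openTestDB(b *testing.B) *bolt.DB {
	b.Helper()
	db, err := bolt.Open(filepath.Join(b.TempDir(), "bench.db"), 0600,
		&bolt.Options{Timeout: 1 * time.Second})
	if err != nil {
		b.Fatal(err)
	}
	if err := db.Update(func(tx *bolt.Tx) error {
		_, err := tx.CreateBucketIfNotExists(bucketName)
		return err
	}); err != nil {
		b.Fatal(err)
	}
	return db
}

// BenchmarkSet measures one Put per iteration, each in its own transaction,
// which is roughly what a single Set call on the storage client amounts to.
func BenchmarkSet(b *testing.B) {
	db := openTestDB(b)
	defer db.Close()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := db.Update(func(tx *bolt.Tx) error {
			return tx.Bucket(bucketName).Put([]byte("key"), []byte("value"))
		}); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkGet measures one read per iteration; a Delete benchmark would
// follow the same pattern using Bucket.Delete inside an Update transaction.
func BenchmarkGet(b *testing.B) {
	db := openTestDB(b)
	defer db.Close()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := db.View(func(tx *bolt.Tx) error {
			_ = tx.Bucket(bucketName).Get([]byte("key"))
			return nil
		}); err != nil {
			b.Fatal(err)
		}
	}
}
```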
I am not sure I like this. I'd very much prefer a proper file storage that writes to the file immediately (without fsync) so that if we crash the data is still stored. The performance we see is extremely slow, and adding an in-memory cache defeats the purpose for high-rate cases. I think it is OK to merge this PR and ignore the perf for now, but then immediately look for a better KV storage without spending time on adding a cache. I cannot imagine a KV storage being so slow (is it possible we are doing something wrong?). Surely something else exists with more reasonable performance.
OK, from what I see bolt by default fsyncs every transaction. Setting NoSync=true yields a significant improvement in the benchmarks.
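For illustration, here is a minimal sketch of disabling fsync on a bolt database via the `DB.NoSync` flag; the file path and key names are placeholders, not the extension's actual code.

```go
package main

import (
	"log"
	"time"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("/tmp/storage.db", 0600, &bolt.Options{Timeout: 1 * time.Second})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Skip the fsync after each commit. Writes still reach the file through
	// the OS page cache, so data survives a process crash or restart, but a
	// kernel crash or power loss can lose or corrupt recent transactions.
	db.NoSync = true

	err = db.Update(func(tx *bolt.Tx) error {
		bucket, err := tx.CreateBucketIfNotExists([]byte("default"))
		if err != nil {
			return err
		}
		return bucket.Put([]byte("checkpoint"), []byte("offset=1024"))
	})
	if err != nil {
		log.Fatal(err)
	}
}
```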
Thanks for looking into this Tigran. I see similar gains.
Based on the code comments on the NoSync option, I have some reservations about disabling fsync.
Force-pushed from edad661 to ac31de1
What are your reservations? From what I see, with NoSync=true the data is still written to the file (as expected). The only difference is that there is no longer an fsync at the end of each transaction. This means that we are unprotected from hardware failures and from kernel crashes. We are still protected from our own process crashes and restarts. I believe that is a good enough guarantee for what we use the storage for. I think it is acceptable for us to lose data or have corrupted data in case of catastrophic events like hardware failure or kernel crashes. We can always add a config option in the future for users to enable fsync if they feel they need it, but I think a default of fsync=false is a fine approach.
My reservation primarily comes from the comment on the NoSync option in bolt's code.
I think perhaps the author is just being overly cautious and that I am echoing that out of ignorance of the implementation details. Your explanation makes sense to me, though. I think you probably have a better understanding and a higher level of confidence than I do on the nuances of file system operations, so I'm happy to defer to your judgement here and have updated the setting accordingly. I believe I have addressed all feedback, except what's captured in the following issues: #3148, #3149, #3150, #3152
I quickly traced through the bolt code; obviously I don't know it in detail, but I can see that the flag controls whether OS fsync is called or not (as I would expect). The warning in the comment looks scary, but I do not see anything in the code that contradicts my understanding. I suggest that we go with NoSync=true and we can adjust if needed.
LGTM, please resolve the conflicts and we can merge. Is it possible to add a test that verifies that checkpoint storage works correctly with filelog? I think we should be able to start/stop/start the receiver and see if the checkpoints are picked up. Can be a separate PR.
Thanks @tigrannajaryan, conflicts resolved. The first test in the extension's tests covers storing and retrieving a key/value.
Yes, but it does not verify that the checkpoints functionality actually works, only that a key/value can be stored and retrieved. I want to see an end-to-end test: create a log file, collect part of it, shut down the receiver, start it again, and make sure it continued collection from the right place, with nothing missed and nothing collected twice. Again, no need to add it to this PR, but I believe it is very important to have end-to-end tests like this.
@tigrannajaryan Understood. Will track with #3157 |
Thanks @djaglowski for making great progress on this.
@pmm-sumo I merged this, but if you have any comments, particularly with regards to the usage in the disk buffer, please feel free to submit issues.
This code looks great; I have one comment so far. If I read this piece correctly, there should be a single extension defined at a given time. Is that accurate? I was wondering whether the components using the storage extension should have the ability to refer to a specific one if more than one is configured, but I might be missing something here.
@pmm-sumo Thanks for taking a close look at it. Your interpretation is correct. The constraint is not necessarily meant to exist long term, though; it just seemed like a reasonable starting point.
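To make the idea concrete, here is a rough, self-contained sketch of how a component might select a storage extension by its configured name if more than one were defined. The interfaces below are simplified stand-ins for the real collector types (whose exact shapes differ), and the name `file_storage/checkpoints` is purely illustrative.

```go
package storageselect

import (
	"context"
	"fmt"
)

// Simplified stand-ins for the collector's component and storage interfaces;
// the real types live in the collector's component and storage packages.
type Extension interface{}

type Client interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte) error
	Delete(ctx context.Context, key string) error
}

type StorageExtension interface {
	Extension
	GetClient(ctx context.Context) (Client, error)
}

type Host interface {
	GetExtensions() map[string]Extension
}

// pickStorage returns the storage extension whose configured name matches
// want (e.g. "file_storage/checkpoints"). An empty want falls back to the
// first storage extension found, which matches the PR's current behavior.
func pickStorage(host Host, want string) (StorageExtension, error) {
	for name, ext := range host.GetExtensions() {
		se, ok := ext.(StorageExtension)
		if !ok {
			continue
		}
		if want == "" || name == want {
			return se, nil
		}
	}
	return nil, fmt.Errorf("storage extension %q not found", want)
}
```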
Follows open-telemetry#2883
Will resolve #2287

Summary

This PR includes:
- A full implementation of a `file_storage` extension, which can read and write data to the local file system. Any component in the collector may make use of this extension.
- Updates to `stanza/internal` to allow stanza-based receivers to use the extension for checkpoints.
- A new testbed scenario that has the filelogreceiver using the extension.

Configuration of the extension is simple:

```yaml
file_storage:
file_storage/all_settings:
  directory: /var/lib/otelcol/mydir
  timeout: 2s
```

The extension is made available to components via the `host` parameter in their `Start` method:

```go
func (r *receiver) Start(ctx context.Context, host component.Host) error {
	for _, ext := range host.GetExtensions() {
		if se, ok := ext.(storage.Extension); ok {
			client, err := se.GetClient(ctx, component.KindReceiver, r.NamedEntity)
			if err != nil {
				return err
			}
			r.storageClient = client
			return nil
		}
	}
	r.storageClient = storage.NewNopClient()
	...
}
```