kvdb+channeld: extend kvdb with Prefetch for prefetching buckets in one go and speed up payment control by prefetching payments on hot paths #5640

Conversation
Force-pushed f3881ba to 096048a
@@ -159,6 +176,8 @@ func TestFirstLastNextPrev(t *testing.T) {
	require.NoError(t, err)

	apply := func(stm STM) error {
		stm.Prefetch(prefetchKeys, prefetchRange)
Does this add sufficient test coverage for the changes? Maybe scenarios like deleting prefetched keys, prefetching after a delete, overwriting prefetches, etc. are also candidates for testing.
I agree. I'll add some more tests.
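For illustration, those scenarios could be exercised along these lines. This is a self-contained sketch against a map-backed stand-in, not lnd's actual STM (whose `Prefetch` also takes ranges and whose `Get` returns an error); the point is only the observable behavior each case should assert:

```go
// Sketch only: fakeSTM is a toy stand-in for the etcd STM.
package main

import "fmt"

// STM mirrors the subset of the STM surface exercised below.
type STM interface {
	Prefetch(keys []string)
	Get(key string) []byte
	Put(key, val string)
	Del(key string)
}

// fakeSTM is the stand-in: Prefetch is a pure cache warm-up, which is
// the only behavior callers should be able to observe.
type fakeSTM struct{ kv map[string]string }

func (f *fakeSTM) Prefetch(keys []string) {} // warm-up only, no semantics
func (f *fakeSTM) Get(key string) []byte {
	if v, ok := f.kv[key]; ok {
		return []byte(v)
	}
	return nil
}
func (f *fakeSTM) Put(key, val string) { f.kv[key] = val }
func (f *fakeSTM) Del(key string)      { delete(f.kv, key) }

func main() {
	scenarios := map[string]func(stm STM) error{
		// Deleting a prefetched key: a later Get must miss.
		"delete after prefetch": func(stm STM) error {
			stm.Prefetch([]string{"a"})
			stm.Del("a")
			if v := stm.Get("a"); v != nil {
				return fmt.Errorf("deleted key visible: %q", v)
			}
			return nil
		},
		// Prefetching after a delete must not resurrect the key.
		"prefetch after delete": func(stm STM) error {
			stm.Del("a")
			stm.Prefetch([]string{"a"})
			if v := stm.Get("a"); v != nil {
				return fmt.Errorf("delete lost: %q", v)
			}
			return nil
		},
		// Overwriting a prefetched key: Get must see the new value.
		"overwrite after prefetch": func(stm STM) error {
			stm.Prefetch([]string{"a"})
			stm.Put("a", "new")
			if string(stm.Get("a")) != "new" {
				return fmt.Errorf("stale value after overwrite")
			}
			return nil
		},
	}

	for name, apply := range scenarios {
		stm := &fakeSTM{kv: map[string]string{"a": "old"}}
		if err := apply(stm); err != nil {
			fmt.Println(name, "FAIL:", err)
			continue
		}
		fmt.Println(name, "ok")
	}
}
```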
	RBucket

	// Prefetch will attempt to prefetch all values under a path.
	Prefetch(paths ...[]string)
I think this could use a better explanation of what `Prefetch` really does from the caller's point of view. It loads the given values/ranges into memory and holds them until they are requested via a `Get` in the same transaction? It would probably also be good to describe how this interacts with all the other bucket operations (`Delete`, `CreateBucket`, etc.) and whether the data remains consistent.

An extension of the generic kvdb-level tests would be a useful safeguard here too, and helps with documenting the behavior as well.

I am also wondering how to implement prefetching for Postgres. The bucket interface as proposed here gives a lot of freedom to call `Prefetch` wherever one wants. I think the implementation can get quite complicated (and risky) if the prefetched keys need to be kept consistent.

Isn't there a more minimalistic alternative that works just as well for performance? One idea I had is to convert `Prefetch` to something like `GetMulti` that directly returns all the buckets/data ranges that were fetched in an optimized way. Then it is clear that the data is just that, and doesn't live on in a background cache that needs to be kept consistent.
`GetMulti` could work, but in the case where we also want to spare the roundtrips that come from just nesting into buckets, how would that work semantically? Here we prefetch based on the payment hash, fetching several bucket keys, so whenever we nest we won't go to the DB again in the same transaction.

Edit: we also fetch all keys in the payment's bucket and all keys in the htlc attempts bucket in one go. How would we handle that? Return a nested map structure with all the key/values and buckets?

Edit (2): what about error cases, where the DB is corrupted or keys/buckets are missing?
Just a `GetMulti` that gets all keys in a bucket should be quite simple, but I don't know how much of the bottleneck would still be left, because indeed the bucket also needs to be reached first. Could be worth testing.

For including the downward traversal in the same transaction too, `GetMulti` might take a tree of keys as input, perhaps. In that case, can't you do the exact same thing that you do currently, just hidden? So in the implementation of `GetMulti` for `etcd`, you'd internally call `prefetch` first, followed by a (recursive) series of `get` calls to populate the result tree that you mention.

For Postgres, `GetMulti(tree)` in a single tx would be an interesting challenge. Either magic with subqueries, or a change to the db structure to make it possible to retrieve deeply nested keys in one step (using the sha256 ids, perhaps).

For errors, I'd think that just returning an error from `GetMulti` can be enough, because currently we probably also don't do anything special when a condition happens that isn't supposed to ever happen? Missing keys are probably happy flow, but I think that can be signaled in the result tree.

If something like this can work, I think it is worth looking into. The `Prefetch` approach may work for client-side transactions, but I think it gets hard when connected to database backends that manage the transaction server-side.
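To make the shape of that idea concrete, here is a sketch of a tree-based `GetMulti`. Every name and the bucket layout are hypothetical — this is not what the PR implements — but it shows the request tree, the mirrored result tree, and missing keys signaled by absence rather than by error:

```go
// Hypothetical GetMulti(tree): the request is a tree of bucket names and
// the result mirrors it with the values filled in.
package main

import (
	"fmt"
	"strings"
)

// Node describes one bucket in the request/result tree.
type Node struct {
	Values   map[string][]byte // leaf key/values in this bucket
	Children map[string]*Node  // nested buckets to descend into
}

// GetMulti resolves the whole tree. An etcd driver could implement it as
// a prefetch followed by recursive in-memory gets; a Postgres driver
// might need one (sub)query per level or a schema that flattens nesting.
func GetMulti(fetch func(path []string) map[string][]byte,
	path []string, req *Node) *Node {

	res := &Node{
		Values:   fetch(path), // one logical range read per bucket
		Children: make(map[string]*Node),
	}
	for name, child := range req.Children {
		// Copy the path to avoid sharing the backing array.
		childPath := append(append([]string(nil), path...), name)
		res.Children[name] = GetMulti(fetch, childPath, child)
	}
	return res
}

func main() {
	// Fake backend: a payment bucket with an htlc-attempts sub-bucket.
	data := map[string]map[string][]byte{
		"payments/h1":               {"creation-info": []byte("...")},
		"payments/h1/htlc-attempts": {"0": []byte("attempt-0")},
	}
	fetch := func(path []string) map[string][]byte {
		return data[strings.Join(path, "/")]
	}

	req := &Node{Children: map[string]*Node{"htlc-attempts": {}}}
	res := GetMulti(fetch, []string{"payments", "h1"}, req)

	fmt.Println(string(res.Values["creation-info"]),
		string(res.Children["htlc-attempts"].Values["0"]))
}
```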
I'm wondering if we actually need the `Prefetch` for Postgres? I recall having stable performance on Postgres, where seemingly the roundtrips did not bring any slowdowns over time.
How about adding both `Prefetch` and `GetMulti`? `GetMulti` would simply work on the current bucket, so we'd at least avoid the roundtrip overhead coming from the cursor.

Edit: there's a nice synergy in that `Prefetch` helps for etcd and `GetMulti` simply returns the prefetched items, whereas for Postgres `GetMulti` would just reduce roundtrips for the current bucket.

Edit (2): perhaps `GetAll` is better naming than `GetMulti` in this case.
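A sketch of what that current-bucket `GetAll` could look like, again with hypothetical names. The in-memory bucket and the `ForEach`-based implementation are stand-ins; a real driver would answer from the prefetch cache (etcd) or a single query for all rows under the bucket (Postgres):

```go
// Hypothetical GetAll: drain the current bucket in one bulk fetch
// instead of cursor iteration (a round trip per step on some backends).
package main

import "fmt"

type kv struct{ key, val []byte }

// Bucket stands in for kvdb.RBucket; only ForEach matters here.
type Bucket interface {
	ForEach(func(k, v []byte) error) error
}

// GetAll collects every key/value in the current bucket. This reference
// version just wraps ForEach; the point of a native driver version would
// be answering from the prefetch cache or a single query.
func GetAll(b Bucket) ([]kv, error) {
	var items []kv
	err := b.ForEach(func(k, v []byte) error {
		items = append(items, kv{key: k, val: v})
		return nil
	})
	return items, err
}

// memBucket is a trivial in-memory bucket so the sketch runs.
type memBucket map[string]string

func (m memBucket) ForEach(f func(k, v []byte) error) error {
	for k, v := range m {
		if err := f([]byte(k), []byte(v)); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	items, _ := GetAll(memBucket{"0": "attempt-0", "1": "attempt-1"})
	fmt.Println(len(items), "items")
}
```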
Postgres doesn't need it for stability as far as I know, but reducing round trips should surely increase performance. If the current gain for `etcd` can also be achieved with a slightly different interface that is compatible with server-side transactions, I think that is a win.

Has the root cause of the etcd slowdown been found in the meantime? Maybe the prefetching fixes the problem on some machines but not on others. Or there are other unknown factors that could cause it to return on nodes with a different life than the benchmark nodes.
If just `GetAll` is sufficient for `postgres`, I'd think that it should also be sufficient for `etcd`? The round trip itself is likely the most costly part, and I'd think that round-trip counts should be comparable across backends.

It would also probably make the delta of `stm.go` smaller and save time in adding sufficient test coverage for all the ways in which `Prefetch` can be used.
Seems this PR is moving into its final stages, but shouldn't the `Prefetch` API be compatible with database backends that support server-side transactions? @Roasbeef
I think the idea is that we have `Prefetch` as an optional requirement (see `kvdb/interface.go`), and we could extend it (in a separate PR) with `GetAll` as suggested. The reasoning is that `Prefetch` makes it possible to fetch a payment in one roundtrip, while `GetAll` can optimize `ForEach` calls.
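For context, the optional-requirement pattern referenced here is typically expressed in Go as a type assertion against an extension interface. The names below are illustrative, not necessarily those in `kvdb/interface.go`:

```go
// Sketch of the optional-extension pattern: the base interface stays
// untouched, drivers opt in, and callers feature-test at runtime.
package main

import "fmt"

// RBucket stands in for the existing read-bucket interface.
type RBucket interface {
	Get(key []byte) []byte
}

// PrefetchRBucket is the optional extension a driver may implement.
type PrefetchRBucket interface {
	RBucket
	Prefetch(paths ...[]string)
}

// Prefetch degrades to a no-op on backends (e.g. bbolt) that don't
// implement the extension, so call sites stay backend-agnostic.
func Prefetch(b RBucket, paths ...[]string) {
	if pb, ok := b.(PrefetchRBucket); ok {
		pb.Prefetch(paths...)
	}
}

// boltBucket has no Prefetch method, so the extension is absent.
type boltBucket struct{}

func (boltBucket) Get(key []byte) []byte { return nil }

func main() {
	var b RBucket = boltBucket{}
	Prefetch(b, []string{"payments"}) // silently a no-op here
	fmt.Println("done")
}
```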
If optimal performance requires getting a payment in one round trip (which makes sense), how can a separate PR that adds `GetAll` deliver that? It still won't be a single round trip. Implementing `Prefetch` as it is currently proposed isn't feasible for server-side transaction-based systems, in my opinion.
Did a first pass to understand the logic added to the STM code. I think I now understand most of it 😅 Nice work with this to reduce the number of round trips needed 🎉
Force-pushed 096048a to 72e10cc
Visit https://dashboard.github.orijtech.com?back=0&pr=5640&remote=true&repo=bhandras%2Flnd to see benchmark details.

Can be rebased now with the dependent PR merged!
Completed an initial pass, def need a few more to ensure I fully grok the proposed changes.
Commit wise, I think things may be easier to follow if one commit moves the existing slice of the read set into the b-tree, then subsequent commits introduce the explicit prefetch, with the new methods being added in a final step.
@@ -452,20 +652,8 @@ func (s *stm) Get(key string) ([]byte, error) {
	return []byte(put.val), nil
}

// Populate read set if key is present in
So a main difference here is that the prefetch set is declared up front rather than reduced from the initial read and write sets?
Yeah, the original code is a fork of etcd's own STM implementation, which has a separate prefetch set to optimize for when we read less on the next retry (to reduce the number of cmp ops on commit).

This (reading less upon retry) isn't very common in our case, so I thought it might be a good idea to reduce complexity and just get rid of the prefetch set. From my tests it seems that we actually retry very rarely, so imho it's not a super important optimization to have.
deduced*
	fetchPrefixes := make([]string, 0, len(prefixes))
	for _, prefix := range prefixes {
		if s.rset.hasFullRange(prefix) {
Since the read set has just been created and we haven't prefetched yet, won't this always be false, given that we wouldn't have pulled in any of these keys yet?
This is to make sure that if the user calls `Prefetch` with already-prefetched keys/ranges in the same transaction, we won't fetch them again. One such example is when we retry.
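A minimal sketch of that bookkeeping, assuming a plain set of prefixes (the real `stm.go` keeps the fetched values in a b-tree): a prefix that was already fetched in full is filtered out before the STM issues its range read.

```go
// Sketch of the full-range dedup: once a prefix is fetched in its
// entirety, later Prefetch calls (e.g. on retry) skip the round trip.
package main

import "fmt"

type readSet struct {
	fullRanges map[string]struct{} // prefixes fetched in full
}

func (r *readSet) hasFullRange(prefix string) bool {
	_, ok := r.fullRanges[prefix]
	return ok
}

// filterPrefixes drops prefixes that are already cached before the STM
// issues its (single) range read to etcd.
func (r *readSet) filterPrefixes(prefixes ...string) []string {
	fetch := make([]string, 0, len(prefixes))
	for _, prefix := range prefixes {
		if r.hasFullRange(prefix) {
			continue // everything under this prefix is cached
		}
		fetch = append(fetch, prefix)
		r.fullRanges[prefix] = struct{}{}
	}
	return fetch
}

func main() {
	r := &readSet{fullRanges: make(map[string]struct{})}
	fmt.Println(r.filterPrefixes("payments/")) // [payments/]: hits etcd
	fmt.Println(r.filterPrefixes("payments/")) // []: retry uses the cache
}
```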
Force-pushed 4a89b84 to 445cc15
Thanks to all reviewers! As next steps I'll break the commit structure up a bit more and add more tests as Joost suggested.
Force-pushed 445cc15 to faf6128
Great work, really looking forward to the performance improvement with this.
The etcd itest is still failing, but with the known flake in the `update_channel_policy` test, which seems not to be affected by any changes in this PR.
This commit adds more full-range caching to the STM such that we'll be able to prefetch keys and ranges on the kvdb level. This will allow us to reduce roundtrips to etcd and do fast iteration of small ranges.

This commit extends the kvdb interface in a backwards-compatible way such that we'll be able to prefetch all keys in a bucket in one go, reducing the number of roundtrips.
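Based on the `Prefetch(paths ...[]string)` signature quoted earlier, usage on the payment hot path might look roughly like this. The bucket layout, key names, and in-memory backend are illustrative only:

```go
// Sketch only: names and layout are assumptions, and memBucket is a toy
// backend standing in for the etcd driver.
package main

import "fmt"

// RBucket stands in for kvdb's read bucket plus the new optional hook.
type RBucket interface {
	NestedReadBucket(key []byte) RBucket
	Get(key []byte) []byte
	// Prefetch pulls every value under the given paths into the
	// transaction's read set in one round trip, so the nested reads
	// that follow are served from memory.
	Prefetch(paths ...[]string)
}

// memBucket is a trivial backend so the sketch runs; a real driver would
// issue a single etcd range read per Prefetch call instead.
type memBucket struct {
	values   map[string][]byte
	children map[string]*memBucket
}

func (m *memBucket) NestedReadBucket(key []byte) RBucket {
	return m.children[string(key)]
}

func (m *memBucket) Get(key []byte) []byte { return m.values[string(key)] }

func (m *memBucket) Prefetch(paths ...[]string) {} // warm-up only

func main() {
	payments := &memBucket{children: map[string]*memBucket{
		"h1": {
			values: map[string][]byte{
				"creation-info": []byte("..."),
			},
			children: map[string]*memBucket{
				"htlc-attempts": {values: map[string][]byte{
					"0": []byte("attempt-0"),
				}},
			},
		},
	}}

	// Warm the payment bucket and its htlc attempts in one go.
	payments.Prefetch(
		[]string{"h1"},
		[]string{"h1", "htlc-attempts"},
	)

	// From here on, nesting and gets need no further round trips.
	bucket := payments.NestedReadBucket([]byte("h1"))
	fmt.Printf("%s / %s\n",
		bucket.Get([]byte("creation-info")),
		bucket.NestedReadBucket([]byte("htlc-attempts")).
			Get([]byte("0")),
	)
}
```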
Force-pushed faf6128 to 23ded04
LGTM 💣
This PR is a reworked and cleaned-up split from #5392. With this API extension, `kvdb` drivers are able to add bucket prefetches to reduce the number of round trips later on in a given transaction. The PR adds this functionality to the `etcd` driver and demonstrates the speedup it provides on the flattened htlc attempts bucket.

The PR builds on: #5635