-
Notifications
You must be signed in to change notification settings - Fork 53
Tests for unsharding PR #99
Tests for unsharding PR #99
Conversation
dafe0c3
to
9fbc4b7
Compare
912b849
to
790373d
Compare
068bcdd
to
3301178
Compare
445e541
to
444e5b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some general comments on approach here since it looks like the PR is still a WIP.
// FIXME: Remove. We don't actually store "value nodes". This confusing | ||
// abstraction just removes the maxpadlen from the link names to extract | ||
// the actual value link the trie is storing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of your comment here is that val.Name
and maxpadlen
can be used to compute key
and so we shouldn't bother caching it here and just recompute it when needed. Is that correct?
Also, can you clarify if this a FIXME intended for this set of PRs?
hamt/hamt.go
Outdated
// EnumLinksAsync returns a channel which will receive Links in the directory | ||
// as they are enumerated, where order is not guaranteed | ||
func (ds *Shard) EnumAllAsync(ctx context.Context) <-chan format.LinkResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't needed anymore given EnumLinksAsync
should be concurrent, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems not, and also seems to be unused. Removing
hamt/util.go
Outdated
// the fake hash as in io.SetAndPrevious through `newHashBits()` and pass | ||
// it as an argument making the hash independent of tree manipulation; that | ||
// sounds as the correct way to go in general and we wouldn't need this.) | ||
func CreateCompleteHAMT(ds ipld.DAGService, treeHeight int) (ipld.Node, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
functions only used by go tests should live in _test.go files if possible.
Note: if you're running into dependency cycle issues or issues with breaking the API a tool that may be your friend here is the internal
package which allows things to be exported for use within the module, but not outside of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if you're running into dependency cycle issues or issues with breaking the API a tool that may be your friend here is the internal package which allows things to be exported for use within the module, but not outside of it.
Could you expand a bit more on that? I'm not hitting many useful results Googling golang internal package import cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of using the internal package (although they probably abuse it) is in go-bitswap e.g. https://github.com/ipfs/go-bitswap/blob/master/internal/peermanager/peermanager.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7a7204f
to
16a96a3
Compare
599c09e
to
4568234
Compare
4568234
to
d60be37
Compare
* wip * fix CreateCompleteHAMT test * relax test constraint * fix value of nodes fetched in test according to a BFS walk
* move CreateCompleteHAMT to an internal package * change package name: internal -> private * move hash function to internal pacakge * make link size function internal * make shard width private * add internal * go fmt * fix import order * export DefaultShardWidth back again * move IdHash to private package * move LinkSizeFunction to private * add linksize
hamt/hamt.go
Outdated
// FIXME: Check which functions do we need to actually expose. | ||
func (ds *Shard) EnumAll(ctx context.Context) ([]*ipld.Link, error) { | ||
var links []*ipld.Link | ||
|
||
linkResults := ds.EnumAllAsync(ctx) | ||
|
||
for linkResult := range linkResults { | ||
if linkResult.Err != nil { | ||
return links, linkResult.Err | ||
} | ||
links = append(links, linkResult.Link) | ||
} | ||
return links, nil | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this function is a duplicate of EnumLinks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@schomatis I pushed a commit removing this
hamt/hamt.go
Outdated
internal.HAMTHashFunction = murmur3Hash | ||
} | ||
|
||
func (ds *Shard) IsValueNode() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be unexported again, it's not used anywhere outside this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@schomatis I pushed a commit for this
io/directory.go
Outdated
@@ -316,11 +344,11 @@ func (d *BasicDirectory) GetCidBuilder() cid.Builder { | |||
} | |||
|
|||
// SwitchToSharding returns a HAMT implementation of this directory. | |||
func (d *BasicDirectory) SwitchToSharding(ctx context.Context) (*HAMTDirectory, error) { | |||
func (d *BasicDirectory) SwitchToSharding(ctx context.Context, shardWidth int) (*HAMTDirectory, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change that will have to be propagated upstream, but seems reasonable 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT this isn't used by tests or anywhere else. What was the reason to make this change, just to align with hamt.NewShard()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can revert it back if you want.
- Add tests for automatic unsharding - Modified some internals to be sufficiently extensible for testing
Tests for #94.
Concurrent test being worked independently in #106.
Preliminary design review requested:
io/directory_test.go
, particularlyTestHAMTEnumerationWhenComputingSize
andcountGetsDS
.(Tests failing for Ubuntu but that might be expected, see more details inTestHAMTEnumerationWhenComputingSize
in linked file.)sizeBelowThreshold
logic and making sure we are not fetching more nodes than we should. Making this the priority as it's the trickiest part to test since it's an optimization and the unsharding will work just the same without it which makes problems here harder to detect (the rest is evaluating just if the directory switched from HAMT to Basic on different scenarios, seeTestUpgradeableDirectory
from base PR as an illustration).FIXMES
and rough edges around internal interfaces to simplify HAMT logic and make it easier to test. Addressing them now it's not a priority of this review but any feedback welcome. Priority was to get something working ASAP for an early review.