-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement preallocated buffer #996
Conversation
y/iterator.go
Outdated
// this function exists is to avoid creating byte arrays per key-value pair in | ||
// table/builder.go. | ||
func (v *ValueStruct) EncodeToYBuf(buf *Buffer) { | ||
buf.WriteByte(v.Meta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of buf.WriteByte
is not checked (from errcheck
)
y/iterator.go
Outdated
// table/builder.go. | ||
func (v *ValueStruct) EncodeToYBuf(buf *Buffer) { | ||
buf.WriteByte(v.Meta) | ||
buf.WriteByte(v.UserMeta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of buf.WriteByte
is not checked (from errcheck
)
y/iterator.go
Outdated
buf.WriteByte(v.UserMeta) | ||
var enc [binary.MaxVarintLen64]byte | ||
sz := binary.PutUvarint(enc[:], v.ExpiresAt) | ||
buf.Write(enc[:sz]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of buf.Write
is not checked (from errcheck
)
y/iterator.go
Outdated
var enc [binary.MaxVarintLen64]byte | ||
sz := binary.PutUvarint(enc[:], v.ExpiresAt) | ||
buf.Write(enc[:sz]) | ||
buf.Write(v.Value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of buf.Write
is not checked (from errcheck
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 5 files at r2, 9 of 9 files at r4.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @ashish-goswami and @jarifibrahim)
table/builder.go, line 225 at r4 (raw file):
+---------+------------+-----------+---------------+ */ func (b *Builder) Finish() *y.Buffer {
Don't change the output from this. Keep it to []byte
.
y/y.go, line 355 at r4 (raw file):
// in chunks(pages). Once a chunk(page) is full, it will allocate double the size of previous // chunk. Its function are not thread safe. type Buffer struct {
In the first PR, can you just introduce the Buffer: Call it PageBuffer and tests for it.
y/y.go, line 472 at r4 (raw file):
// TODO: reader can be removed. func (b *Buffer) NewReader() io.Reader {
You can remove this, if not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 7 files reviewed, 7 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
table/builder.go, line 225 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don't change the output from this. Keep it to
[]byte
.
Done.
y/y.go, line 355 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
In the first PR, can you just introduce the Buffer: Call it PageBuffer and tests for it.
Done.
y/y.go, line 472 at r4 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
You can remove this, if not needed.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ A review job has been created and sent to the PullRequest network.
@ashish-goswami you can click here to see the review status or cancel the code review job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 10 files at r5.
Reviewable status: 3 of 7 files reviewed, 10 unresolved discussions (waiting on @ashish-goswami and @manishrjain)
y/y.go, line 352 at r5 (raw file):
// PageBuffer consists of many pages. A page is a wrapper over []byte. PageBuffer can act as a // replacement of bytes.Buffer. Instead of having single underlying buffer, it has multiple // underlying buffers. Hence it avoids any copy during relocation(as happenes in bytes.Buffer).
Typo: happenes => happens
y/y.go, line 353 at r5 (raw file):
// replacement of bytes.Buffer. Instead of having single underlying buffer, it has multiple // underlying buffers. Hence it avoids any copy during relocation(as happenes in bytes.Buffer). // Buffer allocates memory in pages. Once a page is full, it will allocate page with double the size
PageBuffer allocates memory in pages...
y/y.go, line 456 at r5 (raw file):
} // WriteTo writes whole buffer to w. It returns number of bytes returned and any error encountered.
typo: returned => written
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 7 files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
y/y.go, line 352 at r5 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Typo: happenes => happens
Done.
y/y.go, line 353 at r5 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
PageBuffer allocates memory in pages...
Done.
y/y.go, line 456 at r5 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
typo: returned => written
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the changes here look like a good optimization. Although left one comment that you may want to double check usage or add more documentation around the shared buffer used for ReadAt
to avoid any issues in the future.
Reviewed with ❤️ by PullRequest
y/y.go
Outdated
if length > cap(b.readBuf) { | ||
b.readBuf = make([]byte, length) // Allocate whole buffer at start. | ||
} else { | ||
b.readBuf = b.readBuf[:length] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you mentioned on the PageBuffer
that its functions aren't thread safe. Although it also seems like multiple calls to ReadAt
will potentially mutate the data returned by previous calls since this b.readBuf
's underlying data is shared between calls. This might be worth adding to the docstring for this method and PageBuffer
because I could see this being pretty surprising and have data leakage/security issues if a buffer was ever shared across accounts or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point. Thanks for pointing it out. Will add to comments.
y/y.go
Outdated
// NewPageBuffer returns a new PageBuffer with first page having size pageSize. | ||
func NewPageBuffer(pageSize int) *PageBuffer { | ||
b := &PageBuffer{curPageSize: pageSize} | ||
b.pages = make([]*page, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this line is redundant currently since the append
on the line below will do this even if b.pages
is nil
, although it is fine to keep it if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it. Thanks.
@@ -340,3 +341,128 @@ func BytesToU32Slice(b []byte) []uint32 { | |||
hdr.Data = uintptr(unsafe.Pointer(&b[0])) | |||
return u32s | |||
} | |||
|
|||
// page struct contains one underlying buffer. | |||
type page struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you plan to add more items to this struct or would it make sense to just do something like this instead:
type page []byte
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might add some functions to page.
y/iterator.go
Outdated
// EncodeToPageBuf should be kept in sync with the Encode function above. The reason | ||
// this function exists is to avoid creating byte arrays per key-value pair in | ||
// table/builder.go. | ||
func (v *ValueStruct) EncodeToPageBuf(buf *PageBuffer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish there was a bytes.Buffer
interface that could be used instead of having to duplicate the code and keep these in sync but otherwise. But makes sense and thanks for adding the comment to this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. PageBuffer is experimental as of now. Once we decide to use it in Builder, we can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the usage in table builder for now, as we had discussed. Let's get the library in. Then we can figure out if it is worth using it in various places.
Reviewable status: 2 of 8 files reviewed, 13 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @pullrequest[bot])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 8 files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @pullrequest[bot])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 10 files at r5, 1 of 3 files at r7, 1 of 1 files at r8, 2 of 3 files at r9.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pullrequest[bot])
y/y.go, line 361 at r9 (raw file):
length int // Length of PageBuffer. curPageSize int // Size of last page allocated.
nextPageSize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to inactivity, PullRequest has cancelled this review job. You can reactivate the code review job from the PullRequest dashboard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just merge the y.PageBuffer one.
Reviewable status: 4 of 6 files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pullrequest[bot])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding so many tests. 👍
Reviewed 3 of 10 files at r5, 2 of 3 files at r7, 1 of 1 files at r8, 2 of 3 files at r9, 2 of 2 files at r10.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pullrequest[bot])
y/y.go, line 381 at r10 (raw file):
break } data = data[n:]
I don't think we should do this. The caller of write
would never expect the passed sliced to be modified in any way. The documentation for io.Writer
interface clearly states
Write must not modify the slice data, even temporarily.
https://golang.org/pkg/io/#Writer
y/y.go, line 392 at r10 (raw file):
// WriteByte writes data byte to PageBuffer and returns any encountered error. func (b *PageBuffer) WriteByte(data byte) error { _, err := b.Write([]byte{data}) // Can be changed later.
I don't understand the comment. What can be changed later?
y/y_test.go, line 42 at r10 (raw file):
rand.Seed(time.Now().Unix()) var bytesBuffer bytes.Buffer // This is just of verifying result.
just of for verifying the result.
y/y_test.go, line 144 at r10 (raw file):
require.NoError(t, err, "unable to read error") require.True(t, n == len(rb), "length read should be equal") // Read same number of bytes using ReaderAt.
This comment looks outdated.
y/y_test.go, line 152 at r10 (raw file):
require.NoError(t, err, "unable to read error") require.True(t, n == 10, "length read should be equal") // Read same number of bytes using ReaderAt.
This comment looks outdated.
y/y_test.go, line 158 at r10 (raw file):
// Check if EOF is returned at end or not. n, err = reader.Read(rb[:10]) require.Equal(t, err, io.EOF, "EOF should be returned at end")
Can you add another check
require.Zero(t, n, "read should return zero bytes")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @ashish-goswami, @manishrjain, and @pullrequest[bot])
y/y.go, line 381 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
I don't think we should do this. The caller of
write
would never expect the passed sliced to be modified in any way. The documentation forio.Writer
interface clearly statesWrite must not modify the slice data, even temporarily.
My bad. The underlying data slice is not being modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 6 files reviewed, 15 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @pullrequest[bot])
y/y.go, line 361 at r9 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
nextPageSize
Done.
y/y.go, line 392 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
I don't understand the comment. What can be changed later?
Removed it.
y/y_test.go, line 42 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
just
offor verifying the result.
Done.
y/y_test.go, line 144 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
This comment looks outdated.
Done.
y/y_test.go, line 152 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
This comment looks outdated.
Done.
y/y_test.go, line 158 at r10 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Can you add another check
require.Zero(t, n, "read should return zero bytes")
Done.
Ran this on both branch
Master
This PR
Benchstats
This change is