Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gensupport: Allow user to provide his own buffer used for uploading #632

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

zimnx
Copy link

@zimnx zimnx commented Aug 20, 2020

When uploading lots of files using ObjectStorage.Insert(), each file allocates his own buffer. This results in increased CPU usage and high RSS memory.

WithBuffer allows user to provide his own buffer which will be used for streaming.

Graph below shows how many allocations were done during upload of ~30k of 10MB files:
Selection_132
With this change, there're no allocations at all on library side (pool.New is code on caller side)
Selection_135

@zimnx zimnx requested a review from a team as a code owner August 20, 2020 15:32
@google-cla
Copy link

google-cla bot commented Aug 20, 2020

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no This human has *not* signed the Contributor License Agreement. label Aug 20, 2020
@zimnx
Copy link
Author

zimnx commented Aug 20, 2020

@googlebot I signed it!

@google-cla google-cla bot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Aug 20, 2020
@codyoss
Copy link
Member

codyoss commented Aug 20, 2020

@zimnx Can you please create an issue for this PR. We like to discuss features on issues before implementing them. Thanks!

Copy link
Contributor

@tritone tritone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I'd also appreciate a feature request issue opened so we can discuss on there.

This seems like a good start to this feature, but since these changes are only for gensupport it would require additional work to expose this option through the generated clients and subsequently the manual google-cloud-go clients.

@zimnx
Copy link
Author

zimnx commented Aug 25, 2020

Added #638

@zimnx zimnx force-pushed the mz/gcp branch 2 times, most recently from 42cd33c to e5cfc33 Compare August 28, 2020 08:49
@zimnx
Copy link
Author

zimnx commented Aug 28, 2020

any chance to merge it soon? Other MediaOptions are not directly used in generated clients, there are only mentions about them in comments around Media function where they can be used.

@tritone
Copy link
Contributor

tritone commented Aug 28, 2020

any chance to merge it soon? Other MediaOptions are not directly used in generated clients, there are only mentions about them in comments around Media function where they can be used.

Ahh I see I was mistaken about that-- so it will just be the manual client writer that will need to be updated for storage (and maybe bigquery).

I'm going to do a bit more discussion with my colleagues and let you know. You'd also definitely have to add a test for the new media option to

func TestNewInfoFromMedia(t *testing.T) {
(but no need to do that now, let me make sure I have approval on this direction first).

internal/gensupport/buffer.go Show resolved Hide resolved
internal/gensupport/media_test.go Show resolved Hide resolved
googleapi/googleapi.go Outdated Show resolved Hide resolved
googleapi/googleapi.go Show resolved Hide resolved
@zimnx zimnx force-pushed the mz/gcp branch 2 times, most recently from e66a82a to ecefbdc Compare September 1, 2020 09:15
@zimnx
Copy link
Author

zimnx commented Sep 14, 2020

Hi folks, any updates regarding merging this? :)

@zimnx
Copy link
Author

zimnx commented Sep 28, 2020

@tritone @codyoss kind reminder

mauri870 added a commit to mauri870/patches that referenced this pull request Dec 5, 2020
This patch adds the hability to specify a user-defined buffer to be
used for file uploads to Google Storage in order to reduce memory
allocations. The user needs to keep track of this buffer somehow,
normally with sync.Buffer, and ensure it's not used by multiple
goroutines.

This patch depends on this one from googleapis:
googleapis/google-api-go-client#632
@danp
Copy link

danp commented Jan 30, 2023

👋 Regularly seeing high heap/allocations which this PR could help. Anything else needed to get this merged?

@danp
Copy link

danp commented Jun 12, 2023

If it helps anyone: we "fixed" this by setting ChunkSize=0 on all our Writers, which disables use of the buffer entirely. Since we do our own retries elsewhere that seems fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants