Avoid pulling entire result set into memory when constructing dataframe. #5870

Merged
merged 1 commit into from
Aug 30, 2018


tseaver
Contributor

@tseaver tseaver commented Aug 30, 2018

Closes #5859.

@tseaver tseaver added api: bigquery Issues related to the BigQuery API. performance labels Aug 30, 2018
@tseaver tseaver requested a review from tswast August 30, 2018 15:04
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Aug 30, 2018
@tseaver tseaver merged commit 1364a39 into master Aug 30, 2018
@tseaver tseaver deleted the 5859-bigquery-to_dataframe-reduce_footprint branch August 30, 2018 19:32
@max-sixty

Just came across this. Ping us over at pandas-gbq if you have other performance thoughts or ideas.

On this: I'm not sure this change will have that much impact, because pandas will coerce to in-memory Python values regardless. Parsing each page and loading it into an array makes a big difference, but I haven't implemented that, in the hope that BQ makes a step change in its APIs away from JSON-over-HTTP. Full description: googleapis/python-bigquery-pandas#133 (comment)
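For readers following along, here is a rough sketch of the "parse each page and load it into an array" idea mentioned above. It is not the code from this PR or from pandas-gbq: `fetch_pages`, the `rows` key, and the two-column schema are hypothetical stand-ins for paged JSON responses. Only the standard library is used, so each page's Python objects can be discarded as soon as its values are appended to compact typed arrays.

```python
import json
from array import array


def fetch_pages():
    """Hypothetical stand-in for a paged JSON-over-HTTP result set."""
    yield json.dumps({"rows": [{"id": 1, "score": 0.5}, {"id": 2, "score": 1.5}]})
    yield json.dumps({"rows": [{"id": 3, "score": 2.5}]})


def columns_from_pages(pages):
    """Build one typed array per column, consuming one page at a time."""
    ids = array("q")     # signed 64-bit integers
    scores = array("d")  # double-precision floats
    for raw_page in pages:
        # Only this page's rows exist as Python objects at any moment.
        rows = json.loads(raw_page)["rows"]
        ids.extend(row["id"] for row in rows)
        scores.extend(row["score"] for row in rows)
    return {"id": ids, "score": scores}


columns = columns_from_pages(fetch_pages())
```

Because `array` objects expose the buffer protocol, the resulting columns could presumably be wrapped by NumPy (for example with `numpy.frombuffer`) before building a dataframe, instead of materializing every row as a list of Python values first.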

4 participants