BigQuery: Change the default value of Cursor instances' arraysize attribute to None #9199

Merged: 2 commits merged on Sep 12, 2019

plamut (Contributor) commented Sep 10, 2019

Closes #9185.

This PR adds a note on the arraysize parameter and its impact on the performance of the fetchall() method.

To discuss

  • Following a comment on the issue, should we reanimate the fetchmany()'s size parameter, too?
    No (for the time being), as that would require implementing custom pagination logic in the client (comment).

How to test

Run the code sample from the issue description and verify that the new default value of cursor.arraysize avoids the reported performance issue. Alternatively, verify that manually setting cursor.arraysize to an appropriate value also avoids it.
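
A rough sketch of how such a check could be scripted. The helper name timed_fetchall is hypothetical, and the demo runs against an in-memory sqlite3 database purely as a stand-in DB-API cursor so that it is runnable without credentials. For the real check you would presumably pass a cursor obtained from google.cloud.bigquery.dbapi and the query from the issue description instead.

```python
import sqlite3
import time
from typing import Any, List, Optional, Tuple


def timed_fetchall(
    cursor: Any, query: str, arraysize: Optional[int] = None
) -> Tuple[List[Any], float]:
    """Run `query` on a DB-API cursor and time the fetchall() call.

    Hypothetical helper: if `arraysize` is None, the cursor's own default
    is left untouched; otherwise it is set before executing the query.
    """
    if arraysize is not None:
        cursor.arraysize = arraysize
    cursor.execute(query)
    start = time.perf_counter()
    rows = cursor.fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed


# Stand-in demo: sqlite3 is used only because it needs no credentials.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

# Once with the cursor's default arraysize, once with a manual value.
rows_default, t_default = timed_fetchall(cur, "SELECT n FROM t ORDER BY n")
rows_sized, t_sized = timed_fetchall(cur, "SELECT n FROM t ORDER BY n", arraysize=100)

print(f"default arraysize: {len(rows_default)} rows in {t_default:.4f}s")
print(f"arraysize=100:     {len(rows_sized)} rows in {t_sized:.4f}s")
```

With sqlite3 the two timings will be nearly identical, since no network paging is involved. The point of the sketch is only the shape of the comparison: same query, same rows, different arraysize.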

@plamut plamut added api: bigquery Issues related to the BigQuery API. type: docs Improvement to the documentation for an API. labels Sep 10, 2019
@plamut plamut requested a review from a team September 10, 2019 13:01
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Sep 10, 2019
tswast (Contributor) commented Sep 10, 2019

I completely forgot about the arraysize parameter!

Rather than add this note, I think it may make more sense to diverge from the DB-API spec (which dictates a default arraysize=1).

# Per PEP 249: The arraysize attribute defaults to 1, meaning to fetch
# a single row at a time.
self.arraysize = 1

If we instead default to None, the BigQuery API backend can choose a more appropriate page size automatically.
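
To illustrate why the default matters, here is a toy model (not the library's actual code) of a cursor whose fetchall() forwards arraysize as the page size of a paged row API. FakeBackend, ToyCursor, and the backend default of 1000 rows per page are all assumptions made up for this sketch; the real backend chooses its own page size.

```python
from typing import Iterator, List, Optional

# Hypothetical value: the real backend picks its own page size.
BACKEND_DEFAULT_PAGE_SIZE = 1000


class FakeBackend:
    """Toy stand-in for a paged row API that counts round trips."""

    def __init__(self, total_rows: int) -> None:
        self.total_rows = total_rows
        self.requests = 0

    def list_rows(self, page_size: Optional[int] = None) -> Iterator[List[int]]:
        # None means "let the backend choose the page size".
        size = page_size if page_size is not None else BACKEND_DEFAULT_PAGE_SIZE
        for start in range(0, self.total_rows, size):
            self.requests += 1  # one simulated request per page
            yield list(range(start, min(start + size, self.total_rows)))


class ToyCursor:
    """Toy cursor that passes its arraysize through as the page size."""

    def __init__(self, backend: FakeBackend, arraysize: Optional[int] = None) -> None:
        self.backend = backend
        self.arraysize = arraysize

    def fetchall(self) -> List[int]:
        rows: List[int] = []
        for page in self.backend.list_rows(page_size=self.arraysize):
            rows.extend(page)
        return rows


def requests_needed(total_rows: int, arraysize: Optional[int]) -> int:
    """Return how many simulated requests fetchall() makes."""
    backend = FakeBackend(total_rows)
    rows = ToyCursor(backend, arraysize).fetchall()
    assert len(rows) == total_rows
    return backend.requests


old_default = requests_needed(5000, 1)     # one request per row
new_default = requests_needed(5000, None)  # backend-chosen page size
print(old_default, new_default)
```

In this model, fetching 5000 rows takes 5000 requests with arraysize=1 and 5 requests with arraysize=None, which is the kind of gap the proposal is meant to close.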

plamut (Contributor, Author) commented Sep 10, 2019

> If we instead default to None, the BigQuery API backend can choose a more appropriate page size automatically.

I see, that actually makes sense. It's safer than just having it in the docs, as some portion of the users will probably still overlook it.

@plamut plamut added needs work This is a pull request that needs a little love. and removed type: docs Improvement to the documentation for an API. labels Sep 10, 2019
@plamut plamut requested a review from tswast September 11, 2019 18:21
@plamut plamut removed the needs work This is a pull request that needs a little love. label Sep 11, 2019
@plamut plamut changed the title from "BigQuery: Add performance note to fetchall() docs" to "BigQuery: Change the default value of Cursor instances' arraysize attribute to None" Sep 11, 2019
Let the backend pick the most appropriate size automatically, instead
of enforcing the size of 1 on it (despite this being a deviation from
PEP 249).
@plamut plamut merged commit 69ff0bf into googleapis:master Sep 12, 2019
emar-kar pushed a commit to MaxxleLLC/google-cloud-python that referenced this pull request Sep 18, 2019
…ribute to None (googleapis#9199)

* Add performance note to fetchall() docs

* Set default cursor arraysize to None

Let the backend pick the most appropriate size automatically, instead
of enforcing the size of 1 on it (despite this being a deviation from
PEP 249).
Issue closed by this pull request: BigQuery: DB-API is very slow