feat: add job_retry argument to load_table_from_uri #969

Open

@tswast

Description

In internal issue 195911158, a customer is struggling to retry jobs that fail with "403 Exceeded rate limits: too many table update operations for this table". One can encounter this exception by attempting to run hundreds of load jobs in parallel.
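
For context, a rough reproduction sketch of that pattern (the project, dataset, bucket, and file names below are made up): hundreds of load jobs all targeting the same destination table in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical source files and destination table; the key detail is that
# every job updates the *same* table, which is what hits the per-table
# update rate limit.
uris = [f"gs://my-bucket/data/part-{i:04d}.csv" for i in range(500)]
table_id = "my-project.my_dataset.my_table"
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

def load_one(uri):
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    # Per thought 1 below, the open question is whether the 403 surfaces
    # here (result()) or at submission time (load_table_from_uri()).
    return job.result()

with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(load_one, uris))
```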

Thoughts:

  1. Try to reproduce. Does the exception happen at result() or at load_table_from_uri()? If it happens at result(), continue with the job_retry approach; otherwise, see if we can modify the default retry predicate for load_table_from_uri() to recognize this rate-limiting reason and retry.
  2. Assuming the exception does happen at result(), modify load jobs (or more likely the base job class) to retry when job_retry is set, similar to what we already do for query jobs. (A rough sketch of such a job-level retry follows this list.)
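
In the meantime, callers can approximate this themselves by wrapping submission plus result() in a google.api_core Retry, which is roughly what thought 2 would bake into the library. A sketch of that workaround (the predicate and backoff values are guesses, not library defaults):

```python
from google.api_core import exceptions, retry
from google.cloud import bigquery

client = bigquery.Client()

def _is_table_update_rate_limit(exc):
    # The failure reported in the issue is a 403 Forbidden whose message
    # mentions "Exceeded rate limits"; matching on the message is a crude
    # stand-in for inspecting the structured error reason.
    return isinstance(exc, exceptions.Forbidden) and "rate limits" in str(exc).lower()

# Retry the whole job, not just the HTTP call: each attempt submits a
# fresh load job and waits for it. This mirrors what job_retry already
# does for query jobs.
job_level_retry = retry.Retry(
    predicate=_is_table_update_rate_limit,
    initial=1.0,
    maximum=60.0,
    multiplier=2.0,
    deadline=600.0,
)

def load_with_job_retry(uri, table_id):
    def attempt():
        job = client.load_table_from_uri(uri, table_id)
        return job.result()

    return job_level_retry(attempt)()
```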

Notes:

  • I suspect we'll need a different default job_retry object for load_table_from_uri(), as the retryable reasons will likely differ from those we use for queries. (A rough sketch follows these notes.)
  • I don't think the other load_table_from_* methods are as retryable as load_table_from_uri(), since they would require rewinding file objects, which isn't always possible. We'll probably want to consider adding job_retry to those methods in the future, but for now load_table_from_uri is what's needed.
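
To make the first note concrete: by analogy with the existing DEFAULT_JOB_RETRY used for query jobs, a default for load jobs might look roughly like the sketch below. The name, the set of retryable reasons, and the backoff values are all assumptions to be validated, not the library's API.

```python
from google.api_core import retry

# Hypothetical set of reasons worth retrying for load jobs; this would
# need to be confirmed against the errors seen in practice (the case in
# this issue should map to a rate-limit reason such as "rateLimitExceeded").
_RETRYABLE_LOAD_JOB_REASONS = frozenset(
    {"rateLimitExceeded", "backendError", "internalError"}
)

def _load_job_should_retry(exc):
    # BigQuery job exceptions carry the raw error list on .errors; if no
    # structured reason is available, don't retry.
    errors = getattr(exc, "errors", None) or []
    reason = errors[0].get("reason") if errors else None
    return reason in _RETRYABLE_LOAD_JOB_REASONS

# Hypothetical default, analogous to DEFAULT_JOB_RETRY for queries.
DEFAULT_LOAD_JOB_RETRY = retry.Retry(
    predicate=_load_job_should_retry,
    initial=1.0,
    maximum=60.0,
    multiplier=2.0,
    deadline=600.0,
)
```

The call site would then look something like client.load_table_from_uri(uri, table_id, job_retry=DEFAULT_LOAD_JOB_RETRY), falling back to the default object above when the argument is omitted.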

Metadata
Labels: api: bigquery, type: feature request
