BigQuery: [Primitive-ize API] SchemaField #8196

Closed
max-sixty opened this issue May 30, 2019 · 6 comments · Fixed by #9550

@max-sixty

Consistent with the recent work to allow the API to accept some primitives as well as specific objects, the schema argument to LoadJobConfig* could accept a list of dicts. Currently it requires constructing a bunch of SchemaField objects:

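    from google.cloud import bigquery

    # column_dicts, empty_path, gbq_client and table_ref are defined elsewhere.
    schema = [
        # could be just column_dicts
        bigquery.schema.SchemaField.from_api_repr(column_dict)
        for column_dict in column_dicts
    ]

    with open(empty_path, "rb") as source_file: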
        job_config = bigquery.job.LoadJobConfig(
            schema=schema, write_disposition="WRITE_TRUNCATE"
        )

        job = gbq_client.load_table_from_file(
            source_file,
            table_ref,
            location="US",
            job_config=job_config,
        )

We're trying to move to the BQ Python API rather than subprocessing out to a shell with bq. It has gotten much better, but it's still a bit Java-esque, and cases like this make it a tougher sell than constructing a string and sending it to bash. As ever, lmk if the API isn't designed for these cases and you'd encourage users to use bash.

*this could also be a dict, though less of an imperative.

@max-sixty
Author

One API I find nice and Pythonic is the Kubernetes Python API. For each object you can supply either:

  • An object (of which there are literally hundreds...)
  • A primitive, such as a dict / string

...and the library will do the appropriate coercions.
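
To make the "object or primitive" idea concrete, here is a minimal sketch of that kind of coercion. `Field` and `coerce_field` are hypothetical names used only for illustration; `Field` is a stand-in for a typed object like SchemaField, not the real class, and this is not how either library actually implements it.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a typed API object such as SchemaField."""

    name: str
    field_type: str
    mode: str = "NULLABLE"

    @classmethod
    def from_api_repr(cls, resource: Mapping) -> "Field":
        # Build the object from its dict (API resource) representation.
        return cls(
            name=resource["name"],
            field_type=resource["type"],
            mode=resource.get("mode", "NULLABLE"),
        )


def coerce_field(value: Union[Field, Mapping]) -> Field:
    """Return value unchanged if it is already a Field, else build one from a mapping."""
    if isinstance(value, Field):
        return value
    if isinstance(value, Mapping):
        return Field.from_api_repr(value)
    raise TypeError(f"expected Field or mapping, got {type(value).__name__}")
```

With something like this at the API boundary, a caller could pass either `Field("age", "INTEGER")` or `{"name": "age", "type": "INTEGER"}` and get the same object back.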

@tswast added the "api: bigquery" and "type: feature request" labels on May 31, 2019
@tswast
Contributor

tswast commented May 31, 2019

We did recently add a slightly higher-level method for converting from JSON schema to list of SchemaField. I think it's reasonable to automatically convert in some of the classes/methods that currently accept a list of SchemaField.

Might be a little bit tricky, though, since we can't do type inference since they'd both be lists.
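
One way around the "both are lists" problem, sketched under assumptions: ignore the container type and check each element instead. `normalize_items` and `Col` below are hypothetical names for illustration, and the converter is passed in so the sketch does not depend on the real SchemaField class.

```python
from collections import namedtuple
from collections.abc import Mapping
from typing import Any, Callable, Iterable, List


def normalize_items(
    items: Iterable[Any], from_mapping: Callable[[Mapping], Any]
) -> List[Any]:
    """Convert mapping elements with from_mapping; pass other elements through as-is."""
    return [
        from_mapping(item) if isinstance(item, Mapping) else item
        for item in items
    ]


# Hypothetical stand-in for a schema field object (a namedtuple is not a Mapping).
Col = namedtuple("Col", ["name", "type"])

# A mixed list: one dict and one already-constructed object.
mixed = [{"name": "a", "type": "STRING"}, Col("b", "INTEGER")]
normalized = normalize_items(mixed, lambda d: Col(d["name"], d["type"]))
```

The per-element check also handles mixed lists, at the cost of one `isinstance` call per item.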

@max-sixty
Author

> We did recently add a slightly higher-level method for converting from JSON schema to list of SchemaField

Nice, thanks. What's the method?

> we can't do type inference since they'd both be lists.

Yes, good point. I think the K8s API rebuilds the whole graph of objects, serializing / de-serializing where needed. That's not realistic to start, though. A hand-written imperative check doesn't seem satisfying either...

@tswast
Contributor

tswast commented May 31, 2019

Client.schema_from_json(file_or_path) is the related method.

Not quite what you're asking for though, since it's for files.

Edit: updated link to googleapis.dev

@max-sixty
Author

Perfect, thanks!

@tswast
Contributor

tswast commented May 31, 2019

There's precedent for accepting JSON input even in the gRPC-based google-cloud-python libraries, so I'm hopeful there's a way we can do this in general. Maybe wherever we call to_api_repr in the client, try that and catch AttributeError. If AttributeError is raised, treat the value as the resource itself. Could make a good function in _helpers.py.
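
A rough sketch of what such a helper might look like. The name `resource_of` and the `FakeField` class are hypothetical, made up for this illustration; nothing here is taken from the actual _helpers.py.

```python
from typing import Any


def resource_of(value: Any) -> Any:
    """Return value.to_api_repr() if the method exists, else treat value as the resource."""
    try:
        to_api_repr = value.to_api_repr
    except AttributeError:
        # No such method: assume the caller passed the raw resource (e.g. a dict).
        return value
    return to_api_repr()


class FakeField:
    """Hypothetical stand-in for an object that exposes to_api_repr()."""

    def __init__(self, name: str, field_type: str) -> None:
        self.name = name
        self.field_type = field_type

    def to_api_repr(self) -> dict:
        return {"name": self.name, "type": self.field_type}
```

Only the attribute lookup sits inside the `try`, so an AttributeError raised from within a real `to_api_repr` implementation would still surface instead of being silently treated as "this is a raw resource".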
