disambiguate missing policy tags from explicitly unset policy tags #981
Comments
I manually tested, and it appears that empty policy tags are OK.

from google.cloud import bigquery
client = bigquery.Client()
load_job_config = bigquery.LoadJobConfig()
load_job_config.schema = [
bigquery.SchemaField("name", "STRING"),
bigquery.SchemaField("age", "INTEGER"),
]
load_job = client.load_table_from_json([
{"name": "Adam", "age": 46}
], "swast-scratch.my_dataset.names_with_policy_tags",
job_config=load_job_config)
load_job.result()

Trying to set the policy tags to exactly the value they already have still fails, though:

from google.cloud import bigquery
client = bigquery.Client()
load_job_config = bigquery.LoadJobConfig()
load_job_config.schema = [
bigquery.SchemaField.from_api_repr({
"name": "name",
"type": "STRING",
"mode": "NULLABLE",
"policyTags": {
"names": [
"projects/swast-scratch/locations/us/taxonomies/5039374947791552873/policyTags/5641276241825860924"
]
}
}),
bigquery.SchemaField("age", "INTEGER"),
]
load_job = client.load_table_from_json([
{"name": "Adam", "age": 46}
], "swast-scratch.my_dataset.names_with_policy_tags",
job_config=load_job_config)
load_job.result()

Error:
I'd still be more comfortable if we omitted the |
Uh oh, I can confirm that this code sample accidentally unsets the policy tags, which is very bad.

from google.cloud import bigquery
client = bigquery.Client()
table = bigquery.Table("swast-scratch.my_dataset.names_with_policy_tags")
table.schema = [
bigquery.SchemaField("name", "STRING"),
bigquery.SchemaField("age", "INTEGER"),
]
client.update_table(table, ["schema"])

Compare with:

from google.cloud import bigquery
client = bigquery.Client()
table = bigquery.Table("swast-scratch.my_dataset.names_with_policy_tags")
table._properties["schema"] = {
"fields": [
{"name": "name", "type": "STRING"},
{"name": "age", "type": "INTEGER"},
]
}
client.update_table(table, ["schema"])

which does not reset
From some more manual testing with #983, for table update there isn't a difference between explicit That said, I think it still makes sense to disambiguate explicit |
Re: "Add a system test to avoid future regressions": this is a bit difficult, as it requires some external "taxonomy" resources in Data Catalog. Instead, I've added comments to the relevant unit tests to describe the expected behavior and hopefully avoid regressions.
…gs (#983) Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly: - [ ] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-bigquery/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea - [ ] Ensure the tests and linter pass - [ ] Code coverage does not decrease (if any source code was changed) - [ ] Appropriate docs were updated (if necessary) Fixes #981 Fixes #982 Towards googleapis/python-bigquery-pandas#387 🦕
As seen in internal bug 182204971, #557, if the policyTags key is included in a load job, the user can get 403 POST ... User does not have permission to get taxonomy ... errors when uploading data, even if they have write access to the table.

Based on the behavior seen in googleapis/python-bigquery-pandas#387, I believe this fix has accidentally been reverted in #703 (comment).
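Until the library itself stops sending the key, a caller-side workaround could be to drop policyTags from a schema resource before building a load job from it. The following is only a minimal sketch on plain API-style dictionaries: strip_policy_tags is a hypothetical helper, not part of google-cloud-bigquery, and it assumes that nested RECORD columns keep their sub-columns under a "fields" key, as in the REST schema representation.

```python
import copy


def strip_policy_tags(fields):
    """Return a deep copy of API-style schema fields with no "policyTags" key.

    Hypothetical helper for illustration; not part of google-cloud-bigquery.
    """
    cleaned = []
    for field in fields:
        field = copy.deepcopy(field)  # never mutate the caller's dictionaries
        field.pop("policyTags", None)  # absent key: the load job says nothing about tags
        # Assumption: nested RECORD columns list their sub-columns under "fields".
        if "fields" in field:
            field["fields"] = strip_policy_tags(field["fields"])
        cleaned.append(field)
    return cleaned


schema_resource = [
    {"name": "name", "type": "STRING", "policyTags": {"names": []}},
    {"name": "age", "type": "INTEGER"},
]
load_schema = strip_policy_tags(schema_resource)
```

The result could then be turned back into schema objects with bigquery.SchemaField.from_api_repr, which the samples above already use. Whether that round trip leaves the key out of the final request depends on the installed library version, so it is worth checking the request body in a test.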
TODO:
- Add a system test to avoid future regressions
- SchemaField to allow disambiguating unset vs explicitly set to empty list / none.
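One common way to tell "never set" apart from "explicitly set to empty" is a private sentinel default. The sketch below is not the SchemaField implementation and not necessarily what #983 does; FieldSketch and _UNSET are hypothetical names, and treating None as an explicit request to clear the tags is an assumption made for illustration.

```python
_UNSET = object()  # sentinel: the caller never mentioned policy tags


class FieldSketch:
    """Hypothetical stand-in for a schema field; not the library's SchemaField."""

    def __init__(self, name, field_type, mode="NULLABLE", policy_tags=_UNSET):
        self.name = name
        self.field_type = field_type
        self.mode = mode
        self.policy_tags = policy_tags

    def to_api_repr(self):
        resource = {"name": self.name, "type": self.field_type, "mode": self.mode}
        if self.policy_tags is not _UNSET:
            # Assumption: None and an empty sequence both mean "clear the tags".
            names = [] if self.policy_tags is None else list(self.policy_tags)
            resource["policyTags"] = {"names": names}
        return resource


missing = FieldSketch("name", "STRING").to_api_repr()
cleared = FieldSketch("name", "STRING", policy_tags=()).to_api_repr()
```

Here missing has no policyTags key at all, which is the shape a load job would want, while cleared carries an explicit empty names list for callers who really do intend to remove tags.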