Description
The method load_table_from_json() shows up in a number of Issues and PRs as a source of problems or ambiguity.
Examine this code and determine whether it (and the similar methods load_table_from_*) should be refactored to remove issues and concerns and/or improve usability.
Some of the issues/PRs where load_table_from_json() or similar methods are mentioned include:
ISSUES:
Schema and Autodetect
- load_table_from_dataframe should honor default project and dataset. #843
- Autodetect for BQ happening automatically even with schema defined. #847
- Inability to cast int to string when appending data to table using load_table_from_json #906
- load_table_from_json interpolates string as int #1228
- BigQuery seems to automatically convert STRING to BYTES if STRING > 186 bytes #1563
- ALLOW_FIELD_ADDITION not working #1095
- Allow load_table_from_dataframe to Ignore Extra Schema Fields #1812
Retries
- feat: add job_retry argument to load_table_from_uri #969
- Remove num_retries parameter from load_table_from_*() methods #1071
Null Values
- load_table_from_dataframe does not error out when nan in a required column - Million dollar bug #1692
- bigquery load_table_from_dataframe from string type will show null values #1737
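To make the null-value concern concrete, here is a minimal sketch of a client-side check that could run before a load. It is not part of the library: find_missing_required and the sample rows are hypothetical and use only the standard library. It assumes rows are plain dicts and that the caller already knows which fields are REQUIRED.

```python
import math


def find_missing_required(rows, required_fields):
    """Return (row_index, field) pairs where a required field is absent, None, or NaN.

    Hypothetical helper for illustration; not a google-cloud-bigquery API.
    """
    problems = []
    for index, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            # NaN is a float that is not equal to itself; math.isnan makes the intent explicit.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                problems.append((index, field))
    return problems


rows = [
    {"id": 1, "score": 0.5},
    {"id": 2, "score": float("nan")},  # NaN in a required column
    {"id": 3},                         # required field missing entirely
]

print(find_missing_required(rows, ["id", "score"]))
```

A check along these lines would surface the bad rows before any load job is submitted. Whether it belongs in the client or in user code is one of the questions for the refactoring.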
Default Value
Misc
- some emoji will become wrong character stored in bigquery while using load_table_from_json #864
- disambiguate missing policy tags from explicitly unset policy tags #981
- Unable to load table from dataframe with overlapping index/column name #1543
- Provide json.dumps kwargs to load_table_from_json #1564
- Client.load_table_from_dataframe() sometimes chooses invalid column type #1650
- Support pyarrow.large_* as column type in dataframe upload/download #1706
- load_table_toDataframe breaks with Arrow list fields when the list is backed by a ChunkedArray. #1808
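For the encoding and json.dumps items above (#864, #1564), the following sketch shows how serialization options change the bytes that would be uploaded. It assumes rows are turned into newline-delimited JSON with json.dumps. rows_to_ndjson is a hypothetical stand-in written for this illustration, not the library's actual code path.

```python
import json


def rows_to_ndjson(rows, **dumps_kwargs):
    """Serialize an iterable of dict rows to newline-delimited JSON.

    Hypothetical helper; extra keyword arguments are forwarded to json.dumps.
    """
    return "\n".join(json.dumps(row, **dumps_kwargs) for row in rows)


rows = [{"comment": "nice 😀", "count": 1}]

# Default json.dumps behavior: non-ASCII characters become \uXXXX escapes
# (a surrogate pair for characters outside the Basic Multilingual Plane).
escaped = rows_to_ndjson(rows)

# With ensure_ascii=False the character is written literally.
literal = rows_to_ndjson(rows, ensure_ascii=False)

print(escaped)
print(literal)

# Both forms decode back to the same Python value.
assert json.loads(escaped) == json.loads(literal) == rows[0]
```

Both outputs are valid JSON and decode identically, so any difference seen downstream would come from how the receiving side handles escapes versus literal UTF-8. Exposing the json.dumps keyword arguments, as #1564 requests, would let callers choose.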