Support INTERVAL data type in DB-API, arrow, and pandas connectors #836

@tswast

Follow-up to #826, since pandas and Arrow do not yet have a structured type that aligns with INTERVAL. The existing Timedelta support would work for INTERVAL values that have only a time component, but Timedelta is not calendar-aware, so supporting year, month, and day intervals would require an approximate mapping to timedelta, which is not ideal.

Why is a new data type needed?

  • YEAR: Leap years are a thing. Not every year is 365 days long.
  • MONTH: Not every month is the same length.
  • DAY: Daylight saving time is a thing. Not every day is 24 hours long.
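
To make the three points above concrete, here is a small sketch using only the Python standard library. It is illustrative rather than part of the proposed change. The fixed UTC offsets stand in for the America/New_York zone around the 2021-03-14 DST transition; they are hard-coded here as an assumption so the example does not depend on a time zone database.

```python
import calendar
from datetime import date, datetime, timedelta, timezone

# YEAR: 2020 is a leap year, so a fixed 365-day "year" falls one day short.
start = date(2020, 1, 1)
fixed_year = start + timedelta(days=365)            # lands on 2020-12-31
calendar_year = start.replace(year=start.year + 1)  # lands on 2021-01-01

# MONTH: month lengths differ, so "1 month" has no single timedelta value.
feb_days = calendar.monthrange(2021, 2)[1]
mar_days = calendar.monthrange(2021, 3)[1]

# DAY: across a spring-forward DST transition, midnight to midnight is 23 hours.
# Assumed offsets: UTC-5 (standard time) before the change, UTC-4 after it.
standard = timezone(timedelta(hours=-5))
daylight = timezone(timedelta(hours=-4))
midnight_before = datetime(2021, 3, 14, 0, 0, tzinfo=standard)
midnight_after = datetime(2021, 3, 15, 0, 0, tzinfo=daylight)
dst_day_length = midnight_after - midnight_before

print(fixed_year, calendar_year)  # the two dates differ by one day
print(feb_days, mar_days)         # different month lengths
print(dst_day_length)             # shorter than 24 hours
```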

Note: DB-API support is included here because it uses the BigQuery Storage API, where we use the Arrow wire format.

TODO:

  • Auto-detect data type in DB-API query parameters
    • Might be possible to do this before reading INTERVAL columns is supported.
  • Row data is converted to relevant type in DB-API
  • Row data is converted to relevant type in to_dataframe
    • Might need to be object, since timedelta64 doesn't have years/months.
  • Check if to_arrow type is expected datatype
  • Convert data type in insert_rows_from_dataframe
  • Convert data type in load_rows_from_dataframe (CSV)
  • Convert data type in load_rows_from_dataframe (Parquet)
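
As a rough illustration of why an object-typed value may be needed (see the `to_dataframe` item above), the sketch below keeps months, days, and sub-day time as separate fields instead of collapsing them into one duration. `CalendarInterval` and `add_to` are hypothetical names invented for this example, not an existing API in this library, pandas, or Arrow. Clamping to the last day of the target month is also an assumption made for the sketch.

```python
import calendar
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CalendarInterval:
    """Hypothetical calendar-aware interval; years are stored as 12 months."""

    months: int = 0
    days: int = 0
    microseconds: int = 0

    def add_to(self, value: datetime) -> datetime:
        # Shift by whole months first, using a zero-based month counter.
        month_index = value.year * 12 + (value.month - 1) + self.months
        year, month_zero_based = divmod(month_index, 12)
        month = month_zero_based + 1
        # Assumption: clamp the day when the target month is shorter.
        day = min(value.day, calendar.monthrange(year, month)[1])
        shifted = value.replace(year=year, month=month, day=day)
        # Days and sub-day time are applied as a plain duration.
        return shifted + timedelta(days=self.days, microseconds=self.microseconds)


one_month = CalendarInterval(months=1)
one_year = CalendarInterval(months=12)

print(one_month.add_to(datetime(2021, 1, 31)))
print(one_year.add_to(datetime(2020, 2, 29)))
```

A single `timedelta` cannot represent `one_month` or `one_year` here, because the number of days each one covers depends on the date it is added to.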

Labels

  • api: bigquery (Issues related to the googleapis/python-bigquery API.)
  • external (This issue is blocked on a bug with the actual product.)
  • status: blocked (Resolving the issue is dependent on other work.)
  • type: feature request ("Nice-to-have" improvement, new feature or different behavior or design.)
