feat!: Use pandas custom data types for BigQuery DATE and TIME columns, remove date_as_object argument
#972
Merged
87 commits
849f2c0
feat: add pandas time arrays to hold BigQuery TIME data
jimfulton 99b43b6
remove commented/aborted base class and simplify super call in __init__
jimfulton f722570
dtypes tests pass with pandas 0.24.2
jimfulton 724f62f
blacken and lint
jimfulton 9b03449
Added DateArray and generalized tests to handle dates and times
jimfulton 6af561d
blacken
jimfulton 7fc299c
handle bad values passed to array construction
jimfulton 869f5f7
Add null/None handling
jimfulton 5505069
Added repeat and copy tests
jimfulton 15ca7e1
don't use extract_array and make _from_sequence_of_strings an alias …
jimfulton 9392cb4
Summary test copying ndarray arg
jimfulton 04c1e6a
expand construction test to test calling class directly and calling _…
jimfulton f59169c
blacken
jimfulton c33f608
test size and shape
jimfulton 41bbde6
Enable properties for pandas 1.3 and later
jimfulton e9ed1c4
Test more pandas versions.
jimfulton d6d81fe
Updated import_default with an option to force use of the default
jimfulton a8697f3
simplified version-checking code
jimfulton 8ea43c5
_from_factorized
jimfulton 2f37929
fix small DRY violation for parametrization by date and time dtypes
jimfulton ad0e3c0
isna
jimfulton c1ebb5c
take
jimfulton eaa2e96
_concat_same_type
jimfulton 82a2d84
fixed __getitem__ to handle array indexes
jimfulton 6f4178f
test __getitem__ w array index and dropna
jimfulton d8818e0
fix assignment with array keys and fillna test
jimfulton 6bfb75b
reminder (for someone :) ) to override some abstract implementations …
jimfulton 3364bdd
unique test
jimfulton 661c6b2
test argsort
jimfulton ac7330c
fix version in constraint
jimfulton 9b7a72c
blacken/lint
jimfulton 47b0756
stop fighting the framework and store as ns
jimfulton 48f2e11
blacken/lint
jimfulton b1025b7
Implement astype to fix Python 3.7 failure
jimfulton 7acdb05
test assigning None
jimfulton 731634e
test astype self type w copy
jimfulton 903e23c
test a concatenation dtype sanity check
jimfulton e52c65d
Added conversion of date to datetime
jimfulton 74ef1b0
convert times to time deltas
jimfulton 91d9e2b
Use new pandas date and time dtypes
jimfulton 517307c
Get rid of date_as_object argument
jimfulton 711cfaf
fixed a comment
jimfulton 7cdea07
added *unit* test for dealing with dates and timestamps that can't f…
jimfulton a20f67b
Removed brittle hack that enabled series properties.
jimfulton 96ed76a
Add note on possible zero-copy optimization for dates
jimfulton 159c202
Implemented any, all, min, max and median
jimfulton a81b26e
make pytype happy
jimfulton 98b3603
test (and fix) load from dataframe with date and time columns
jimfulton 4f71c9c
Make sure insert_rows_from_dataframe works
jimfulton 5c25ba4
Renamed date and time dtypes to bqdate and bqtime
jimfulton c6dabe2
make fallback date and time dtype names strings to make pytype happy
jimfulton 7021601
date and time arrays implement __arrow_array__
jimfulton 2585dbc
Document new dtypes
jimfulton 77c1c9e
blacken/lint
jimfulton 4261b80
Make conversion of date columns from arrow pandas output to pandas ze…
jimfulton 2671718
Added date math support
jimfulton 36eb58c
fix end tag
jimfulton a8d0cb0
fixed snippet ref
jimfulton 3eabca3
Support date math with DateOffset scalars
jimfulton b81b996
use db-dtypes
jimfulton dad0d36
Include db-dtypes in sample requirements
jimfulton 35df752
Updated docs and snippets for db-dtypes
jimfulton ba776f6
gaaaa, missed a bqdate
jimfulton 10555e1
Merge branch 'v3' into dtypes-v3
parthea fc451f5
Merge remote-tracking branch 'upstream/v3' into dtypes-v3
tswast 08d1b70
move db-dtypes samples to db-dtypes docs
tswast d3b57e0
update to work with db-dtypes 0.2.0+
tswast 0442789
update dtype names in system tests
tswast c7ff18e
comment with link to arrow data types
tswast b99fa5f
update db-dtypes version
tswast 8253559
experiment with direct pyarrow to extension array conversion
tswast c8b5d67
Merge branch 'v3' into dtypes-v3
tswast 4a6ec3e
use types mapper for dbdate
tswast 4d5d229
Merge branch 'dtypes-v3' of github.com:jimfulton/python-bigquery into…
tswast 3953ead
fix constraints
tswast c4a1f2c
use types mapper where possible
tswast d7e7c5b
always use types mapper
tswast eec4103
adjust unit tests to use arrow not avro
tswast 69deb6f
avoid "ValueError: need at least one array to concatenate" with empty…
tswast 732fe86
link to pandas issue
tswast 2cd97a1
remove unnecessary variable
tswast 852d5a8
Update tests/unit/job/test_query_pandas.py
tswast ea5d254
Update tests/unit/job/test_query_pandas.py
tswast 1d8adb1
add missing db-dtypes requirement
tswast ef8847d
Merge branch 'dtypes-v3' of github.com:jimfulton/python-bigquery into…
tswast 25a8f75
avoid arrow_schema on older versions of bqstorage
tswast 0d81fa1
Merge remote-tracking branch 'upstream/v3' into dtypes-v3
tswast
Rather than a special check for the date dtype, I wonder if we can use the `types_mapper` parameter on `to_pandas`? https://arrow.apache.org/docs/python/generated/pyarrow.RecordBatch.html#pyarrow.RecordBatch.to_pandas
Yes, it should be possible, but we'll need some additional work on the db-dtypes package. googleapis/python-db-dtypes-pandas#38