Releases: googleapis/python-bigquery-dataframes

v2.4.0

12 May 21:59
7961681

2.4.0 (2025-05-12)

Features

  • Add "dayofyear" property for dt accessors (#1692) (9d4a59d)
  • Add .dt.days, .dt.seconds, .dt.microseconds, and .dt.total_seconds() for timedelta series (#1713) (2b3a45f)
  • Add DatetimeIndex class (#1719) (c3c830c)
  • Add isocalendar() for dt accessor (#1717) (0479763)
  • Add bigframes.bigquery.json_value (#1697) (46a9c53)
  • Add blob.exif function support (#1703) (3f79528)
  • Add inplace arg support to sort methods (#1710) (d1ccb52)
  • Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
  • Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
  • Support () operator between timedeltas (#1702) (edaac89)
  • Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
  • Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)
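
The new datetime, timedelta, and string features above follow pandas naming, which in turn mirrors Python's standard library. As a rough illustration of the semantics only, the sketch below uses plain `datetime` and `str` objects rather than BigQuery DataFrames; it assumes the series accessors behave element-wise like their standard-library counterparts, which these notes do not confirm.

```python
from datetime import date, timedelta

# Standard-library counterparts of the new accessors (illustration only,
# not BigQuery DataFrames code).
d = date(2025, 5, 12)
day_of_year = d.timetuple().tm_yday   # counterpart of .dt.dayofyear -> 132
iso = d.isocalendar()                 # (ISO year, ISO week, ISO weekday)

# A timedelta is stored as days, seconds (0..86399) and microseconds.
td = timedelta(days=1, hours=2, seconds=3, microseconds=4)
components = (td.days, td.seconds, td.microseconds)   # (1, 7203, 4)
total = td.total_seconds()                            # whole duration in seconds

# Negative durations are normalized so that only `days` is negative.
neg = timedelta(seconds=-1)           # days == -1, seconds == 86399

# str.strip accepts a set of characters to remove from both ends.
stripped = "xxabcxx".strip("x")       # "abc"
```

Note the normalization of negative durations: reading `days` or `seconds` alone can be misleading, so `total_seconds()` is usually the safer way to compare durations.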

Bug Fixes

  • Fix dayofyear doc test (#1701) (9b777a0)
  • Fix issues with chunked arrow data (#1700) (e3289b7)
  • Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() (#1691) (8ec6079)

Documentation

  • Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
  • Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
  • Include import statement in the bigframes code snippet (#1699) (08d70b6)
  • Include the clean-up step in the udf code snippet (#1698) (48992e2)
  • Move multimodal notebook out of experimental folder (#1712) (68b6532)
  • Update blob_display option in snippets (#1714) (8b30143)

v2.3.0

06 May 17:16
1ed9d46

2.3.0 (2025-05-06)

Features

  • Add dry_run parameter to read_gbq(), read_gbq_table() and read_gbq_query() (#1674) (4c5dee5)

Bug Fixes

  • Guarantee guid thread safety across threads (#1684) (cb0267d)
  • Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
  • Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)
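
The thread-safety fix above concerns generating unique identifiers from several threads at once. The notes do not show how the library does it; the following is a minimal sketch of one common approach, a counter guarded by a lock. `GuidGenerator`, `next_id`, and `collect_ids` are hypothetical names used only for illustration.

```python
import itertools
import threading


class GuidGenerator:
    """Hypothetical stand-in, not the library's implementation."""

    def __init__(self, prefix="col_"):
        self._prefix = prefix
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def next_id(self):
        # The lock makes "read counter, advance counter" a single atomic step.
        with self._lock:
            return f"{self._prefix}{next(self._counter)}"


def collect_ids(gen, n_threads=8, per_thread=1000):
    """Request ids from several threads and return all of them."""
    results = []
    results_lock = threading.Lock()

    def worker():
        local = [gen.next_id() for _ in range(per_thread)]
        with results_lock:
            results.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


ids = collect_ids(GuidGenerator())
```

Every id in `ids` should be distinct, regardless of how the threads interleave.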

Performance Improvements

  • to_datetime() now avoids caching inputs unless data is inspected to infer format (#1667) (dd08857)

Documentation

  • Add a visualization notebook to BigFrame samples (#1675) (ee062bf)
  • Fix spacing of k-means code snippet (#1687) (99f45dd)
  • Update snippet for Create a k-means model tutorial (#1664) (761c364)

v2.2.0

01 May 00:00
f3fd7e2

2.2.0 (2025-04-30)

Features

  • Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
  • Add GeminiTextGenerator.predict structured output (#1653) (6199023)
  • DataFrame.__getitem__ support for slice input (#1668) (563f0cb)
  • Print the correct origin of PreviewWarning for bpd.udf (#1629) (48d10d1)
  • Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
  • Short circuit query for local scan (#1618) (e84f232)
  • Support names parameter in read_csv for bigquery engine (#1659) (3388191)
  • Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
  • Support write api as loading option (#1617) (c46ad06)

Bug Fixes

  • DataFrame accessors are not populated (#1639) (28afa2c)
  • Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
  • Remove itertools.pairwise usage (#1638) (9662745)
  • Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
  • Fix several typos (#1655) (cd7fbde)
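
On the `itertools.pairwise` removal above: that function only exists in Python 3.10 and later, so one plausible motivation (an assumption, since the notes do not state it) is compatibility with older interpreters. A minimal equivalent, based on the well-known `tee` recipe and not necessarily what the library uses, looks like this:

```python
import itertools


def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)


pairs = list(pairwise([1, 2, 3, 4]))  # [(1, 2), (2, 3), (3, 4)]
```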

Documentation

  • Add JSON data types notebook (#1647) (9128c4a)
  • Add sample code snippets for udf (#1649) (53caa8d)
  • Fix bq_dataframes_template notebook to work if partial ordering mode is enabled (#1665) (f442e7a)
  • Note that udf is in preview and must be python 3.11 compatible (#1629) (48d10d1)

v2.1.0

22 Apr 16:38
8713950

2.1.0 (2025-04-22)

Features

  • Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
  • Enable local json string validations (#1614) (233347a)
  • Enhance read_csv index_col parameter support (#1631) (f4e5b26)

Bug Fixes

  • Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
  • Improve robustness of managed udf code extraction (#1634) (8cc56d5)

Documentation

  • Add code samples in the udf API docstring (#1632) (f68b80c)

v2.0.0

17 Apr 19:46
881e4f0

2.0.0 (2025-04-17)

⚠ BREAKING CHANGES

  • make dataset and name params mandatory in udf (#1619)
  • Locational endpoints support is not available in BigFrames 2.0.
  • change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
  • change default ingress setting for remote_function to internal-only (#1544)
  • make remote_function params keyword only (#1537)
  • make remote_function default service account explicit (#1537)
  • set allow_large_results=False by default (#1541)
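
For readers unfamiliar with keyword-only parameters, the sketch below shows what such a change means for callers in general Python terms. `deploy_function` and its `reuse` and `ingress` parameters are hypothetical stand-ins, not the real `remote_function` signature; only the idea that `input_types`, `output_type`, and `dataset` stay positional comes from these notes.

```python
def deploy_function(input_types=None, output_type=None, dataset=None, *,
                    reuse=True, ingress="internal-only"):
    """Hypothetical function: everything after the bare `*` is keyword-only."""
    return {
        "input_types": input_types,
        "output_type": output_type,
        "dataset": dataset,
        "reuse": reuse,
        "ingress": ingress,
    }


# The first three arguments may still be passed positionally.
ok = deploy_function([int], float, "my_dataset", reuse=False)

# Passing a keyword-only argument positionally raises TypeError.
error = None
try:
    deploy_function([int], float, "my_dataset", False)
except TypeError as exc:
    error = exc
```

Code that passed such options positionally needs to be updated to name them explicitly.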

Features

  • Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
  • Add component to manage temporary tables (#1559) (0a4e245)
  • Add Series.to_pandas_batches() method (#1592) (09ce979)
  • Add support for creating a Matrix Factorization model (#1330) (b5297f9)
  • Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
  • Allow pandas.cut 'labels' parameter to accept a list of string (#1549) (af842b1)
  • Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
  • Detect duplicate column/index names in read_gbq before sending the query (#1615) (40d6960)
  • Drop support for locational endpoints (#1542) (4bf2e43)
  • Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
  • Improve local data validation (#1598) (815e471)
  • Make remote_function default service account explicit (#1537) (9eb9089)
  • Set allow_large_results=False by default (#1541) (e9fb712)
  • Support bigquery connection in managed function (#1554) (f6f697a)
  • Support bq connection path format (#1550) (e7eb918)
  • Support gemini-2.0-X models (#1558) (3104fab)
  • Support inlining small list, struct, json data (#1589) (2ce891f)
  • Support time range rolling on Series. (#1590) (6e98a2c)
  • Use session temp tables for all ephemeral storage (#1569) (9711b83)
  • Use validated local storage for data uploads (#1612) (aee4159)
  • Warn about the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)

Bug Fixes

  • to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
  • Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
  • Include role and service account in IAM exception (#1564) (8c50755)
  • Make dataset and name params mandatory in udf (#1619) (637e860)
  • pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
  • Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
  • read_csv supports tilde local paths and includes the index for the bigquery_stream write engine (#1580) (352e8e4)
  • Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
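
The two `to_pandas_batches()` fixes above concern how `page_size` and `max_results` interact. The sketch below illustrates the intended contract in plain Python, under the assumption that `page_size` caps the rows per batch and `max_results` caps the total rows. `iter_batches` is a hypothetical helper, not the library's code.

```python
from itertools import islice


def iter_batches(rows, page_size, max_results=None):
    """Yield lists of at most `page_size` rows, stopping after `max_results` rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    it = iter(rows)
    if max_results is not None:
        it = islice(it, max_results)  # cap the total number of rows
    while True:
        batch = list(islice(it, page_size))
        if not batch:
            return
        yield batch


# Both limits set: the last batch is cut short by max_results.
limited = list(iter_batches(range(10), page_size=4, max_results=9))
# -> [[0, 1, 2, 3], [4, 5, 6, 7], [8]]

# max_results unset: page_size alone still controls the batch size.
unlimited = list(iter_batches(range(5), page_size=2))
# -> [[0, 1], [2, 3], [4]]
```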

Performance Improvements

  • Directly read gbq table for simple plans (#1607) (6ad38e8)

v2.0.0.dev0

31 Mar 13:41
Pre-release

2.0.0.dev0 (2025-03-31)

⚠ BREAKING CHANGES

  • Locational endpoints support is not available in BigFrames 2.0.
  • change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
  • change default ingress setting for remote_function to internal-only (#1544)
  • make remote_function params keyword only (#1537)
  • make remote_function default service account explicit (#1537)
  • set allow_large_results=False by default (#1541)

Features

  • Add component to manage temporary tables (#1559) (0a4e245)
  • Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
  • Allow pandas.cut 'labels' parameter to accept a list of string (#1549) (af842b1)
  • Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
  • Drop support for locational endpoints (#1542) (4bf2e43)
  • Make remote_function default service account explicit (#1537) (9eb9089)
  • Set allow_large_results=False by default (#1541) (e9fb712)
  • Support bigquery connection in managed function (#1554) (f6f697a)
  • Support bq connection path format (#1550) (e7eb918)
  • Support gemini-2.0-X models (#1558) (3104fab)

Bug Fixes

  • Include role and service account in IAM exception (#1564) (8c50755)
  • pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
  • Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)

Documentation

  • Add message to remove default model for version 3.0 (#1563) (910be2b)
  • Add warning for bigframes 2.0 (#1557) (3f0eaa1)
  • Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
  • Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)

Miscellaneous Chores

  • Make remote_function params keyword only (#1537) (9eb9089)

v1.42.0

27 Mar 07:46
b6b82ec

1.42.0 (2025-03-27)

Features

  • Add closed parameter in rolling() (#1539) (8bcc89b)
  • Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
  • Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
  • Add df.take and series.take (#1509) (7d00be6)
  • Add Linear_Regression.global_explain() (#1446) (7e5b6a8)
  • Allow iloc to support lists of negative indices (#1497) (a9cf215)
  • Support dry_run in to_pandas() (#1436) (75fc7e0)
  • Support window partition by geo column (#1512) (bdcb1e7)
  • Upgrade BQ managed udf to preview (#1536) (4a7fe4d)

Bug Fixes

  • Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
  • Change the default value for pdf extract/chunk (#1517) (a70a607)
  • Local data always has sequential index (#1514) (014bd33)
  • read_pandas inline returns None when it exceeds the limit (#1525) (578081e)
  • Temporary fix for the backend bug that prevents StreamingDataFrame from working (#1533) (6ab4ffd)
  • Tolerate BQ connection service account propagation delay (#1505) (6681f1f)

Documentation

  • Update GeoSeries.difference() and bigframes.bigquery.st_difference() docs (#1526) (d553fa2)

v1.41.0

19 Mar 19:38
0cdc874

1.41.0 (2025-03-19)

Features

  • Add support for the 'right' parameter in 'pandas.cut' (#1496) (8aff128)
  • Support BQ managed functions through read_gbq_function (#1476) (802183d)
  • Warn when the BigFrames version is more than a year old (#1455) (00e0750)
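
A version-age warning like the one listed above can be sketched with the standard library alone. This is an illustration under assumptions: `warn_if_stale`, the 365-day threshold, and the message text are hypothetical, and the notes do not show how the library determines its own release date.

```python
import datetime
import warnings

MAX_AGE = datetime.timedelta(days=365)  # assumed meaning of "a year"


def warn_if_stale(release_date, today=None):
    """Warn and return True when `release_date` is more than MAX_AGE ago."""
    today = today or datetime.date.today()
    age = today - release_date
    if age > MAX_AGE:
        warnings.warn(
            f"This release is {age.days} days old; consider upgrading.",
            stacklevel=2,
        )
        return True
    return False


# Capture warnings so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stale = warn_if_stale(datetime.date(2024, 1, 1),
                          today=datetime.date(2025, 3, 19))
    fresh = warn_if_stale(datetime.date(2025, 3, 1),
                          today=datetime.date(2025, 3, 19))
```

Accepting `today` as a parameter keeps the check deterministic in tests.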

Bug Fixes

  • Fix pandas.cut errors with empty bins (#1499) (434fb5d)
  • Fix read_gbq with ORDER BY query and index_col set (#963) (de46d2f)

v1.40.0

11 Mar 23:15
5273d36

1.40.0 (2025-03-11)

⚠ BREAKING CHANGES

  • reading JSON data as a custom arrow extension type (#1458)

Features

  • Reading JSON data as a custom arrow extension type (#1458) (e720f41)
  • Support list output for managed function (#1457) (461e9e0)

Bug Fixes

  • Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
  • Fix the merge issue between 1424 and 1373 (#1461) (7b6e361)
  • Use == instead of is for timedelta type equality checks (#1480) (0db248b)
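
The switch from `is` to `==` in the timedelta fix above matters because two type objects can describe the same type without being the same object in memory. The sketch below shows the general pitfall with a hypothetical `DurationType` class; it is not the library's type system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurationType:
    """Hypothetical parametrized type; dataclasses compare by field values."""
    unit: str


a = DurationType("us")
b = DurationType("us")

same_value = a == b    # True: both describe a microsecond duration
same_object = a is b   # False: they are two separate instances
```

An identity check only passes when both sides come from the same cached instance, so it can silently fail for types constructed independently.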

Performance Improvements

  • Compilation no longer bounded by recursion (#1464) (27ab028)
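
The notes do not describe how compilation was freed from the recursion limit. One general technique for this kind of problem, shown here purely as an illustration, is to replace recursive tree traversal with an explicit stack. `Node` and `linearize` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical plan node with zero or more child nodes."""
    name: str
    children: list = field(default_factory=list)


def linearize(root):
    """Return node names with every child listed before its parent."""
    stack = [root]
    order = []
    while stack:
        node = stack.pop()
        order.append(node.name)      # parent is recorded before its children
        stack.extend(node.children)
    order.reverse()                  # reversing puts children before parents
    return order


# Build a chain far deeper than Python's default recursion limit of 1000.
root = Node("leaf")
for i in range(10_000):
    root = Node(f"op{i}", [root])

steps = linearize(root)
```

Because the pending work lives in a list rather than on the call stack, the depth of the tree is limited only by available memory.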

v1.39.0

05 Mar 20:03
c928920

1.39.0 (2025-03-05)

Features

  • (Preview) Support diff() for date series (#1423) (521e987)
  • (Preview) Support aggregations over timedeltas (#1418) (1251ded)
  • (Preview) Support arithmetic between dates and timedeltas (#1413) (962b152)
  • (Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
  • Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
  • Add GeoSeries.boundary() (#1435) (32cddfe)
  • Add allow_large_results to peek (#1448) (67487b9)
  • Add groupby.rank() (#1433) (3a633d5)
  • Support selecting multiple columns with iloc (#1437) (ddfd02a)
  • Support interface for BigQuery managed functions (#1373) (2bbf53f)
  • Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)

Bug Fixes

  • Do not compare schema description during schema validation (#1452) (03a3a56)
  • Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
  • Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
  • Window operations over JSON columns (#1451) (0070e77)
  • Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)

Documentation

  • Add snippet for explaining the linear regression model prediction (#1427) (7c37c7d)