DataFusion Python Changelog

36.0.0 (2024-03-02)

Implemented enhancements:

  • feat: Add flatten array function #562 (mobley-trent)

Documentation updates:

  • docs: Add ASF attribution #580 (simicd)

Merged pull requests:

  • Allow PyDataFrame to be used from other projects #582 (andygrove)
  • docs: Add ASF attribution #580 (simicd)
  • Add array functions #560 (ongchi)
  • feat: Add flatten array function #562 (mobley-trent)

35.0.0 (2024-01-20)

Merged pull requests:

  • build(deps): bump syn from 2.0.41 to 2.0.43 #559 (dependabot[bot])
  • build(deps): bump tokio from 1.35.0 to 1.35.1 #558 (dependabot[bot])
  • build(deps): bump async-trait from 0.1.74 to 0.1.77 #556 (dependabot[bot])
  • build(deps): bump pyo3 from 0.20.0 to 0.20.2 #557 (dependabot[bot])

34.0.0 (2023-12-28)

Merged pull requests:

  • Adjust visibility of crate private members & Functions #537 (jdye64)
  • Update json.rst #538 (ray-andrew)
  • Enable mimalloc local_dynamic_tls feature #540 (jdye64)
  • Enable substrait feature to be built by default in CI, for nightlies … #544 (jdye64)

33.0.0 (2023-11-16)

Merged pull requests:

  • First pass at getting architectured builds working #350 (charlesbluca)
  • Remove libprotobuf dep #527 (jdye64)

32.0.0 (2023-10-21)

Implemented enhancements:

  • feat: expose PyWindowFrame #509 (dlovell)
  • add Binary String Functions;encode,decode #494 (jiangzhx)
  • add bit_and,bit_or,bit_xor,bool_add,bool_or #496 (jiangzhx)
  • add first_value last_value #498 (jiangzhx)
  • add regr_* functions #499 (jiangzhx)
  • Add random missing bindings #522 (jdye64)
  • Allow for multiple input files per table instead of a single file #519 (jdye64)
  • Add support for window function bindings #521 (jdye64)

Merged pull requests:

  • Prepare 31.0.0 release #500 (andygrove)
  • Improve release process documentation #505 (andygrove)
  • add Binary String Functions;encode,decode #494 (jiangzhx)
  • build(deps): bump mimalloc from 0.1.38 to 0.1.39 #502 (dependabot[bot])
  • build(deps): bump syn from 2.0.32 to 2.0.35 #503 (dependabot[bot])
  • build(deps): bump syn from 2.0.35 to 2.0.37 #506 (dependabot[bot])
  • Use latest DataFusion #511 (andygrove)
  • add bit_and,bit_or,bit_xor,bool_add,bool_or #496 (jiangzhx)
  • use DataFusion 32 #515 (andygrove)
  • add first_value last_value #498 (jiangzhx)
  • build(deps): bump regex-syntax from 0.7.5 to 0.8.1 #517 (dependabot[bot])
  • build(deps): bump pyo3-build-config from 0.19.2 to 0.20.0 #516 (dependabot[bot])
  • add regr_* functions #499 (jiangzhx)
  • Add random missing bindings #522 (jdye64)
  • build(deps): bump rustix from 0.38.18 to 0.38.19 #523 (dependabot[bot])
  • Allow for multiple input files per table instead of a single file #519 (jdye64)
  • Add support for window function bindings #521 (jdye64)
  • Small clippy fix #524 (andygrove)

31.0.0 (2023-09-12)

Full Changelog

Implemented enhancements:

  • feat: add case function (#447) #448 (mesejo)
  • feat: add compression options #456 (mesejo)
  • feat: add register_json #458 (mesejo)
  • feat: add basic compression configuration to write_parquet #459 (mesejo)
  • feat: add example of reading parquet from s3 #460 (mesejo)
  • feat: add register_avro and read_table #461 (mesejo)
  • feat: add missing scalar math functions #465 (mesejo)

Documentation updates:

  • docs: include pre-commit hooks section in contributor guide #455 (mesejo)

Merged pull requests:

  • Build Linux aarch64 wheel #443 (gokselk)
  • feat: add case function (#447) #448 (mesejo)
  • enhancement(docs): Add user guide (#432) #445 (mesejo)
  • docs: include pre-commit hooks section in contributor guide #455 (mesejo)
  • feat: add compression options #456 (mesejo)
  • Upgrade to DF 28.0.0-rc1 #457 (andygrove)
  • feat: add register_json #458 (mesejo)
  • feat: add basic compression configuration to write_parquet #459 (mesejo)
  • feat: add example of reading parquet from s3 #460 (mesejo)
  • feat: add register_avro and read_table #461 (mesejo)
  • feat: add missing scalar math functions #465 (mesejo)
  • build(deps): bump arduino/setup-protoc from 1 to 2 #452 (dependabot[bot])
  • Revert "build(deps): bump arduino/setup-protoc from 1 to 2 (#452)" #474 (viirya)
  • Minor: fix wrongly copied function description #497 (viirya)
  • Upgrade to Datafusion 31.0.0 #491 (judahrand)
  • Add isnan and iszero #495 (judahrand)

30.0.0

  • Skipped due to a breaking change in DataFusion

29.0.0

  • Skipped

28.0.0 (2023-07-25)

Implemented enhancements:

  • feat: expose offset in python API #437 (cpcloud)

Merged pull requests:

  • File based input utils #433 (jdye64)
  • Upgrade to 28.0.0-rc1 #434 (andygrove)
  • Introduces utility for obtaining SqlTable information from a file like location #398 (jdye64)
  • feat: expose offset in python API #437 (cpcloud)
  • Use DataFusion 28 #439 (andygrove)

27.0.0 (2023-07-03)

Merged pull requests:

  • LogicalPlan.to_variant() make public #412 (jdye64)
  • Prepare 27.0.0 release #423 (andygrove)

26.0.0 (2023-06-11)

Full Changelog

Merged pull requests:

  • Add Expr::Case when_then_else support to rex_call_operands function #388 (jdye64)
  • Introduce BaseSessionContext abstract class #390 (jdye64)
  • CRUD Schema support for BaseSessionContext #392 (jdye64)
  • CRUD Table support for BaseSessionContext #394 (jdye64)

25.0.0 (2023-05-23)

Full Changelog

Merged pull requests:

  • Prepare 24.0.0 Release #376 (andygrove)
  • build(deps): bump uuid from 1.3.1 to 1.3.2 #359 (dependabot[bot])
  • build(deps): bump mimalloc from 0.1.36 to 0.1.37 #361 (dependabot[bot])
  • build(deps): bump regex-syntax from 0.6.29 to 0.7.1 #334 (dependabot[bot])
  • upgrade maturin to 0.15.1 #379 (Jimexist)
  • Expand Expr to include RexType basic support #378 (jdye64)
  • Add Python script for generating changelog #383 (andygrove)

24.0.0 (2023-05-09)

Full Changelog

Documentation updates:

  • Fix link to user guide #354 (andygrove)

Merged pull requests:

  • Add interface to serialize Substrait plans to Python Bytes. #344 (kylebrooks-8451)
  • Add partition_count property to ExecutionPlan. #346 (kylebrooks-8451)
  • Remove unsendable from all Rust pyclass types. #348 (kylebrooks-8451)
  • Fix link to user guide #354 (andygrove)
  • Fix SessionContext execute. #353 (kylebrooks-8451)
  • Pub mod expr in lib.rs #357 (jdye64)
  • Add benchmark derived from TPC-H #355 (andygrove)
  • Add db-benchmark #365 (andygrove)
  • First pass of documentation in mdBook #364 (MrPowers)
  • Add 'pub' and '#[pyo3(get, set)]' to DataTypeMap #371 (jdye64)
  • Fix db-benchmark #369 (andygrove)
  • Docs explaining how to view query plans #373 (andygrove)
  • Improve db-benchmark #372 (andygrove)
  • Make expr member of PyExpr public #375 (jdye64)

23.0.0 (2023-04-23)

Full Changelog

Merged pull requests:

  • Improve API docs, README, and examples for configuring context #321 (andygrove)
  • Osx build linker args #330 (jdye64)
  • Add requirements file for python 3.11 #332 (r4ntix)
  • mac arm64 build #338 (andygrove)
  • Add conda.yaml baseline workflow file #281 (jdye64)
  • Prepare for 23.0.0 release #335 (andygrove)
  • Reuse the Tokio Runtime #341 (kylebrooks-8451)

22.0.0 (2023-04-10)

Full Changelog

Merged pull requests:

  • Fix invalid build yaml #308 (andygrove)
  • Try fix release build #309 (andygrove)
  • Fix release build #310 (andygrove)
  • Enable datafusion-substrait protoc feature, to remove compile-time dependency on protoc #312 (andygrove)
  • Fix Mac/Win release builds in CI #313 (andygrove)
  • install protoc in docs workflow #314 (andygrove)
  • Fix documentation generation in CI #315 (andygrove)
  • Source wheel fix #319 (andygrove)

21.0.0 (2023-03-30)

Full Changelog

Merged pull requests:

  • minor: Fix minor warning on unused import #289 (viirya)
  • feature: Implement describe() method #293 (simicd)
  • fix: Printed results not visible in debugger & notebooks #296 (simicd)
  • add package.include and remove wildcard dependency #295 (andygrove)
  • Update main branch name in docs workflow #303 (andygrove)
  • Upgrade to DF 21 #301 (andygrove)

20.0.0 (2023-03-17)

Full Changelog

Implemented enhancements:

  • Empty relation bindings #208 (jdye64)
  • wrap display_name and canonical_name functions #214 (jdye64)
  • Add PyAlias bindings #216 (jdye64)
  • Add bindings for scalar_variable #218 (jdye64)
  • Bindings for LIKE type expressions #220 (jdye64)
  • Bool expr bindings #223 (jdye64)
  • Between bindings #229 (jdye64)
  • Add bindings for GetIndexedField #227 (jdye64)
  • Add bindings for case, cast, and trycast #232 (jdye64)
  • add remaining expr bindings #233 (jdye64)
  • feature: Additional export methods #236 (simicd)
  • Add Python wrapper for LogicalPlan::Union #240 (iajoiner)
  • feature: Create dataframe from pandas, polars, dictionary, list or pyarrow Table #242 (simicd)
  • Add Python wrappers for LogicalPlan::Join and LogicalPlan::CrossJoin #246 (iajoiner)
  • feature: Set table name from ctx functions #260 (simicd)
  • Explain bindings #264 (jdye64)
  • Extension bindings #266 (jdye64)
  • Subquery alias bindings #269 (jdye64)
  • Create memory table #271 (jdye64)
  • Create view bindings #273 (jdye64)
  • Re-export Datafusion dependencies #277 (jdye64)
  • Distinct bindings #275 (jdye64)
  • Drop table bindings #283 (jdye64)
  • Bindings for LogicalPlan::Repartition #285 (jdye64)
  • Expand Rust return type support for Arrow DataTypes in ScalarValue #287 (jdye64)

Documentation updates:

  • docs: Example of calling Python UDF & UDAF in SQL #258 (simicd)

Merged pull requests:

  • Minor docs updates #210 (andygrove)
  • Empty relation bindings #208 (jdye64)
  • wrap display_name and canonical_name functions #214 (jdye64)
  • Add PyAlias bindings #216 (jdye64)
  • Add bindings for scalar_variable #218 (jdye64)
  • Bindings for LIKE type expressions #220 (jdye64)
  • Bool expr bindings #223 (jdye64)
  • Between bindings #229 (jdye64)
  • Add bindings for GetIndexedField #227 (jdye64)
  • Add bindings for case, cast, and trycast #232 (jdye64)
  • add remaining expr bindings #233 (jdye64)
  • Pre-commit hooks #228 (jdye64)
  • Implement new release process #149 (andygrove)
  • feature: Additional export methods #236 (simicd)
  • Add Python wrapper for LogicalPlan::Union #240 (iajoiner)
  • feature: Create dataframe from pandas, polars, dictionary, list or pyarrow Table #242 (simicd)
  • Fix release instructions #238 (andygrove)
  • Add Python wrappers for LogicalPlan::Join and LogicalPlan::CrossJoin #246 (iajoiner)
  • docs: Example of calling Python UDF & UDAF in SQL #258 (simicd)
  • feature: Set table name from ctx functions #260 (simicd)
  • Upgrade to DataFusion 19 #262 (andygrove)
  • Explain bindings #264 (jdye64)
  • Extension bindings #266 (jdye64)
  • Subquery alias bindings #269 (jdye64)
  • Create memory table #271 (jdye64)
  • Create view bindings #273 (jdye64)
  • Re-export Datafusion dependencies #277 (jdye64)
  • Distinct bindings #275 (jdye64)
  • build(deps): bump actions/checkout from 2 to 3 #244 (dependabot[bot])
  • build(deps): bump actions/upload-artifact from 2 to 3 #245 (dependabot[bot])
  • build(deps): bump actions/download-artifact from 2 to 3 #243 (dependabot[bot])
  • Use DataFusion 20 #278 (andygrove)
  • Drop table bindings #283 (jdye64)
  • Bindings for LogicalPlan::Repartition #285 (jdye64)
  • Expand Rust return type support for Arrow DataTypes in ScalarValue #287 (jdye64)

0.8.0 (2023-02-22)

Full Changelog

Implemented enhancements:

  • Add support for cuDF physical execution engine #202
  • Make it easier to create a Pandas dataframe from DataFusion query results #139

Fixed bugs:

  • Build error: could not compile thiserror due to 2 previous errors #69

Closed issues:

  • Integrate with the new object_store crate #22

Merged pull requests:

0.8.0-rc1 (2023-02-17)

Full Changelog

Implemented enhancements:

  • Add bindings for datafusion_common::DFField #184
  • Add bindings for DFSchema/DFSchemaRef #181
  • Add bindings for datafusion_expr Projection #179
  • Add bindings for TableScan struct from datafusion_expr::TableScan #177
  • Add a "mapping" struct for types #172
  • Improve string representation of datafusion classes (dataframe, context, expression, ...) #158
  • Add DataFrame count method #151
  • [REQUEST] Github Actions Improvements #146
  • Change default branch name from master to main #144
  • Bump pyo3 to 0.18.0 #140
  • Add script for Python linting #134
  • Add Python bindings for substrait module #132
  • Expand unit tests for built-in functions #128
  • support creating arrow-datafusion-python conda environment #122
  • Build Python source distribution in GitHub workflow #81
  • EPIC: Add all functions to python binding functions #72

Fixed bugs:

  • Build is broken #161
  • Out of memory when sorting #157
  • window_lead test appears to be non-deterministic #135
  • Reading csv does not work #130
  • Github actions produce a lot of warnings #94
  • ASF source release tarball has wrong directory name #90
  • Python Release Build failing after upgrading to maturin 14.2 #87
  • Maturin build hangs on Linux ARM64 #84
  • Cannot install on Mac M1 from source tarball from testpypi #82
  • ImportPathMismatchError when running pytest locally #77

Closed issues:

  • Publish documentation for Python bindings #39
  • Add Python binding for approx_median #32
  • Release version 0.7.0 #7

0.7.0-rc2 (2022-11-26)

Full Changelog

Merged pull requests:

0.5.1 (2022-03-15)

Full Changelog

0.5.1-rc1 (2022-03-15)

Full Changelog

0.5.0 (2022-03-10)

Full Changelog

0.5.0-rc2 (2022-03-10)

Full Changelog

Closed issues:

  • Add support for Ballista #37
  • Implement DataFrame.explain #35

0.5.0-rc1 (2022-03-09)

Full Changelog

Closed issues:

  • Investigate exposing additional optimizations #28
  • Use custom allocator in Python build #27
  • Why is pandas a requirement? #24
  • Unable to build #18
  • Setup CI against multiple Python version #6