forked from apache/arrow
bwl1289/feat/merge-upstream #3
Merged
…erification script (apache#46612) ### Rationale for this change The current verification jobs for dotnet are failing. ### What changes are included in this PR? Microsoft seems to have updated their download URLs. Update accordingly see: https://dotnet.microsoft.com/en-us/download/dotnet/thank-you/sdk-8.0.204-linux-x64-binaries ### Are these changes tested? Yes, via archery. ### Are there any user-facing changes? No * GitHub Issue: apache#46605 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
### Rationale for this change We want to run related checks conveniently. For example, we want to run C++ related check conveniently. ### What changes are included in this PR? Use language for `alias`. Example: ```console $ nice pre-commit run --show-diff-on-failure --color=always --all-files cpp C++ Format...............................................................Passed C++ Lint.................................................................Passed CMake Format.............................................................Passed meson....................................................................Passed ``` ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46598 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… build (apache#46521) ### Rationale for this change The docs don't build cleanly for me locally and some warnings and errors that are consequential are being ignored. Some of these issues ended up being as severe as entirely missing tables. ### What changes are included in this PR? - Fixes for each issue I ran into. Each change is a commit with whatever extra detail I needed included in the commit body. - This also includes an auto-update of our Doxyfile which silences a number of deprecation warnings. I took a look at the diff and it seemed pretty safe and the docs look good to me locally. We can do a preview in CI first though. ### Are these changes tested? Yes, each change was tested locally. ### Are there any user-facing changes? No. * GitHub Issue: apache#46520 Authored-by: Bryce Mecum <petridish@gmail.com> Signed-off-by: Bryce Mecum <petridish@gmail.com>
…apache#46484) ### Rationale for this change This makes it easier to submit Arrow as a wrapdb entry, as the external flatbuffers version is difficult to control. Since it isn't used to generate the flatc files contained within the repo, it is better to just stick with the vendored headers ### What changes are included in this PR? Removed flatbuffers Meson subproject and use internal dependency ### Are these changes tested? Yes ### Are there any user-facing changes? No * GitHub Issue: apache#46477 Authored-by: Will Ayd <william.ayd@icloud.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…pache#46604) ### Rationale for this change We have moved the JavaScript implementation to https://github.com/apache/arrow-js and we won't be releasing JavaScript as part of the main apache/arrow repository. ### What changes are included in this PR? Remove related release code from this repo and update release documentation. ### Are these changes tested? No ### Are there any user-facing changes? No * GitHub Issue: apache#46603 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#43439) ### Rationale for this change Support for Struct type has been added to the Swift ArrayReader. This change adds this support to the ArrowWriter. ### What changes are included in this PR? Updates to the ArrowWriter to support the Struct type. ### Are these changes tested? Yes, test included in PR * GitHub Issue: apache#43170 Authored-by: Alva Bandy <abandy@live.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…scripts directory (apache#46527) ### Rationale for this change We are trying to implement shellcheck on all sh files in apache#44748. ### What changes are included in this PR? SC2086 and SC2223 checks require quoting like `${arrow_dir}` -> `"${arrow_dir}"`. ``` In ci/scripts/integration_arrow_build.sh line 25: : ${ARROW_INTEGRATION_CPP:=ON} ^--------------------------^ SC2223 (info): This default assignment may cause DoS due to globbing. Quote it. In ci/scripts/integration_arrow_build.sh line 29: . ${arrow_dir}/ci/scripts/util_log.sh ^----------^ SC2086 (info): Double quote to prevent globbing and word splitting. ``` ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46526 Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…46621) ### Rationale for this change It seems that the default Python was changed to 3.12 from 3.11. ### What changes are included in this PR? Use Python 3.12 development package. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46610 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…er repos (apache#46583) ### Rationale for this change Updating docs site to more easily link to other repos ### What changes are included in this PR? Docs updates ### Are these changes tested? Nope ### Are there any user-facing changes? To the docs, aye * GitHub Issue: apache#46582 --------- Co-authored-by: Ian Cook <ianmcook@gmail.com>
### Rationale for this change GLib should be able to use `arrow::BaseListType`. ### What changes are included in this PR? Add `GArrowBaseListDataType` and use it as a base class of `GArrowListDataType`. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * GitHub Issue: apache#46613 Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…_creation - manual rebase (apache#46619) See apache#41998 - rebase was too messy due to PR age so did this manually. * GitHub Issue: apache#41973 Lead-authored-by: Haocheng Liu <lbtinglb@gmail.com> Co-authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Nic Crane <thisisnic@gmail.com>
…he#46639) ### Rationale for this change I edited NEWS.md for the 20.0.0.X release and we need to backport those changes. ### What changes are included in this PR? I cherry-picked: - 736e6cc - bd7024d and made one small whitespace change since I was already in the file. ### Are these changes tested? No. ### Are there any user-facing changes? No. Authored-by: Bryce Mecum <petridish@gmail.com> Signed-off-by: Bryce Mecum <petridish@gmail.com>
…ache#46622) ### Rationale for this change Resolve warnings that get treated as errors by suppressing the warning. ### What changes are included in this PR? 1. Add `ARROW_SUPPRESS_DEPRECATION_WARNING` to code that uses `codecvt_utf8` 2. Add explicit conversions for integer values in `cpp/src/arrow/flight/sql/odbc/odbcabstraction/include/odbcabstraction/odbc_impl/attribute_utils.h` ### Are these changes tested? They are tested locally on my Windows environment. The build succeeds. ### Are there any user-facing changes? None * GitHub Issue: apache#46576 Authored-by: Alina (Xi) Li <alina.li@improving.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46594) ### Rationale for this change GitHub Actions support log grouping: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#grouping-log-lines But it doesn't support nested grouping. If we nest grouping, some logs may not be grouped. See apache#46593 for example. ### What changes are included in this PR? Stop GitHub Actions workflow commands including grouping while grouping. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46593 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#46501) ### Rationale for this change We're using Crossbow to offload CI jobs to separated repository but it has some inconveniences. See apache#46014 for details. ### What changes are included in this PR? Use `CI: Extra` label to run Meson CI job as an extra CI job. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46499 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#45859) ### Rationale for this change conda-forge/arrow-cpp-feedstock#1727 fails to build because open-telemetry/opentelemetry-cpp@762b73d is a silent breaking change. ### What changes are included in this PR? Work around the breaking change. ### Are these changes tested? I tested that it built with OTel 1.19 and 1.18. ### Are there any user-facing changes? No Lead-authored-by: David Li <li.davidm96@gmail.com> Co-authored-by: Rossi Sun <zanmato1984@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ues if installing with pip (apache#46591) ### Rationale for this change When installing pyarrow with pip, it is possible to hit a confusing time zone related error when writing files with datetime data. apache#46080 discusses the problem, and a solution was eventually found. This change provides helpful information to other users who may experience the problem. ### What changes are included in this PR? An update to the Python installation instructions that offers guidance on how to handle tzdata related errors. ### Are these changes tested? I was unable to build the documentation locally. I will use the GitHub Action to preview the documentation. ### Are there any user-facing changes? No * GitHub Issue: apache#46080 Lead-authored-by: Kyle Hemker <k@hemker.us> Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com> Co-authored-by: kahemker <github@kylehemker.uk> Co-authored-by: kahemker <kyle.hemker@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
### Rationale for this change Obvious typo. ### What changes are included in this PR? Fix a typo. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. Authored-by: Rossi Sun <zanmato1984@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46156) ### Rationale for this change This continues improving the coverage of the Meson build configuration ### What changes are included in this PR? Added the Tensorflow directory ### Are these changes tested? Yes ### Are there any user-facing changes? No * GitHub Issue: apache#46155 Authored-by: Will Ayd <william.ayd@icloud.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ci/scripts directory (apache#46657) ### Rationale for this change We are trying to implement shellcheck on all sh files in apache#44748. ### What changes are included in this PR? * SC2034 unused variable error. Use variable properly like `${1}` -> `${arrow_dir}`. * SC2086 check require quoting like `${download_url}` -> `"${download_url}"`. ``` In ci/scripts/install_conda.sh line 30: version=$2 ^-----^ SC2034 (warning): version appears unused. Verify use (or export if used externally). In ci/scripts/install_conda.sh line 37: wget -nv ${download_url} -O /tmp/installer.sh ^-------------^ SC2086 (info): Double quote to prevent globbing and word splitting. ``` ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46656 Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…pache#46654) ### Rationale for this change `arrow::SchemaBuilder::AddMetadata()` replaces the metadata instead of adding to it. ### What changes are included in this PR? Add metadata when calling `SchemaBuilder::AddMetadata()`. ### Are these changes tested? A manual build passes. ### Are there any user-facing changes? **This PR includes breaking changes to public APIs.** `arrow::SchemaBuilder::AddMetadata()` now adds the given metadata instead of replacing the existing metadata. If existing code calls `AddMetadata()` multiple times, its behavior will change. * GitHub Issue: apache#46146 Authored-by: Ziy1-Tan <ajb459684460@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…directory (apache#46663) ### Rationale for this change We are trying to implement shellcheck on all sh files in apache#44748. ### What changes are included in this PR? * SC2148: Add `shellcheck shell=bash` in `util_*` files ``` In ci/scripts/util_enable_core_dumps.sh line 1: # Licensed to the Apache Software Foundation (ASF) under one ^-- SC2148 (error): Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive. ``` ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46662 Authored-by: Hiroyuki Sato <hiroysato@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46595) ### Rationale for this change We want to migrate to pre-commit from `archery lint`. ### What changes are included in this PR? Use pre-commit for numpydoc but this doesn't support Cython files. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#46546 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… range (apache#46590) ### Rationale for this change `pyarrow.compute.utf8_is_digit` did not recognize valid Unicode digit characters (e.g., superscripts like `'³'`), diverging from the behavior of Python's built-in `str.isdigit()`. This caused inconsistencies in downstream libraries like pandas when using PyArrow-backed StringDtype. ### What changes are included in this PR? Updated the `IsDigitCharacterUnicode` implementation to cover a broader range of Unicode digits by replacing the category check with one that aligns with Python’s `str.isdigit()` semantics. Added tests in `scalar_string_test.cc` to validate correct digit detection across diverse Unicode digit inputs. ### Are these changes tested? Yes. New unit tests were added and pass successfully, verifying behavior on various Unicode digit characters. ### Are there any user-facing changes? Yes, users relying on `pc.utf8_is_digit()` will now get correct results for a wider range of Unicode digit characters, improving correctness and parity with Python semantics. * GitHub Issue: apache#46589 Lead-authored-by: iabhi4 <iamonecool@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
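A minimal sketch of the distinction this commit describes, using only the Python standard library rather than PyArrow. It assumes that the earlier category check accepted only the Unicode general category `Nd` (decimal digits); the commit message does not say which category was used, so that is an assumption. The helper names are hypothetical.

```python
import unicodedata


def is_digit_by_category(ch: str) -> bool:
    # Assumed stand-in for a category-only check: accept only "Nd".
    return unicodedata.category(ch) == "Nd"


def is_digit_like_python(ch: str) -> bool:
    # Python's own semantics, which the commit says the new check follows.
    return ch.isdigit()


# ASCII digit, superscript three, Arabic-Indic digit three, and a letter.
SAMPLES = ["7", "\u00b3", "\u0663", "a"]


def compare(samples):
    """Map each character to (category check result, str.isdigit result)."""
    return {
        ch: (is_digit_by_category(ch), is_digit_like_python(ch))
        for ch in samples
    }


if __name__ == "__main__":
    for ch, (by_category, by_python) in compare(SAMPLES).items():
        print(f"U+{ord(ch):04X} category={by_category} isdigit={by_python}")
```

Superscript three (U+00B3) is the case where the two checks disagree: its general category is `No`, yet `str.isdigit()` accepts it.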
…n to specify that binary columns can be combined into multiple chunks (apache#46638) ### Rationale for this change The documentation for [pyarrow.Table.combine_chunks](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.combine_chunks) and [Table::CombineChunks](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow5Table13CombineChunksEP10MemoryPool) states: All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk. However, [this comment](https://github.com/apache/arrow/blob/d7015bd6e610b6cd6752f6cd543509bd5f8853ff/cpp/src/arrow/table.cc#L567) indicates that binary columns can be combined into multiple chunks. Multiple chunks are produced when combining into one chunk would result in a buffer overflow. A reproducible example is [here](apache#46633 (comment)). ### What changes are included in this PR? Change `Table::CombineChunks` and `pyarrow.Table.combine_chunks` documentation to specify that binary columns can be combined into multiple chunks. ### Are these changes tested? No, they are only documentation changes. ### Are there any user-facing changes? Yes, documentation changes. * GitHub Issue: apache#46633 Authored-by: Akum Kang <kangakum@Akums-MacBook-Pro-2.local> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…n arrow-compute-row-test (apache#46635) ### Rationale for this change In apache#45336 we refined the row table buffer accessors and enforced the validation on who can call the `var_length_rows()` method. However a legacy test `CompareColumnsToRowsOver4GBFixedLength` is leveraging this accessor to assert this buffer being null. ### What changes are included in this PR? We can just check if the row table is fixed length. ### Are these changes tested? Yes. ### Are there any user-facing changes? None. * GitHub Issue: apache#46623 Authored-by: Rossi Sun <zanmato1984@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
…ion (apache#46620) ### Rationale for this change We now support more types but the documentation suggested that some weren't supported. ### What changes are included in this PR? Documentation was updated to reflect the status of supported types. ### Are these changes tested? No code changes! ### Are there any user-facing changes? No * GitHub Issue: apache#46599 Lead-authored-by: Dewey Dunnington <dewey@wherobots.com> Co-authored-by: Dewey Dunnington <dewey@fishandwhistle.net> Co-authored-by: Antoine Pitrou <pitrou@free.fr> Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
…e#46649) ### Rationale for this change The distinction between "invalid" and "empty" is not clear in the current documentation! ### What changes are included in this PR? The docstring for GeoStatistics was improved. ### Are these changes tested? Just documentation! ### Are there any user-facing changes? No * GitHub Issue: apache#46270 Authored-by: Dewey Dunnington <dewey@wherobots.com> Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
…on (apache#46274) ### Rationale for this change This option has been in our code base for some time but is not tested and may no longer work. The consensus in apache#46219 was to remove it. ### What changes are included in this PR? The option is removed! ### Are these changes tested? Yes (covered by the Parquet builds that had previously not tested this option!) ### Are there any user-facing changes? Yes, a build option was removed; however, I wasn't able to find where this build option was documented in the first place! * GitHub Issue: apache#46219 Authored-by: Dewey Dunnington <dewey@wherobots.com> Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
… n) random access (apache#46643) ### Rationale for this change Resolves apache#46642. ### What changes are included in this PR? - Updated columnar format doc ### Are these changes tested? Yes, rendered locally. ### Are there any user-facing changes? No. * GitHub Issue: apache#46642 Authored-by: Bryce Mecum <petridish@gmail.com> Signed-off-by: Bryce Mecum <petridish@gmail.com>
…pache#47166) ### Rationale for this change Tests are failing because the bucket exists now. ### What changes are included in this PR? Update bucket name to non-existing bucket so it raises the expected exception again. ### Are these changes tested? Yes ### Are there any user-facing changes? No * GitHub Issue: apache#47165 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…ypes.rst (apache#47170) ### Rationale for this change I originally started this patch because the title of https://arrow.apache.org/docs/python/extending_types.html is "Extending pyarrow...", which immediately jumped out at me as not quite right: it should be "Extending PyArrow..." since PyArrow is the name of the software. I think we try to use the all-lowercase form only in code snippets or when specifically referring to the spelling of PyArrow in various distributions. ### What changes are included in this PR? Changed docs/source/python/extending_types.rst to hopefully use the various forms correctly. I didn't do a deep dive into other docs. ### Are these changes tested? Yes, rendered locally. ### Are there any user-facing changes? No. Authored-by: Bryce Mecum <petridish@gmail.com> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…st/dev/arrow/KEYS (apache#47182) ### Rationale for this change We only need https://dist.apache.org/repos/dist/release/arrow/KEYS . ### What changes are included in this PR? Always use https://dist.apache.org/repos/dist/release/arrow/KEYS . ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#47084 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…mpl to reuse ipc::RecordBatchWriter with custom IpcPayloadWriter instead of manually generating FlightPayload (apache#47115) ### Rationale for this change Currently, dictionary replacement and deltas are not supported in DoGet, as seen in the original issue. We could replicate the `ipc::WriteDictionaries` code, but instead of replicating this logic we can update our flight server `RecordBatchStreamImpl` to reuse `ipc::RecordBatchWriter` and just update the payload to match the flight payload. ### What changes are included in this PR? Updated the logic in `arrow::flight::RecordBatchStream::RecordBatchStreamImpl` to use an `ipc::RecordBatchWriter` and a custom `ipc::IpcPayloadWriter` instead of manually building the payloads. ### Are these changes tested? Yes, via CI. Test data for some Python tests has been updated to validate that dictionary replacement now works. On main the updated test fails with: ```python > assert data.equals(table) E assert False E + where False = equals(pyarrow.Table\nsome_dicts: dictionary<values=string, indices=int64, ordered=0>\n----\nsome_dicts: [ -- dictionary:\n["foo...\n[1,0,null], -- dictionary:\n["foo","baz","quux"] -- indices:\n[2,1], -- dictionary:\n["bar","qux"] -- indices:\n[0,1]]) E + where equals = pyarrow.Table\nsome_dicts: dictionary<values=string, indices=int64, ordered=0>\n----\nsome_dicts: [ -- dictionary:\n["foo...ull], -- dictionary:\n["foo","baz","quux"] -- indices:\n[2,1], -- dictionary:\n["foo","baz","quux"] -- indices:\n[0,1]].equals pyarrow/tests/test_flight.py:1319: AssertionError ================================================================================== short test summary info =================================================================================== FAILED pyarrow/tests/test_flight.py::test_flight_do_get_dicts - assert False FAILED pyarrow/tests/test_flight.py::test_flight_domain_socket - assert False ``` ### Are there any user-facing changes? No * GitHub Issue: apache#45055 Authored-by: Raúl Cumplido <raulcumplido@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
### Rationale for this change Please see GitHub issue apache#47123 ### What changes are included in this PR? Added public Type Enums that mimic the original private variable groups used for internal type checking. ### Are these changes tested? Yes. Partly for now. ### Are there any user-facing changes? No, only additional features were added: users will now be able to access the underlying types directly via the Type Enums. * GitHub Issue: apache#47123 Lead-authored-by: Bogdan Romenskii <rmnsk@seznam.cz> Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com> Signed-off-by: AlenkaF <frim.alenka@gmail.com>
… sync (apache#47194) ### Rationale for this change Unnecessary files being synced and causing portability warnings ### What changes are included in this PR? Don't sync 'em ### Are these changes tested? Nope ### Are there any user-facing changes? Nope * GitHub Issue: apache#47193 Authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Nic Crane <thisisnic@gmail.com>
apache#47192) ### Rationale for this change GCS wasn't on by default on MacOS source builds ### What changes are included in this PR? Turn it on ### Are these changes tested? Nah ### Are there any user-facing changes? Nope * GitHub Issue: apache#47191 Authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Nic Crane <thisisnic@gmail.com>
…linux-devel (apache#47212) ### Rationale for this change We need to disable these at buildtime, so that they are *also* disabled on CRAN. Resolves apache#47211 ### What changes are included in this PR? Remove the env var set in CI, set the flag when building ### Are these changes tested? They are tests ### Are there any user-facing changes? Hopefully no! * GitHub Issue: apache#47211 Authored-by: Jonathan Keane <jkeane@gmail.com> Signed-off-by: Jonathan Keane <jkeane@gmail.com>
…nt variable (apache#47181) ### Rationale for this change We have many environment variables for GitHub token: `GH_TOKEN`, `ARROW_GITHUB_API_TOKEN` and `CROSSBOW_GITHUB_TOKEN` It's difficult to maintain. For example, we may forget to define one of them. ### What changes are included in this PR? Use `GH_TOKEN` as unified environment variable for GitHub token. We can still use `ARROW_GITHUB_API_TOKEN` and `CROSSBOW_GITHUB_TOKEN` for backward compatibility but `GH_TOKEN` is recommended. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#47075 Lead-authored-by: Sutou Kouhei <kou@clear-code.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Bryce Mecum <petridish@gmail.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
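A minimal sketch of the fallback this commit describes: prefer `GH_TOKEN`, but still accept the two legacy variables. The function name and the order in which the two legacy variables are tried are assumptions made for illustration; the commit message states only that `GH_TOKEN` is recommended and that the others remain usable.

```python
import os
from typing import Mapping, Optional

# GH_TOKEN first because the commit recommends it; the relative order of
# the two legacy variables is an assumption.
TOKEN_VARIABLES = (
    "GH_TOKEN",
    "ARROW_GITHUB_API_TOKEN",
    "CROSSBOW_GITHUB_TOKEN",
)


def resolve_github_token(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty token found, or None if none is set."""
    env = os.environ if env is None else env
    for name in TOKEN_VARIABLES:
        value = env.get(name)
        if value:
            return value
    return None


if __name__ == "__main__":
    token = resolve_github_token()
    print("token found" if token else "no GitHub token configured")
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.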
### Rationale for this change We need xsimd 13.0.0 or later since apache#46963 . ### What changes are included in this PR? Require 13.0.0 or later for system xsimd. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#47175 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… Apache Thrift (apache#47209) ### Rationale for this change If we use bundled Apache Thrift, `libarrowd.a` not `libarrow.a` may be built. Because Apache Thrift may change `CMAKE_DEBUG_POSTFIX`. ### What changes are included in this PR? Restore `CMAKE_DEBUG_POSTFIX` changed by Apache Thrift. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * GitHub Issue: apache#47203 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…attribute ARROW:distinct_count:approximate (apache#47183) ### Rationale for this change Implement support for the `ARROW:distinct_count:approximate` statistics attribute. ### What changes are included in this PR? Changed the type of `arrow::ArrayStatistics::distinct_count` to support both `double` (for approximate values) and `int64_t` (for exact values). ### Are these changes tested? Yes, I ran the corresponding unit test. ### Are there any user-facing changes? Yes. * C++: The type of `arrow::ArrayStatistics::distinct_count` was changed from `std::optional<int64_t>` to `std::optional<std::variant<int64_t, double>>`. * GLib: `garrow_array_statistics_get_distinct_count()` is deprecated. * GitHub Issue: apache#47101 Lead-authored-by: Arash Andishgar <arashandishgar1@gmail.com> Co-authored-by: Sutou Kouhei <kou@clear-code.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Rok Mihevc <rok@mihevc.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…nt WriteRecordBatch (apache#47129) ### Rationale for this change The throttle is accessed twice: once in Acquire and again through the returned future. As a result, current_value_ may not be increased because the throttle is applied, and shortly afterwards the returned future may become finished. That leads to the issue described in apache#47124 https://github.com/apache/arrow/blob/c8fe26898ce49c58514f511be58afddce176826b/cpp/src/arrow/dataset/dataset_writer.cc#L682-L684 ### What changes are included in this PR? Change the throttle API to return an optional (akin to [ThrottledAsyncTaskScheduler::Throttle](https://github.com/gitmodimo/arrow/blob/3ebe7ee1828793d0a619bcd773eb4d990ccb6b3c/cpp/src/arrow/util/async_util.h#L243)) and prevent the race. ### Are these changes tested? Yes ### Are there any user-facing changes? No * GitHub Issue: apache#47124 Lead-authored-by: Rafał Hibner <rafal.hibner@secom.com.pl> Co-authored-by: gitmodimo <g.modimo@gmail.com> Co-authored-by: Rossi Sun <zanmato1984@gmail.com> Signed-off-by: Rossi Sun <zanmato1984@gmail.com>
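A rough Python sketch of the idea behind an acquire call that returns an optional, so that the caller learns in a single step whether capacity was taken or whether it must wait. This is not Arrow's C++ implementation; the class, method names, and locking scheme are hypothetical and only illustrate why one atomic answer avoids checking the throttle twice.

```python
import threading
from concurrent.futures import Future
from typing import Optional


class Throttle:
    """Hypothetical throttle: acquire() returns None on success, or a
    future that completes once capacity has been released."""

    def __init__(self, max_value: int) -> None:
        self._max = max_value
        self._current = 0
        self._lock = threading.Lock()
        self._backpressure: Optional[Future] = None

    def acquire(self, values: int) -> Optional[Future]:
        # The decision and the counter update happen under one lock, so
        # the caller never sees a stale "finished" state.
        with self._lock:
            if self._current + values <= self._max:
                self._current += values
                return None  # acquired immediately
            if self._backpressure is None:
                self._backpressure = Future()
            return self._backpressure  # not acquired; wait, then retry

    def release(self, values: int) -> None:
        with self._lock:
            self._current -= values
            waiter, self._backpressure = self._backpressure, None
        if waiter is not None:
            waiter.set_result(None)  # wake callers outside the lock


if __name__ == "__main__":
    throttle = Throttle(max_value=10)
    print(throttle.acquire(6))        # None: capacity was taken
    pending = throttle.acquire(6)     # a Future: caller must wait
    throttle.release(6)
    print(pending.done())             # True: caller may retry acquire()
```

A caller that receives a future has not taken any capacity and is expected to call `acquire` again after the future completes.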
apache#47017) ### Rationale for this change Fix negative timestamps from arrow to date structs when using flightsql odbc. ### Are these changes tested? Yes ### Are there any user-facing changes? No * GitHub Issue: apache#47016 Authored-by: Igor Antropov <igor.antropov@dremio.com> Signed-off-by: David Li <li.davidm96@gmail.com>
### Rationale for this change Add previous version of R package to backward compatibility job so we can ensure we don't make breaking changes ### What changes are included in this PR? Add 21.0.0 to CI job ### Are these changes tested? No ### Are there any user-facing changes? No Authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
… with upstream and resolved merge conflicts
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also: