@BwL1289 BwL1289 commented Aug 11, 2025

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

raulcd and others added 30 commits May 28, 2025 06:03
…erification script (apache#46612)

### Rationale for this change

The current verification jobs for dotnet are failing.

### What changes are included in this PR?

Microsoft seems to have updated their download URLs. Update accordingly; see:
https://dotnet.microsoft.com/en-us/download/dotnet/thank-you/sdk-8.0.204-linux-x64-binaries

### Are these changes tested?

Yes, via archery.

### Are there any user-facing changes?

No

* GitHub Issue: apache#46605

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
### Rationale for this change

We want to run related checks conveniently. For example, we want to run all C++-related checks with a single command.

### What changes are included in this PR?

Use the language name as the `alias`.

Example:

```console
$ nice pre-commit run --show-diff-on-failure --color=always --all-files cpp
C++ Format...............................................................Passed
C++ Lint.................................................................Passed
CMake Format.............................................................Passed
meson....................................................................Passed
```

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46598

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… build (apache#46521)

### Rationale for this change

The docs don't build cleanly for me locally and some warnings and errors that are consequential are being ignored. Some of these issues ended up being as severe as entirely missing tables.

### What changes are included in this PR?

- Fixes for each issue I ran into. Each change is a commit with whatever extra detail I needed included in the commit body.
- This also includes an auto-update of our Doxyfile which silences a number of deprecation warnings. I took a look at the diff and it seemed pretty safe and the docs look good to me locally. We can do a preview in CI first though.

### Are these changes tested?

Yes, each change was tested locally.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46520

Authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: Bryce Mecum <petridish@gmail.com>
…apache#46484)

### Rationale for this change

This makes it easier to submit Arrow as a wrapdb entry, as the external flatbuffers version is difficult to control. Since it isn't used to generate the flatc files contained within the repo, it is better to just stick with the vendored headers.

### What changes are included in this PR?

Remove the flatbuffers Meson subproject and use the internal dependency.

### Are these changes tested?

Yes

### Are there any user-facing changes?

No
* GitHub Issue: apache#46477

Authored-by: Will Ayd <william.ayd@icloud.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…pache#46604)

### Rationale for this change

We have moved the JavaScript implementation to https://github.com/apache/arrow-js and we won't be releasing JavaScript as part of the main apache/arrow repository.

### What changes are included in this PR?

Remove related release code from this repo and update release documentation.

### Are these changes tested?

No

### Are there any user-facing changes?

No

* GitHub Issue: apache#46603

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#43439)

### Rationale for this change
Support for Struct type has been added to the Swift ArrayReader.  This change adds this support to the ArrowWriter.

### What changes are included in this PR?
Updates to the ArrowWriter to support the Struct type.

### Are these changes tested?
Yes, test included in PR

* GitHub Issue: apache#43170

Authored-by: Alva Bandy <abandy@live.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…scripts directory (apache#46527)

### Rationale for this change

We are trying to implement shellcheck on all sh files in apache#44748.

### What changes are included in this PR?

SC2086 and SC2223 checks require quoting like `${arrow_dir}` -> `"${arrow_dir}"`.

```
In ci/scripts/integration_arrow_build.sh line 25:
: ${ARROW_INTEGRATION_CPP:=ON}
  ^--------------------------^ SC2223 (info): This default assignment may cause DoS due to globbing. Quote it.

In ci/scripts/integration_arrow_build.sh line 29:
. ${arrow_dir}/ci/scripts/util_log.sh
  ^----------^ SC2086 (info): Double quote to prevent globbing and word splitting.

```

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#46526

Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…46621)

### Rationale for this change

It seems that the default Python was changed to 3.12 from 3.11. 

### What changes are included in this PR?

Use Python 3.12 development package.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46610

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…er repos (apache#46583)

### Rationale for this change

Updating docs site to more easily link to other repos

### What changes are included in this PR?

Docs updates

### Are these changes tested?

Nope

### Are there any user-facing changes?

To the docs, aye
* GitHub Issue: apache#46582

---------

Co-authored-by: Ian Cook <ianmcook@gmail.com>
### Rationale for this change

GLib should be able to use `arrow::BaseListType`.

### What changes are included in this PR?

Add `GArrowBaseListDataType` and use it as a base class of `GArrowListDataType`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#46613

Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…_creation - manual rebase (apache#46619)

See apache#41998 - rebase was too messy due to PR age so did this manually.
* GitHub Issue: apache#41973

Lead-authored-by: Haocheng Liu <lbtinglb@gmail.com>
Co-authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
…he#46639)

### Rationale for this change

I edited NEWS.md for the 20.0.0.X release and we need to backport those changes.

### What changes are included in this PR?

I cherry-picked:

- 736e6cc
- bd7024d

and made one small whitespace change since I was already in the file.

### Are these changes tested?

No.

### Are there any user-facing changes?

No.

Authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: Bryce Mecum <petridish@gmail.com>
…ache#46622)

### Rationale for this change

Resolve warnings that get treated as errors by suppressing the warning.

### What changes are included in this PR?

1. Add `ARROW_SUPPRESS_DEPRECATION_WARNING` to code that uses `codecvt_utf8`
2. Add explicit conversions for integer values in `cpp/src/arrow/flight/sql/odbc/odbcabstraction/include/odbcabstraction/odbc_impl/attribute_utils.h`

### Are these changes tested?
They are tested locally on my Windows environment. The build succeeds. 

### Are there any user-facing changes?

None

* GitHub Issue: apache#46576

Authored-by: Alina (Xi) Li <alina.li@improving.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46594)

### Rationale for this change

GitHub Actions support log grouping:
https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#grouping-log-lines

But it doesn't support nested grouping.

If we nest grouping, some logs may not be grouped. See apache#46593 for example. 

### What changes are included in this PR?

Stop processing GitHub Actions workflow commands, including grouping, while a group is already open.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46593

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#46501)

### Rationale for this change

We're using Crossbow to offload CI jobs to a separate repository but it has some inconveniences. See apache#46014 for details.

### What changes are included in this PR?

Use `CI: Extra` label to run Meson CI job as an extra CI job.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46499

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…e#45859)

### Rationale for this change

conda-forge/arrow-cpp-feedstock#1727 fails to build because open-telemetry/opentelemetry-cpp@762b73d is a silent breaking change.

### What changes are included in this PR?

Work around the breaking change.

### Are these changes tested?

I tested that it built with OTel 1.19 and 1.18.

### Are there any user-facing changes?

No

Lead-authored-by: David Li <li.davidm96@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ues if installing with pip (apache#46591)

Thanks for opening a pull request!

### Rationale for this change

When installing pyarrow with pip, it is possible to hit a confusing time zone related error when writing files with datetime data. apache#46080 discusses the problem, and a solution was eventually found. This change provides helpful information to other users who may experience the problem.

### What changes are included in this PR?

Update the Python installation instructions to offer guidance on how to handle tzdata-related errors.

### Are these changes tested?

I was unable to build the documentation locally. I will use the GitHub Action to preview the documentation.

### Are there any user-facing changes?

No

* GitHub Issue: apache#46080

Lead-authored-by: Kyle Hemker <k@hemker.us>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: kahemker <github@kylehemker.uk>
Co-authored-by: kahemker <kyle.hemker@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
### Rationale for this change

Obvious typo.

### What changes are included in this PR?

Fix a typo.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

Authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46156)

### Rationale for this change

This continues improving the coverage of the Meson build configuration

### What changes are included in this PR?

Added the Tensorflow directory

### Are these changes tested?

Yes

### Are there any user-facing changes?

No
* GitHub Issue: apache#46155

Authored-by: Will Ayd <william.ayd@icloud.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ci/scripts directory (apache#46657)

### Rationale for this change

We are trying to implement shellcheck on all sh files in apache#44748.

### What changes are included in this PR?

* SC2034 (unused variable): use the variable properly, e.g. `${1}` -> `${arrow_dir}`.
* SC2086 requires quoting, e.g. `${download_url}` -> `"${download_url}"`.

```
In ci/scripts/install_conda.sh line 30:
version=$2
^-----^ SC2034 (warning): version appears unused. Verify use (or export if used externally).

In ci/scripts/install_conda.sh line 37:
wget -nv ${download_url} -O /tmp/installer.sh
         ^-------------^ SC2086 (info): Double quote to prevent globbing and word splitting.
```

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46656

Lead-authored-by: Hiroyuki Sato <hiroysato@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…pache#46654)

### Rationale for this change

`arrow::SchemaBuilder::AddMetadata()` replaces the existing metadata instead of adding to it.

### What changes are included in this PR?

Add metadata when calling `SchemaBuilder::AddMetadata()`.

### Are these changes tested?

A manual build passes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**

`arrow::SchemaBuilder::AddMetadata()` now adds the given metadata instead of replacing the existing metadata. If existing code calls `AddMetadata()` multiple times, its behavior will change.
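
The behavior change can be illustrated with a small model. This is only a sketch in Python using plain dictionaries: `replace_metadata` and `add_metadata` are hypothetical stand-ins, not Arrow APIs, and the keys are chosen not to overlap because the handling of duplicate keys is not described here.

```python
def replace_metadata(existing, new):
    """Old behavior (model): the new metadata replaces what was there."""
    return dict(new)


def add_metadata(existing, new):
    """New behavior (model): the new metadata is added to what was there."""
    merged = dict(existing)
    merged.update(new)
    return merged


first = {"author": "alice"}
second = {"source": "sensor-1"}

# Two consecutive calls, as in code that calls AddMetadata() multiple times.
old_result = replace_metadata(replace_metadata({}, first), second)
new_result = add_metadata(add_metadata({}, first), second)
```

In this model, only the second call's metadata survives under the old behavior, while both entries are kept under the new behavior.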

* GitHub Issue: apache#46146

Authored-by: Ziy1-Tan <ajb459684460@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…directory (apache#46663)

### Rationale for this change

We are trying to implement shellcheck on all sh files in apache#44748.

### What changes are included in this PR?

* SC2148: Add `shellcheck shell=bash` in `util_*` files
```
In ci/scripts/util_enable_core_dumps.sh line 1:
# Licensed to the Apache Software Foundation (ASF) under one
^-- SC2148 (error): Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
```

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#46662

Authored-by: Hiroyuki Sato <hiroysato@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…#46595)

### Rationale for this change

We want to migrate to pre-commit from `archery lint`.

### What changes are included in this PR?

Use pre-commit for numpydoc but this doesn't support Cython files.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46546

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… range (apache#46590)

### Rationale for this change

`pyarrow.compute.utf8_is_digit` did not recognize valid Unicode digit characters (e.g., superscripts like `'³'`), diverging from the behavior of Python's built-in `str.isdigit()`.
This caused inconsistencies in downstream libraries like pandas when using PyArrow-backed StringDtype.

### What changes are included in this PR?
Updated `IsDigitCharacterUnicode` implementation to cover a broader range of Unicode digits by replacing category check with one that aligns with Python’s `str.isdigit()` semantics.

Added tests in `scalar_string_test.cc` to validate correct digit detection across diverse Unicode digit inputs.

### Are these changes tested?
Yes. New unit tests were added and pass successfully, verifying behavior on various Unicode digit characters.

### Are there any user-facing changes?
Yes, users relying on `pc.utf8_is_digit()` will now get correct results for a wider range of Unicode digit characters, improving correctness and parity with Python semantics.
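
The target semantics can be checked with the Python standard library alone. The sketch below does not call PyArrow; it only shows what `str.isdigit()` accepts, which is the behavior `utf8_is_digit` is meant to match according to the description above. The sample characters are chosen for illustration. That a check based only on the decimal-number category `Nd` would miss `'³'` is an assumption about the kind of check being replaced.

```python
import unicodedata

# ASCII digit, superscript three, Arabic-Indic three, vulgar fraction, letter
samples = ["7", "³", "٣", "½", "a"]

for ch in samples:
    # Print the Unicode general category next to both predicates.
    print(repr(ch), unicodedata.category(ch), ch.isdigit(), ch.isdecimal())

results = {ch: ch.isdigit() for ch in samples}
```

Note that `'³'` is in category `No` rather than `Nd`, yet `str.isdigit()` accepts it, while `'½'` is numeric but not a digit.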

 
* GitHub Issue: apache#46589

Lead-authored-by: iabhi4 <iamonecool@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
…n to specify that binary columns can be combined into multiple chunks (apache#46638)

### Rationale for this change

The documentation for [pyarrow.Table.combine_chunks](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.combine_chunks) and [Table::CombineChunks](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow5Table13CombineChunksEP10MemoryPool) states: All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk.

However, [this comment](https://github.com/apache/arrow/blob/d7015bd6e610b6cd6752f6cd543509bd5f8853ff/cpp/src/arrow/table.cc#L567) indicates that binary columns can be combined into multiple chunks. Multiple chunks are produced when combining into one chunk would result in a buffer overflow.

A reproducible example is [here](apache#46633 (comment)).

### What changes are included in this PR?

Change `Table::CombineChunks` and `pyarrow.Table.combine_chunks` documentation to specify that binary columns can be combined into multiple chunks.
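
Why more than one chunk can come out is easiest to see with a rough model. The sketch below is not Arrow's implementation: `group_chunks`, the greedy strategy, and the tiny `max_bytes` limit are all assumptions made for illustration. It only shows how a per-chunk size limit turns "combine everything" into "combine into as few chunks as fit".

```python
def group_chunks(chunk_sizes, max_bytes):
    """Greedily group consecutive chunk sizes so each group fits in max_bytes."""
    groups, current, current_size = [], [], 0
    for size in chunk_sizes:
        # Start a new output chunk when adding this one would exceed the limit.
        if current and current_size + size > max_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


small = group_chunks([3, 4, 2], max_bytes=10)  # everything fits in one chunk
large = group_chunks([6, 5, 4], max_bytes=10)  # must be split into two chunks
empty = group_chunks([], max_bytes=10)         # no input chunks, no output
```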

### Are these changes tested?

No, they are only documentation changes.

### Are there any user-facing changes?

Yes, documentation changes.
* GitHub Issue: apache#46633

Authored-by: Akum Kang <kangakum@Akums-MacBook-Pro-2.local>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…n arrow-compute-row-test (apache#46635)

### Rationale for this change

In apache#45336 we refined the row table buffer accessors and enforced validation of who can call the `var_length_rows()` method. However, a legacy test, `CompareColumnsToRowsOver4GBFixedLength`, uses this accessor to assert that the buffer is null.

### What changes are included in this PR?

We can just check if the row table is fixed length.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

None.

* GitHub Issue: apache#46623

Authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
…ion (apache#46620)

### Rationale for this change

We now support more types but the documentation suggested that some weren't supported.

### What changes are included in this PR?

Documentation was updated to reflect the status of supported types.

### Are these changes tested?

No code changes!

### Are there any user-facing changes?

No
* GitHub Issue: apache#46599

Lead-authored-by: Dewey Dunnington <dewey@wherobots.com>
Co-authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
…e#46649)

### Rationale for this change

The distinction between "invalid" and "empty" is not clear in the current documentation!

### What changes are included in this PR?

The docstring for GeoStatistics was improved.

### Are these changes tested?

Just documentation!

### Are there any user-facing changes?

No
* GitHub Issue: apache#46270

Authored-by: Dewey Dunnington <dewey@wherobots.com>
Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
…on (apache#46274)

### Rationale for this change

This option has been in our code base for some time but is not tested and may no longer work. The consensus in apache#46219 was to remove it.

### What changes are included in this PR?

The option is removed!

### Are these changes tested?

Yes (covered by the Parquet builds that had previously not tested this option!)

### Are there any user-facing changes?

Yes, a build option was removed; however, I wasn't able to find where this build option was documented in the first place!

* GitHub Issue: apache#46219

Authored-by: Dewey Dunnington <dewey@wherobots.com>
Signed-off-by: Dewey Dunnington <dewey@wherobots.com>
… n) random access (apache#46643)

### Rationale for this change

Resolves apache#46642.

### What changes are included in this PR?

- Updated columnar format doc

### Are these changes tested?

Yes, rendered locally.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#46642

Authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: Bryce Mecum <petridish@gmail.com>
raulcd and others added 16 commits July 22, 2025 17:06
…pache#47166)

### Rationale for this change

Tests are failing because the bucket exists now.

### What changes are included in this PR?

Update bucket name to non-existing bucket so it raises the expected exception again.

### Are these changes tested?

Yes

### Are there any user-facing changes?

No

* GitHub Issue: apache#47165

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…ypes.rst (apache#47170)

### Rationale for this change

I originally started this patch because the title of https://arrow.apache.org/docs/python/extending_types.html is "Extending pyarrow...", which immediately jumped out at me as not quite right: it should be "Extending PyArrow..." since PyArrow is the name of the software. I think we try to use the all-lowercase form only in code snippets or when specifically referring to the spelling of PyArrow in various distributions.

### What changes are included in this PR?

Changed docs/source/python/extending_types.rst to hopefully use the various forms correctly.

I didn't do a deep-dive into other docs.

### Are these changes tested?

Yes, rendered locally.

### Are there any user-facing changes?

No.

Authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
…st/dev/arrow/KEYS (apache#47182)

### Rationale for this change

We only need https://dist.apache.org/repos/dist/release/arrow/KEYS .

### What changes are included in this PR?

Always use https://dist.apache.org/repos/dist/release/arrow/KEYS .

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#47084

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
…mpl to reuse ipc::RecordBatchWriter with custom IpcPayloadWriter instead of manually generating FlightPayload (apache#47115)

### Rationale for this change

Currently, dictionary replacement and deltas are not supported in DoGet, as seen in the original issue.
We could replicate the `ipc::WriteDictionaries` code but instead of replicating this logic we can also update our flight server `RecordBatchStreamImpl` to reuse `ipc::RecordBatchWriter` and just update the payload to match the flight payload.

### What changes are included in this PR?

Updated logic on `arrow::flight::RecordBatchStream::RecordBatchStreamImpl` to use an `ipc::RecordBatchWriter` and a custom `ipc::IpcPayloadWriter` instead of manually building the payloads.

### Are these changes tested?

Yes via CI. Test data for some Python tests has been updated to validate dictionary replacement is working now. On main the updated test fails with:
```python
>               assert data.equals(table)
E               assert False
E                +  where False = equals(pyarrow.Table\nsome_dicts: dictionary<values=string, indices=int64, ordered=0>\n----\nsome_dicts: [  -- dictionary:\n["foo...\n[1,0,null],  -- dictionary:\n["foo","baz","quux"]  -- indices:\n[2,1],  -- dictionary:\n["bar","qux"]  -- indices:\n[0,1]])
E                +    where equals = pyarrow.Table\nsome_dicts: dictionary<values=string, indices=int64, ordered=0>\n----\nsome_dicts: [  -- dictionary:\n["foo...ull],  -- dictionary:\n["foo","baz","quux"]  -- indices:\n[2,1],  -- dictionary:\n["foo","baz","quux"]  -- indices:\n[0,1]].equals

pyarrow/tests/test_flight.py:1319: AssertionError
================================================================================== short test summary info ===================================================================================
FAILED pyarrow/tests/test_flight.py::test_flight_do_get_dicts - assert False
FAILED pyarrow/tests/test_flight.py::test_flight_domain_socket - assert False

```

### Are there any user-facing changes?

No

* GitHub Issue: apache#45055

Authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
### Rationale for this change
Please see GitHub Issue apache#47123.

### What changes are included in this PR?
Added public Type Enums that mimic the original private variable groups used for internal type checking.

### Are these changes tested?
Yes. Partly for now.

### Are there any user-facing changes?
No, only additional features were added: users will now be able to access the underlying types directly via the Type Enums.

* GitHub Issue: apache#47123

Lead-authored-by: Bogdan Romenskii <rmnsk@seznam.cz>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
… sync (apache#47194)

### Rationale for this change

Unnecessary files being synced and causing portability warnings

### What changes are included in this PR?

Don't sync 'em

### Are these changes tested?

Nope

### Are there any user-facing changes?

Nope
* GitHub Issue: apache#47193

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
apache#47192)

### Rationale for this change

GCS wasn't on by default on macOS source builds

### What changes are included in this PR?

Turn it on

### Are these changes tested?

Nah

### Are there any user-facing changes?

Nope
* GitHub Issue: apache#47191

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
…linux-devel (apache#47212)

### Rationale for this change
We need to disable these at build time, so that they are *also* disabled on CRAN.

Resolves apache#47211

### What changes are included in this PR?
Remove the env var set in CI, set the flag when building

### Are these changes tested?
They are tests

### Are there any user-facing changes?
Hopefully no!

* GitHub Issue: apache#47211

Authored-by: Jonathan Keane <jkeane@gmail.com>
Signed-off-by: Jonathan Keane <jkeane@gmail.com>
…nt variable (apache#47181)

### Rationale for this change

We have many environment variables for the GitHub token: `GH_TOKEN`, `ARROW_GITHUB_API_TOKEN` and `CROSSBOW_GITHUB_TOKEN`.

It's difficult to maintain. For example, we may forget to define one of them. 

### What changes are included in this PR?

Use `GH_TOKEN` as the unified environment variable for the GitHub token.

We can still use `ARROW_GITHUB_API_TOKEN` and `CROSSBOW_GITHUB_TOKEN` for backward compatibility but `GH_TOKEN` is recommended.
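
The lookup described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `resolve_github_token` helper is hypothetical, and the order in which the two legacy variables are consulted is an assumption. Only the preference for `GH_TOKEN` comes from the description.

```python
import os

# GH_TOKEN is preferred; the legacy names are kept for backward compatibility.
# The relative order of the two legacy names is an assumption.
TOKEN_VARIABLES = ("GH_TOKEN", "ARROW_GITHUB_API_TOKEN", "CROSSBOW_GITHUB_TOKEN")


def resolve_github_token(environ=None):
    """Return the first non-empty token found, or None if none is set."""
    environ = os.environ if environ is None else environ
    for name in TOKEN_VARIABLES:
        value = environ.get(name)
        if value:
            return value
    return None


preferred = resolve_github_token({"GH_TOKEN": "a", "CROSSBOW_GITHUB_TOKEN": "b"})
legacy = resolve_github_token({"CROSSBOW_GITHUB_TOKEN": "b"})
missing = resolve_github_token({})
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.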

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#47075

Lead-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
### Rationale for this change

We need xsimd 13.0.0 or later since apache#46963 .

### What changes are included in this PR?

Require 13.0.0 or later for system xsimd.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#47175

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
… Apache Thrift (apache#47209)

### Rationale for this change

If we use bundled Apache Thrift, `libarrowd.a` may be built instead of `libarrow.a`, because Apache Thrift may change `CMAKE_DEBUG_POSTFIX`.

### What changes are included in this PR?

Restore `CMAKE_DEBUG_POSTFIX` changed by Apache Thrift.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#47203

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…attribute ARROW:distinct_count:approximate (apache#47183)

### Rationale for this change

Implement support for the `ARROW:distinct_count:approximate` statistics attribute.

### What changes are included in this PR?

Changed the type of `arrow::ArrayStatistics::distinct_count` to support both `double` (for approximate values) and `int64_t` (for exact values).

### Are these changes tested?

Yes, I ran the corresponding unit test.

### Are there any user-facing changes?

Yes.

* C++: The type of `arrow::ArrayStatistics::distinct_count` was changed from `std::optional<int64_t>` to `std::optional<std::variant<int64_t, double>>`.
* GLib: `garrow_array_statistics_get_distinct_count()` is deprecated.
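
For readers more familiar with Python, the new C++ type can be modeled roughly as an optional union. This is only an analogy: `DistinctCount` and `describe_distinct_count` are hypothetical names, and no PyArrow API is implied.

```python
from typing import Optional, Union

# Rough analogue of std::optional<std::variant<int64_t, double>>:
# None = not available, int = exact count, float = approximate count.
DistinctCount = Optional[Union[int, float]]


def describe_distinct_count(value: DistinctCount) -> str:
    """Dispatch on the alternative that is actually held."""
    if value is None:
        return "unknown"
    if isinstance(value, int):
        return f"exact: {value}"
    return f"approximate: {value}"


exact = describe_distinct_count(42)
approximate = describe_distinct_count(41.5)
unknown = describe_distinct_count(None)
```

Callers that previously assumed an integer now have to handle both alternatives, which is the user-facing part of the change.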

* GitHub Issue: apache#47101

Lead-authored-by: Arash Andishgar <arashandishgar1@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…nt WriteRecordBatch (apache#47129)

### Rationale for this change

The throttle is accessed twice: once in `Acquire` and again through the returned future. As a result, `current_value_` may not be increased because the throttle is applied, and shortly afterwards the returned future may become finished. That leads to the issue described in apache#47124:
https://github.com/apache/arrow/blob/c8fe26898ce49c58514f511be58afddce176826b/cpp/src/arrow/dataset/dataset_writer.cc#L682-L684

### What changes are included in this PR?

Change the throttle API to return an optional (akin to [ThrottledAsyncTaskScheduler::Throttle](https://github.com/gitmodimo/arrow/blob/3ebe7ee1828793d0a619bcd773eb4d990ccb6b3c/cpp/src/arrow/util/async_util.h#L243)) and prevent the race.
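
The idea of returning an optional can be sketched in Python. This is a hypothetical model, not the C++ code in this PR: the `Throttle` class, its method names, and the use of `concurrent.futures.Future` are assumptions. The point it illustrates is that the decision (acquired or not) and the handle to wait on come from a single call made under one lock, so the caller never has to inspect the throttle a second time.

```python
import threading
from concurrent.futures import Future
from typing import Optional


class Throttle:
    """Hypothetical throttle; not Arrow's C++ implementation."""

    def __init__(self, max_value: int):
        self._max_value = max_value
        self._current_value = 0
        self._lock = threading.Lock()
        self._backoff: Optional[Future] = None

    def try_acquire(self, amount: int) -> Optional[Future]:
        """Return None if acquired, else a future to wait on before retrying."""
        with self._lock:
            if self._current_value + amount <= self._max_value:
                self._current_value += amount
                return None
            if self._backoff is None:
                self._backoff = Future()
            return self._backoff

    def release(self, amount: int) -> None:
        with self._lock:
            self._current_value -= amount
            backoff, self._backoff = self._backoff, None
        if backoff is not None:
            backoff.set_result(None)  # wake waiters outside the lock


throttle = Throttle(max_value=2)
first = throttle.try_acquire(2)   # fits: nothing to wait on
second = throttle.try_acquire(1)  # over capacity: a future is returned
throttle.release(2)               # frees capacity and completes the future
third = throttle.try_acquire(1)   # fits after the release
```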

### Are these changes tested?

Yes

### Are there any user-facing changes?

No

* GitHub Issue: apache#47124

Lead-authored-by: Rafał Hibner <rafal.hibner@secom.com.pl>
Co-authored-by: gitmodimo <g.modimo@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Rossi Sun <zanmato1984@gmail.com>
apache#47017)

### Rationale for this change
Fix the conversion of negative timestamps from Arrow to date structs when using Flight SQL ODBC.

### Are these changes tested?
Yes

### Are there any user-facing changes?
No

* GitHub Issue: apache#47016

Authored-by: Igor Antropov <igor.antropov@dremio.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
### Rationale for this change

Add the previous version of the R package to the backward compatibility job so we can ensure we don't make breaking changes.

### What changes are included in this PR?

Add 21.0.0 to CI job

### Are these changes tested?

No

### Are there any user-facing changes?

No

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
@BwL1289 BwL1289 merged commit 670eb79 into main Aug 11, 2025
36 of 77 checks passed
@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}
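
The two title formats above can be checked mechanically. The sketch below is an illustration under stated assumptions, not a tool from the Arrow repository: `TITLE_PATTERN` and `is_valid_title` are hypothetical names, the issue ID is assumed to be numeric, and one or more bracketed components are allowed.

```python
import re

# "GH-<number>: [Component] Summary" or "MINOR: [Component] Summary".
# Allowing several bracketed components in a row is an assumption.
TITLE_PATTERN = re.compile(r"^(GH-\d+|MINOR): (\[[^\]]+\])+ \S.*$")


def is_valid_title(title: str) -> bool:
    """Return True if the title matches one of the two formats."""
    return TITLE_PATTERN.match(title) is not None


with_issue = is_valid_title("GH-46605: [CI] Fix dotnet download URL")
minor = is_valid_title("MINOR: [Docs] Fix a typo")
invalid = is_valid_title("Fix a typo")
```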

See also:
