GH-721: Allow using 1GB+ data buffers in variable width vectors #722
Merged
lidavidm merged 4 commits into apache:main, Apr 22, 2025
Allow variable width vectors to actually reach MAX_BUFFER_SIZE when reallocating, instead of exceeding it when calculating the next power of 2.
lidavidm
reviewed
Apr 22, 2025
Member
Is it possible to write a test for this?
Contributor
Author
I think we already have tests related to reallocation, and this change does not alter the contract of the related methods. Do you think we still need to test that we are allocating between 1GB and 2GB? BTW, do you know what the actual failure in "Ensure PR format" is about?
Member
You can ignore that check for now. I ask because allocating between 1 and 2 GB previously didn't work, so a regression test would be useful.
Contributor
Author
@lidavidm I added a unit test. Let's see whether decreasing the default max allocation size by 1 byte causes any harm in the other tests.
Contributor
Author
Thank you, @lidavidm!
dongjoon-hyun
pushed a commit
to apache/spark
that referenced
this pull request
May 15, 2025
### What changes were proposed in this pull request?
This PR aims to upgrade `arrow-java` from 18.2.0 to 18.3.0.

### Why are the changes needed?
The new version brings some bug fixes, like:
- apache/arrow-java#627
- apache/arrow-java#654
- apache/arrow-java#656
- apache/arrow-java#693
- apache/arrow-java#705
- apache/arrow-java#707
- apache/arrow-java#722

In addition, the new version introduces a cascading upgrade for flatbuffers-java ([from 24.3.25 to 25.1.24](apache/arrow-java#600)).

The full release notes are as follows:
- https://github.com/apache/arrow-java/releases/tag/v18.3.0

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50892 from LuciferYang/arrow-java-18.3.0.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
yhuang-db
pushed a commit
to yhuang-db/spark
that referenced
this pull request
Jun 9, 2025
timhurskidremio added 7 commits to timhurskidremio/dremio-arrow that referenced this pull request (5 on Nov 3, 2025; 2 on Nov 10, 2025)
timhurskidremio added 4 commits to dremio/arrow that referenced this pull request (Nov 10, 2025)
timhurskidremio
pushed a commit
to timhurskidremio/dremio-arrow-java
that referenced
this pull request
Dec 5, 2025
…apache#722)

## What's Changed
Allow variable width vectors to actually reach MAX_BUFFER_SIZE when reallocating, instead of exceeding it when calculating the next power of 2.

For unit testing, the maximum allocation size has been increased to 2MB - 1 byte to simulate the default maximum behavior. This change required some updates to existing unit tests because of the round-ups used when calculating the required buffer sizes.

Closes apache#721.
jbonofre
pushed a commit
that referenced
this pull request
Jan 12, 2026
Bumps [org.apache.commons:commons-text](https://github.com/apache/commons-text) from 1.13.1 to 1.15.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
What's Changed
Allow variable width vectors to actually reach MAX_BUFFER_SIZE when reallocating, instead of exceeding it when calculating the next power of 2.
For unit testing, the maximum allocation size has been increased to 2MB - 1 byte to simulate the default maximum behavior. This change required some updates to existing unit tests because of the round-ups used when calculating the required buffer sizes.
Closes #721.
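
The rounding problem described above can be illustrated with a minimal, standalone sketch. This is not the arrow-java implementation: the class and method names (`CappedGrowthSketch`, `growUncapped`, `growCapped`) are hypothetical, and the cap of `Integer.MAX_VALUE` bytes is an assumption chosen to match the 1GB to 2GB range discussed in the review. It only shows why rounding a request just over 1GB up to the next power of 2 overshoots a cap that sits just below 2GB, and how clamping to the cap avoids that.

```java
final class CappedGrowthSketch {
    // Assumed cap for illustration only: 2^31 - 1 bytes (just under 2GB).
    static final long ASSUMED_MAX_BUFFER_SIZE = Integer.MAX_VALUE;

    // Smallest power of two that is >= value; value must be in [1, 2^62].
    static long nextPowerOfTwo(long value) {
        if (value < 1 || value > (1L << 62)) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        long highest = Long.highestOneBit(value);
        return highest == value ? value : highest << 1;
    }

    // Round up first, then reject the rounded size if it is over the cap.
    // A request that fits under the cap can still fail here.
    static long growUncapped(long desired, long cap) {
        long rounded = nextPowerOfTwo(desired);
        if (rounded > cap) {
            throw new IllegalStateException(
                "rounded size " + rounded + " exceeds cap " + cap);
        }
        return rounded;
    }

    // Reject only requests that truly exceed the cap; otherwise round up
    // and clamp the result to the cap.
    static long growCapped(long desired, long cap) {
        if (desired > cap) {
            throw new IllegalStateException(
                "requested size " + desired + " exceeds cap " + cap);
        }
        return Math.min(nextPowerOfTwo(desired), cap);
    }

    public static void main(String[] args) {
        long justOverOneGb = (1L << 30) + 1;

        try {
            growUncapped(justOverOneGb, ASSUMED_MAX_BUFFER_SIZE);
        } catch (IllegalStateException e) {
            System.out.println("uncapped growth rejected: " + e.getMessage());
        }

        // Clamped growth lands exactly on the cap: 2147483647
        System.out.println(growCapped(justOverOneGb, ASSUMED_MAX_BUFFER_SIZE));

        // Same effect with a small cap of 2MB - 1 byte: 2097151
        long smallCap = 2L * 1024 * 1024 - 1;
        System.out.println(growCapped((1L << 20) + 1, smallCap));
    }
}
```

The small cap in the last call mirrors the idea of a 2MB - 1 byte test limit: because the cap is one byte short of a power of 2, any request above half the cap exercises the clamping path without allocating gigabytes.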