@cloud-fan
Contributor

What changes were proposed in this pull request?

This is a surgical fix extracted from #49163.

The default string length limit introduced in Jackson 2.15 can be too small for certain workloads. This PR removes the limit to avoid any regression.
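To see which limit is in effect on a given classpath, the default can be read directly from Jackson. This is a minimal sketch, assuming Jackson 2.15 or later is available; the object name `JacksonStringLimit` is hypothetical and not part of Spark. The exact default value depends on the Jackson version, so the sketch only reads it.

```scala
import com.fasterxml.jackson.core.StreamReadConstraints

// Hypothetical helper (not part of Spark): reports the default cap that
// Jackson 2.15+ places on the length of a single JSON string value.
object JacksonStringLimit {
  def defaultMax: Int = StreamReadConstraints.defaults().getMaxStringLength

  def main(args: Array[String]): Unit =
    println(s"Default Jackson max string length: $defaultMax")
}
```

Any JSON string value longer than this default is rejected by the parser unless the constraints are overridden.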

Why are the changes needed?

To fix a regression caused by the default string length limit in Jackson 2.15.

Does this PR introduce any user-facing change?

Yes, users won't hit this size limitation anymore.

How was this patch tested?

#49163 tested it. We won't add a test in this PR, as generating a very large JSON document would make the CI unstable.

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the CORE label Aug 16, 2025
@cloud-fan
Contributor Author


// SPARK-49872: Remove jackson JSON string length limitation.
mapper.getFactory.setStreamReadConstraints(
  StreamReadConstraints.builder().maxStringLength(Int.MaxValue).build()
)
Member

Could you fix the compilation failure, @cloud-fan?

[error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/util/JsonProtocol.scala:71:5: not found: value StreamReadConstraints
[error]     StreamReadConstraints.builder().maxStringLength(Int.MaxValue).build()
[error]     ^
[error] one error found
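A `not found: value StreamReadConstraints` error of this kind usually means the class was never imported. The following is a minimal, self-contained sketch of the same call with the import in place, assuming Jackson 2.15 or later on the classpath. The object name `UnlimitedStringMapper` and its `create` method are hypothetical and are not the actual `JsonProtocol` code.

```scala
import com.fasterxml.jackson.core.StreamReadConstraints
import com.fasterxml.jackson.databind.ObjectMapper

// Hypothetical helper (not Spark's JsonProtocol): builds an ObjectMapper whose
// underlying JsonFactory accepts JSON string values up to Int.MaxValue in length.
object UnlimitedStringMapper {
  def create(): ObjectMapper = {
    val mapper = new ObjectMapper()
    // Replace the default constraints on the mapper's factory so that parsers
    // created afterwards use the raised string length limit.
    mapper.getFactory.setStreamReadConstraints(
      StreamReadConstraints.builder().maxStringLength(Int.MaxValue).build()
    )
    mapper
  }
}
```

The constraints are set on the mapper's factory rather than on individual parsers, so every parser the mapper creates afterwards picks up the new limit.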

@cloud-fan cloud-fan changed the title from "[SPARK-49872][CORE] Remove jackson JSON string length limitation." to "[SPARK-49872][CORE] Remove jackson JSON string length limitation" Aug 18, 2025
@cloud-fan
Contributor Author

The Docker failure is unrelated. Thanks for the review; merging to master/4.0/3.5!

@cloud-fan cloud-fan closed this in 076618a Aug 19, 2025
cloud-fan added a commit that referenced this pull request Aug 19, 2025
Closes #52049 from cloud-fan/json.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 076618a)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan added a commit that referenced this pull request Aug 19, 2025
Closes #52049 from cloud-fan/json.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 076618a)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…tps://github.corp.ebay.com/carmel/ebay-spark/pull/863 (apache#22)

Closes apache#52049 from cloud-fan/json.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>


(cherry picked from commit 076618a)

Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
Closes apache#52049 from cloud-fan/json.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit a76c1a4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

3 participants