Skip to content

Conversation

danielogh
Copy link
Contributor

@danielogh danielogh commented Aug 8, 2025

This PR addresses a bug in the stringopts phase. During string concatenation, repeated stacking of concatenations can lead to excessive compilation resource use and generation of questionable code as the merging of two StringBuilder-append-toString links sc1 and sc2 can result in a new StringBuilder with the size sc1->num_arguments() * sc2->num_arguments().

In the attached test, the size of the successively merged StringBuilder doubles on each merge -- there's 24 of them -- as the toString result of the first component is used twice in the second component [1], etc. Not only does the compiler hang on this test case, but the string concat optimization seems to give an arbitrary amount of back-to-back stores in the generated code depending on the number of stacked concatenations.

The proposed solution is to put an upper bound on the size of a merged concatenation, which guards against this case of repeated concatenations on the same string variable, and potentially other edge cases. 100 seems like a generous limit, and higher limits could be insufficient as each argument corresponds to about 20 new nodes later in replace_string_concat [2].

[1]

for (int y = 0; y < other->num_arguments(); y++) {

[2]

for (int argi = 0; argi < sc->num_arguments(); argi++) {

Testing: T1-4.

Extra testing: verified that no method in T1-4 is being compiled with a merged concat candidate exceeding the suggested limit of 100 aguments, regardless of whether or not the later checks verify_control_flow() and verify_mem_flow pass.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8362394: C2: Repeated stacked string concatenation fails with "Hit MemLimit" and other resourcing errors (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26685/head:pull/26685
$ git checkout pull/26685

Update a local copy of the PR:
$ git checkout pull/26685
$ git pull https://git.openjdk.org/jdk.git pull/26685/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26685

View PR using the GUI difftool:
$ git pr show -t 26685

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26685.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 8, 2025

👋 Welcome back dskantz! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Aug 8, 2025

@danielogh This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8362394: C2: Repeated stacked string concatenation fails with "Hit MemLimit" and other resourcing errors

Reviewed-by: rcastanedalo

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 345 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Aug 8, 2025

@danielogh The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review labels Aug 8, 2025
@mlbridge
Copy link

mlbridge bot commented Aug 8, 2025

Webrevs

// sc->num_arguments() * other->num_arguments(),
// which is a problem in the case of repeated stacked concats.
// Put a limit at 100 arguments to guard against excessive resource use.
bool n_args_is_bounded = merged->num_arguments() < 100;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check can be done in the merge method instead. It's also possible to increase the bound but it will need a live node check later in replace_string_concat.

Copy link
Contributor

@robcasloz robcasloz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, Daniel! I have a few comments, mostly about the test.

// merge(other, arg) can return a concatenation of size
// sc->num_arguments() * other->num_arguments(),
// which is a problem in the case of repeated stacked concats.
// Put a limit at 100 arguments to guard against excessive resource use.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you extract this limit into a constant and give it a meaningful name?

Comment on lines 40 to 42
for (int i = 0; i < 10; i++) {
String s = f(" ");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified into:

Suggested change
for (int i = 0; i < 10; i++) {
String s = f(" ");
}
new StringBuilder(); // Trigger loading of the StringBuilder class.
String s = f(" ");

Then you can narrow down the CompileOnly command to -XX:CompileOnly=compiler.stringopts.TestStackedConcatsMany::f.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to assert that f(" ") returns the expected result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider running the test with -XX:-OptoScheduling, as an alternative to problem listing? This should simply avoid the problematic assertion and would be more robust w.r.t. other platforms.

* @test
* @bug 8357105
* @summary Test that repeated stacked string concatenations do not
* consume too many compilation resources.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reasonable way to enhance the test to validate excessive resources? I'm not sure if the following example would work, but I'm wondering if there is something that can be measured deterministically. E.g. before with the given test there would be ~N IR nodes produced but now it would be a max of ~M, assuming that M is deterministically smaller than N.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a 80000 node limit by default and maybe the test could use a lower limit by specifying a value for the MaxNodeLimit flag. There is also the IR framework that can check for node counts for individual nodes.

Without the fix, the test currently gets a MemLimit assert in debug runs for consuming 1GB of memory as it is building up the _arguments arrays. The high number of IR nodes is created later in replace_string_concat if we get that far without timing out or reaching the memory limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default node limit, memory limit and timeout for product runs may be enough to test the fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think, @galderz ? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I had forgotten to 👍 before, all good with your reply

Copy link
Contributor

@robcasloz robcasloz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick study of C2 compile time vs. arguments_appended by running TestStackedConcatsMany with different number of s = new StringBuilder().append(s).append(s).toString(); lines, and got the following results in my machine:

image

The plot suggests to me that the current limit of 100 could be relaxed without risking a resource usage explosion. 256 seems to me like a better balance in terms of enabling the optimization in more scenarios while still being far away from the explosion point.

I also have a fewer follow-up comments and requests, mostly about style.

@@ -295,6 +295,9 @@ StringConcat* StringConcat::merge(StringConcat* other, Node* arg) {
}
assert(result->_control.contains(other->_end), "what?");
assert(result->_control.contains(_begin), "what?");

const int concat_argument_upper_bound = 100;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for lifting this to a named constant. For consistency with the style guide, I think it would be better to use upper case or mixed case. Consider also making this constant a static member of StringConcat. Finally, you could also declare both the constant and arguments_appended as uint.

* @summary Test that repeated stacked string concatenations do not
* consume too many compilation resources.
* @requires vm.compiler2.enabled
* @run main/othervm -XX:-OptoScheduling compiler.stringopts.TestStackedConcatsMany
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining why -XX:-OptoScheduling is used.


public static void main (String... args) {
new StringBuilder(); // Trigger loading of the StringBuilder class.
String s = f(); // warmup call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this warm-up call? Isn't it enough with calling f once?

Comment on lines 48 to 50
if (!(s.equals(z))) {
throw new RuntimeException("wrong result.");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using jdk.test.lib.Asserts.assertEQ is preferable here.

Copy link
Contributor

@robcasloz robcasloz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments, Daniel! Please re-test to ensure the new limit is OK on all Oracle's internal test configurations.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 12, 2025
@danielogh
Copy link
Contributor Author

danielogh commented Aug 18, 2025

Thanks @robcasloz for the review. Great to increase the upper bound with compilation time measurements. Tests still look good to me.

// Check if this concatenation would result in an excessive number of arguments
// -- leading to high memory use, compilation time, and later, a large number of IR nodes
// -- and bail out in that case.
if (STACKED_CONCAT_UPPER_BOUND < arguments_appended) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just me, or is it easier to read with the limit on the right?

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Aug 21, 2025
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 21, 2025
@danielogh
Copy link
Contributor Author

A comment to keep PR active.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

4 participants