Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8342040: Further improve entry lookup performance for multi-release JARs #21489

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

eirbjo
Copy link
Contributor

@eirbjo eirbjo commented Oct 14, 2024

Please review this PR which speeds up JarFile::getEntry lookup significantly for multi-release JAR files.

The changes in this PR are motivated by the following insights:

  • META-INF/versions/ is sparsely populated.
  • Most entries are not versioned
  • The number of unique versions for each versioned entry is small
  • Many JAR files are 'accidentally' multi-release; they use the feature to hide module-info.class from Java 8.

Instead of performing one lookup for every version identified in the JAR, this PR narrows the version search down to an approximation of the number of versions found for the entry being looked up, which will often be zero. This speeds up lookup for non-versioned entries, and provides a more targeted search for versioned entries.

An alternative approach could be to normalize the hash code to use the none-versioned name such that versioned and non-versioned names would be resolved in the same lookup. This was quickly abandoned since the code changes were intrusive and mixed too many JAR specific concerns into ZipFile.

Testing: The existing JarFileGetEntry benchmark is updated to optionally test a multi-release JAR file with one versioned entry for module-info.class plus two other versioned class files for two distinct versions. Performance results in first comment.

Running ZipFileOpen on a multi-release JAR did not show a significat difference between this PR and mainline.

The JAR and ZIP tests are run locally. GHA results green. The noreg-perf label is added in JBS.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8342040: Further improve entry lookup performance for multi-release JARs (Enhancement - P4)

Reviewers

Contributors

  • Claes Redestad <redestad@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21489/head:pull/21489
$ git checkout pull/21489

Update a local copy of the PR:
$ git checkout pull/21489
$ git pull https://git.openjdk.org/jdk.git pull/21489/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21489

View PR using the GUI difftool:
$ git pr show -t 21489

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21489.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 14, 2024

👋 Welcome back eirbjo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 14, 2024

@eirbjo This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8342040: Further improve entry lookup performance for multi-release JARs

Co-authored-by: Claes Redestad <redestad@openjdk.org>
Reviewed-by: redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 149 new commits pushed to the master branch:

  • 28252bb: 8341444: Unnecessary check for JSRs in CDS
  • 8174cbd: 8341978: Improve JButton/bug4490179.java
  • 9201e9f: 8342409: [s390x] C1 unwind_handler fails to unlock synchronized methods with LM_MONITOR
  • 0963b9e: 8341664: ReferenceClassDescImpl cache internalName
  • c51a086: 8339694: ciTypeFlow does not correctly handle unresolved constant dynamic of array type
  • 7f4ed50: 8341020: Error handler crashes when Metaspace is not fully initialized
  • f50bd0d: 8341513: Remove the unused thread_type field from OSThread
  • 7a16906: 8341134: Deprecate for removal the jrunscript tool
  • ffe6091: 8173970: jar tool should have a way to extract to a directory
  • 2b03dbd: 8311530: Deprecate jdk.jsobject module for removal
  • ... and 139 more: https://git.openjdk.org/jdk/compare/3fba1702cd8dc817b11bfa51077c41424d289281...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Oct 14, 2024

@eirbjo The following labels will be automatically applied to this pull request:

  • core-libs
  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added security security-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Oct 14, 2024
@eirbjo
Copy link
Contributor Author

eirbjo commented Oct 14, 2024

Performance results:

Baseline:

Benchmark                              (mr)  (size)  Mode  Cnt   Score   Error  Units
JarFileGetEntry.getEntryHit           false    1024  avgt   15   64.619 ? 6.633  ns/op
JarFileGetEntry.getEntryHit            true    1024  avgt   15  301.770 ? 4.819  ns/op
JarFileGetEntry.getEntryHitUncached   false    1024  avgt   15   82.030 ? 2.057  ns/op
JarFileGetEntry.getEntryHitUncached    true    1024  avgt   15  327.966 ? 3.092  ns/op
JarFileGetEntry.getEntryMiss          false    1024  avgt   15   27.937 ? 0.982  ns/op
JarFileGetEntry.getEntryMiss           true    1024  avgt   15  267.280 ? 2.196  ns/op
JarFileGetEntry.getEntryMissUncached  false    1024  avgt   15   53.214 ? 1.085  ns/op
JarFileGetEntry.getEntryMissUncached   true    1024  avgt   15  290.904 ? 2.359  ns/op

PR:

Benchmark                              (mr)  (size)  Mode  Cnt   Score   Error  Units
JarFileGetEntry.getEntryHit           false    1024  avgt   15  63.311 ? 3.014  ns/op
JarFileGetEntry.getEntryHit            true    1024  avgt   15  71.333 ? 1.510  ns/op
JarFileGetEntry.getEntryHitUncached   false    1024  avgt   15  80.401 ? 0.516  ns/op
JarFileGetEntry.getEntryHitUncached    true    1024  avgt   15  94.810 ? 1.136  ns/op
JarFileGetEntry.getEntryMiss          false    1024  avgt   15  26.717 ? 0.276  ns/op
JarFileGetEntry.getEntryMiss           true    1024  avgt   15  45.677 ? 0.412  ns/op
JarFileGetEntry.getEntryMissUncached  false    1024  avgt   15  53.122 ? 0.954  ns/op
JarFileGetEntry.getEntryMissUncached   true    1024  avgt   15  66.583 ? 0.425  ns/op

@eirbjo eirbjo marked this pull request as ready for review October 14, 2024 14:33
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 14, 2024
@mlbridge
Copy link

mlbridge bot commented Oct 14, 2024

Webrevs

if (metaVersionsMap != null) {
metaVersions = new HashMap<>();
for (var entry : metaVersionsMap.entrySet()) {
// Convert TreeSet<Integer> to int[] for performance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you just require order upon final freezing, you can just use a HashSet or ArrayList, and perform the sorting and deduplication when you compile to an int array.

Copy link
Contributor Author

@eirbjo eirbjo Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we trust that versioned entries are rare, this should not matter much either way. But I pushed a commit which uses HashSet and Arrays::sort on freezing instead. WDYT?

Given that most versioned entries will only have a single version, we may save some memory and speed by special-casing for sets with one or two versions:

switch (metaVersionsMap.get(name)) {
    case null -> {
        // Special-case for set of size 1
        metaVersionsMap.put(name, Set.of(version));
    }
    case Set<Integer> versions when versions.size() == 1 -> {
        // Special-case for set of size 2
        metaVersionsMap.put(name, Set.of(version, versions.iterator().next()));
    }
    case Set<Integer> versions when versions.size() == 2 -> {
        // Now we convert to HashSet
        HashSet<Integer> set = new HashSet<>(versions);
        set.add(version);
        metaVersionsMap.put(name, set);
    }
    case HashSet<Integer> versions -> {
        // Just add
        versions.add(version);
    }
    default -> throw new AssertionError("Illegal state");
}

But again, versioned entries are rare, so perhaps better to keep it simple. Does @cl4es have thoughts about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like simple.

Mixing Set.of and HashSet means a lot of semantic fu-fu (throwing/not throwing on nulls) and risks being suboptimal due making some call sites megamorphic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks good. And I agree with Claes' evaluation that we should not complicate this construction further, especially that the result is quickly garbage collected when we compile into the string to int array map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just use Map<String, int[]>. This would allow us to skip the whole Set<Integer> to int[] transition step, and the temporary HashMap is no longer needed.

Deduplication could occur during registration:

metaVersions.merge(name, new int[] {version}, (current, val) -> {
    // Check duplicates
    for (int v : current) {
        if (v == version) {
            return current;
        }
    }
    // Add version
    int[] merged = Arrays.copyOf(current, current.length + 1);
    merged[merged.length - 1] = version;
    return merged;
});

while the postprocessing step would simply sort each array:

for (int[] versions : metaVersions.values()) {
    // JarFile::getVersionedEntry expects sorted versions
    Arrays.sort(versions);
}

This performs ~10% better on a version-heavy ZipFileOpen, and I think the code is reasonably simple compared to the current state of the PR. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect remaining regression is mostly from the added String creation/processing in initCEN.

Yes, there is a cost with the String construction when parsing the name out of META-INF/versions/{name}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few possible strategies for avoiding that additional parse, since the effect we're getting at here is to have a quick filter to avoid pointless lookups and not necessarily an exact mapping.

One is to store the checkedHash rather than the full String. This gets openCloseZipFile down to ~910000 ns/op. (checkedHash very hot in profiles). There's still a chance for redundant lookups on hash collisions, but this should be rare.

Another is to store a BitSet per name length. This gets ZipFileOpen down to baseline level (~670000 ns/op), but increases chance of having to do redundant lookups a lot.

Both also improves footprint (not keeping each versioned entry String in memory would be nice).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One is to store the checkedHash rather than the full String. This gets openCloseZipFile down to ~910000 ns/op. (checkedHash very hot in profiles). There's still a chance for redundant lookups on hash collisions, but this should be rare.

Seems like a resonable trade-off. Could you take a look at the latest 771488e and see if that represents your suggestion here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, 771488e matches my quick experiment (sans some cleanups you've made). I agree this variant makes for a reasonable trade-off.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there's some benefit in detecting "META-INF/versions/" in the checkAndAddEntry method and then calculate the two different hash values back to back. Hash calculations are fast on modern hardware with vector support, but the memory accesses might not be so we might get a general win from re-arranging some of this stuff. I think this is more suitable for a follow-up experiment, though.

@eirbjo
Copy link
Contributor Author

eirbjo commented Oct 18, 2024

Marking this PR as draft while investigating alternatives as proposed by @cl4es

@eirbjo eirbjo marked this pull request as draft October 18, 2024 09:17
@openjdk openjdk bot removed the rfr Pull request is ready for review label Oct 18, 2024
@openjdk
Copy link

openjdk bot commented Oct 18, 2024

⚠️ @eirbjo This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@eirbjo
Copy link
Contributor Author

eirbjo commented Oct 18, 2024

/contributor add cl4es

@openjdk
Copy link

openjdk bot commented Oct 18, 2024

@eirbjo cl4es was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

@eirbjo
Copy link
Contributor Author

eirbjo commented Oct 18, 2024

/contributor add @cl4es

@openjdk
Copy link

openjdk bot commented Oct 18, 2024

@eirbjo
Contributor Claes Redestad <redestad@openjdk.org> successfully added.

@eirbjo eirbjo marked this pull request as ready for review October 18, 2024 13:50
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 18, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 18, 2024
@eirbjo
Copy link
Contributor Author

eirbjo commented Oct 18, 2024

Bringing this back from draft mode after some feedback and exploration with @cl4es

The current version of this PR now uses a BitSet to represent versions. This speeds up version tracking in initCEN for the unusual case that there is a large number of versioned entries in the ZIP.

To summarize the performance benefits of this PR over baseline:

  • The lookup overhead of multi-release with a few versioned entries is reduced from 5X to 1.2X
  • For missing entries (very common in URLClassPath!), the overhead is reduced from 10X to 1.8X

Claes has approved this PR already, but since he contributed code and ideas to this PR, I'll let this linger until early next week to allow non-Scandinavian reviewers to have a look :-)

// Register version for this hash code
if (metaVersions == null)
metaVersions = new HashMap<>();
metaVersions.computeIfAbsent(hashCode, _ -> new BitSet()).set(version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this hashCode ever collide? Tried reading ZipCoder and can't rule that out. If yes, we need a fallback mechanism if this computeIfAbsent encountered an existing entry.

Copy link
Contributor Author

@eirbjo eirbjo Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it may collide. This is an overapproximation of the version set where a hash collision just results in an extra lookup being performed for a version entry which does not exist for that name. See the comment on the metaVersions field declaration.

This is a trade-off against the initial alternative proposed in this PR which was to map using the entry name string. That would minimize lookups, but it would also require allocating the String name and storing those strings in the map. So we decided a few extra lookups on hash collisions was an okay trade-off for faster parsing and less memory used.

In fact, if we a use constant hash code like '1', such that all hash codes collide, we get the behavior of the current mainline, where a search for any name will result in searches being performed for each version discovered in the ZIP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review security security-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

3 participants