Skip to content

Improve DeltaList behavior for large projects #335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: maven-compiler-plugin-3.x
Choose a base branch
from

Conversation

gsmet
Copy link

@gsmet gsmet commented Jun 3, 2025

This pull request targets the 3.x branch as I think this support has been rewritten in 4.

Here is the situation prior to this patch with a massive project (~40k source files):

Screenshot from 2025-06-03 17-17-35

7172 samples, 3.79%

The first commit keeps the ordering as is and make sure we write the file sorted and the outputs of DeltaList are sorted.
The behavior is exactly the same as before.

Here is the situation after the first commit:

Screenshot from 2025-06-03 17-17-27

78 samples, 0.04% - basically, the only thing left is the tiny thing you can see at the far right of the previous flame graph.

The second commit is a further optimization, which might not be useful and only sorts when displaying the output. The file is written unsorted.
It doesn't change much tbh so we might as well not include the second commit.
I did the work to be comprehensive so I thought I would present it.

Here is the situation after the second commit:

Screenshot from 2025-06-03 17-17-43

59 samples, 0.04%

Opening as a draft so that we decide what to do about the second commit: my advice would be to drop it but YMMV

BTW, it might be possible to optimize further but given this code is readable and give good results, I think it's good enough.

Fixes #334


To make clear that you license your contribution under
the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.

gsmet added 2 commits June 3, 2025 17:21
I have been experimenting with a large code base to try to improve
Quarkus build time and I stumbled upon the fact that DeltaList was very
inefficient in these cases.

In this commit, I keep the ordering even if not actually important. I
will push further optimizations in a second commit that we can discuss
further.

Fixes apache#334
It doesn't seem important that the file is sorted, and the output of
DeltaList is only sorted if we choose to display it.

Related to apache#334
@gsmet gsmet changed the title Improve delta list Improve DeltaList behavior for large projects Jun 3, 2025
@gsmet gsmet marked this pull request as draft June 3, 2025 15:32
Comment on lines +36 to +37
this.added = newList.stream().filter(i -> !oldList.contains(i)).collect(Collectors.toList());
this.removed = oldList.stream().filter(i -> !newList.contains(i)).collect(Collectors.toList());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to use HashSets and avoid copying then removing a lot of elements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be better to use the old for loop instead of streams?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could sure.

@gsmet
Copy link
Author

gsmet commented Jun 13, 2025

hey @jorsol , any chance you could have a look at this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants