Skip to content

Improve UI5 performance for large codebases #186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 3, 2025

Conversation

lcartey
Copy link
Contributor

@lcartey lcartey commented Apr 2, 2025

We observed high memory usage and slow evaluation on codebases with a large number of files and string literals.

0f554bd - addresses a join order problem when determining whether elements occur in the same web-app. We prefer to apply the inSameWebapp condition after computing all other conditions, to avoid a cross-product on data flow nodes in the same web app.

ed87aa9 - the BindingStringParser had performance issues on large string literals with a large number of tokens. This was because calculating token operations such as the next token, token containment and the first token created a cross-product on all tokens in the given string, before applying further conditions.

We avoid this problem by creating a tokenOrdering predicate that provides an ordering over all the tokens in the string. This is then used to implement the token operations more efficiently. This approach was complicated by the fact that we can have overlapping and contained tokens. Longer term, I think we would prefer to eliminate this overlap to simplify this code.

We observed this performance issue on a codebase with a large number of *.chunk.js files (which are generated by webpack code splitting). Unfortunately, these files contain a lot of large string literals containing JSON, which we currently pick up as a potential binding string. As a future improvement it would be better if we could exclude these from the binding string calculation all together, perhaps by using data flow to determine whether a string literal flows to a relevant API.

lcartey added 2 commits April 2, 2025 10:33
Applying inSameWebApp too early in the pipeline was causing a
cross-product on all data flow nodes in a WebApp. It's typically
better to apply the other conditions (e.g. name matching), then
filter on the same web-app as a final step.
Avoid creating a cross-product of tokens identified in a string.

Instead, create an ordering of tokens based on begin/end locations
and process those linearly to find the first/next/contains tokens.
@lcartey lcartey requested a review from jeongsoolee09 April 2, 2025 11:07
Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a question, but otherwise I think it's a brilliant solution to work around the joining issue!

 - Consider all tokens beginning at the same location as eligible
   to be the next token.
 - Implement strictContains to reflect previous behaviour.
@lcartey
Copy link
Contributor Author

lcartey commented Apr 2, 2025

I'm not very sure if getNextAfterEnd should be used in place of getNext. t.getNextAfterEnd finds a token which may come after a whitespace token that this token may contain, so this may give bar when token t's value is "foo bar", whereas the expected behavior of getNextSkippingWhitespace is to get the token that comes after the entire "foo bar".

@jeongsoolee09 Resurrecting your question, as it got lost.

getNextAfterEnd(..) should return the same set as the previous implementation of getNext(..). In both cases, it will return tokens which begin the character after the previous token ends.

The new implementation of getNext reports the next token according to the beginning position of the token. This will mean it reports tokens that are contained within the current token. That's not what we want for getNextSkippingWhitespace - we want to match the existing behaviour.


Note, I've also updated to address some minor inconsistencies with the previous implementation.

Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense now and completely understood. Thank you!

@jeongsoolee09 jeongsoolee09 merged commit 3886d28 into main Apr 3, 2025
5 checks passed
@jeongsoolee09 jeongsoolee09 deleted the lcartey/chunk-js-performance branch April 3, 2025 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants