-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Interval::is_single_point check for deep equality #8202
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Deletes a bunch of code and speeds up lowering time of local laplacian with 20 pyramid levels by ~2.5%
It was O(n) for n facts. This makes it O(log(n)) This was particularly bad for pipelines with lots of inputs or outputs, because those pipelines have lots of asserts, which make for lots of facts to substitute in. Speeds up lowering of local laplacian with 20 pyramid levels (which has only one input and one output) by 1.09x Speeds up lowering of the adams 2019 cost model training pipeline (lots of weight inputs and lots outputs due to derivatives) by 1.5x Speeds up resnet50 (tons of weight inputs) lowering by 7.3x!
…o abadams/faster_substitute_facts
…o abadams/faster_substitute_facts
…o abadams/faster_substitute_facts
Interval::is_single_point() used to only compare expressions by shallow equality to see if they are the same Expr object. However, bounds_of_expr_in_scope is really improved if it uses deep equality instead, so it has a prepass that goes over the provided scope, calls equal(min, max) on everything, and fixes up anything where deep equality is true but shallow equality. This prepass costs O(n) for n things in scope, regardless of how complex the expression being analyzed is. So if you ask for the bounds of '4' say in a context where there are lots of things in the scope, it's absurdly slow. We were doing this! BoxTouched calls bounds_of_expr_in_scope lots of times on small index Exprs within the same very large scope. It's better to just make Interval::is_single_point() check deep equality. This speeds up local laplacian lowering by 1.1x, and resnet50 lowering by 1.5x. There were also places where intervals that were a single point were diverging due to carelessly written code. E.g. the interval [40*8, 40*8], where both of those 40*8s are the same Mul node, was being simplified like this: interval.min = simplify(interval.min); interval.max = simplify(interval.max); Not only does this do double the simplification work it should, but it also caused something that was a single point to diverge into not being a single point, because the repeated constant-folding creates a new Expr. With the new is_single_point this matters a lot less, but even so, I centralized simplification of intervals into a single helper that doesn't do the pointless double-simplification for single points. Some of these shallowly-unequal but deeply-equal Intervals were being created in bounds inference itself after the prepass, which may have been generating suboptimal bounds. This change should fix that in addition to the compile-time benefits. Also added a simplify call in SkipStages because I noticed when it processed specializations it was creating things like (condition) || (!condition).
Ready for review now that #8200 is merged |
rootjalex
approved these changes
Apr 21, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm (and also apologies, I'm pretty sure I'm the one who added the code to clean up scope).
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Interval::is_single_point() used to only compare expressions by shallow
equality to see if they are the same Expr object.
However, bounds_of_expr_in_scope is really improved if it uses deep
equality instead, so it has a prepass that goes over the provided scope,
calls equal(min, max) on everything, and fixes up anything where deep
equality is true but shallow equality is not.
This prepass costs O(n) for n things in scope, regardless of how complex
the expression being analyzed is. So if you ask for the bounds of
4
say in a context where there are lots of things in the scope, it's
absurdly slow. We were doing this! BoxTouched calls
bounds_of_expr_in_scope lots of times on small index Exprs within the
same very large scope.
It's better to just make Interval::is_single_point() check deep
equality. This speeds up local laplacian lowering by 1.1x, and resnet50
lowering by 1.5x.
There were also places where intervals that were a single point were
diverging due to carelessly written code. E.g. the interval
[40*8, 40*8]
,where both of those 40*8s are the same Mul node, was being
simplified like this:
Not only does this do double the simplification work it should, but it
also caused something that was a single point to diverge into not being
a single point, because the repeated constant-folding creates a new
Expr. With the new is_single_point this matters a lot less, but even so,
I centralized simplification of intervals into a single helper that
doesn't do the pointless double-simplification for single points.
Some of these shallowly-unequal but deeply-equal Intervals were being
created in bounds inference itself after the prepass, which may have
been generating suboptimal bounds. This change should fix that in
addition to the compile-time benefits.
Also added a simplify call in SkipStages because I noticed when it
processed specializations it was creating things like (condition) ||
(!condition).
Built on #8200