-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Const loop partition fix #3966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Const loop partition fix #3966
Conversation
Bounds check can return interval neg_inf, 0 for conditions such as (1*128 + iter_var*8 + 7) < 130. The bound it finds is (122 - 128)/6. Here iter_var should not be able to take on value 0 since with that we are already violating the condition. But since bound check does div, which by default rounds to zero, we get zero. This diff fixes this for const int bounds when before division it is knowns that divisor and dividend are constants and if so it does floordiv, instead of div. This seems ok for condition on iteration variables, however not clear if it should do floordiv everywhere else.
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
|
@tqchen. Can you please review this? I am not intimately familiar with entire codebase so not sure what other parts this may impact. Thanks! |
bound. Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
yinghai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tqchen The issue is legit. But I don't understand the code comments here: https://github.com/dmlc/tvm/blob/f3abb3d88f3a1145a4454649b454fe0139f19bc9/src/arithmetic/bound_deducer.cc#L176-L183. Generally, why do we need a trick to consider target_is_non_neg instead of
if (sign_operand == kPositive) {
// do nothing
} else {
result_ -= 1;
}
?
|
I see because you don't actual know the sign of |
|
We are hitting exactly this problem: https://github.com/dmlc/tvm/blob/f3abb3d88f3a1145a4454649b454fe0139f19bc9/src/arithmetic/bound_deducer.cc#L175 https://gist.github.com/yinghai/b51d4c6322fe6406dd1f3ab54b6905b0 is an alternative fix in the same spirit of this PR. In general I'm not sure how to fix this in the user side (loop_partition.cc). |
|
Thanks for the PR, I will take a look this week. As a side note, I am pushing change to enable codegen for floordiv directly, which might simplify the logic of this PR, will send note if i manage to get something in this week. |
|
I am not sure if your changes to do codegen for floordiv will relate here. But, in this PR and in the other approach yinghai mentioned we do floordiv under specific conditions when the expression does not involve variable. If you have some expression like (n - 1)/3, I doubt we can replace this with floordiv because that changes the semantics of div. This expression in pure c will actually do div with 'round towards zero' from what I understand (correct me if I am wrong). |
|
superceded by #4025 thanks @kimishpatel @yinghai |
Partitioning was failing for the test added in this PR:
Originally the generated output is:
produce compute {
for (i0.outer.inner, 0, 16) {
compute[(i0.outer.inner8)] = max(placeholder[(i0.outer.inner8)], 0f)
compute[((i0.outer.inner8) + 1)] = max(placeholder[((i0.outer.inner8) + 1)], 0f)
compute[((i0.outer.inner8) + 2)] = max(placeholder[((i0.outer.inner8) + 2)], 0f)
compute[((i0.outer.inner8) + 3)] = max(placeholder[((i0.outer.inner8) + 3)], 0f)
compute[((i0.outer.inner8) + 4)] = max(placeholder[((i0.outer.inner8) + 4)], 0f)
compute[((i0.outer.inner8) + 5)] = max(placeholder[((i0.outer.inner8) + 5)], 0f)
compute[((i0.outer.inner8) + 6)] = max(placeholder[((i0.outer.inner8) + 6)], 0f)
compute[((i0.outer.inner8) + 7)] = max(placeholder[((i0.outer.inner8) + 7)], 0f)
}
compute[128] = max(placeholder[128], 0f)
compute[129] = max(placeholder[129], 0f)
compute[130] = max(placeholder[130], 0f)
compute[131] = max(placeholder[131], 0f)
compute[132] = max(placeholder[132], 0f)
compute[133] = max(placeholder[133], 0f)
compute[134] = max(placeholder[134], 0f)
compute[135] = max(placeholder[135], 0f)
}
Whereas it should be:
produce compute {
for (i0.outer.inner, 0, 16) {
compute[(i0.outer.inner8)] = max(placeholder[(i0.outer.inner8)], 0f)
compute[((i0.outer.inner8) + 1)] = max(placeholder[((i0.outer.inner8) + 1)], 0f)
compute[((i0.outer.inner8) + 2)] = max(placeholder[((i0.outer.inner8) + 2)], 0f)
compute[((i0.outer.inner8) + 3)] = max(placeholder[((i0.outer.inner8) + 3)], 0f)
compute[((i0.outer.inner8) + 4)] = max(placeholder[((i0.outer.inner8) + 4)], 0f)
compute[((i0.outer.inner8) + 5)] = max(placeholder[((i0.outer.inner8) + 5)], 0f)
compute[((i0.outer.inner8) + 6)] = max(placeholder[((i0.outer.inner8) + 6)], 0f)
compute[((i0.outer.inner8) + 7)] = max(placeholder[((i0.outer.inner8) + 7)], 0f)
}
compute[128] = max(placeholder[128], 0f)
compute[129] = max(placeholder[129], 0f)
}
Bounds check can return interval neg_inf, 0 for conditions such as
(1128 + iter_var*8 + 7) < 130. The bound it finds is
(122 - 128)/6. Here iter_var should not be able to take on value 0 since
with that we are already violating the condition. But since bound check
does div, which by default rounds to zero, we get zero.
This diff fixes this for const int bounds when before division it is
knowns that divisor and dividend are constants and if so it does
floordiv, instead of div. This seems ok for condition on iteration
variables, however not clear if it should do floordiv everywhere else.