Skip to content

Conversation

@kimishpatel
Copy link
Contributor

Partitioning was failing for the test added in this PR:

Originally the generated output is:
produce compute {
for (i0.outer.inner, 0, 16) {
compute[(i0.outer.inner8)] = max(placeholder[(i0.outer.inner8)], 0f)
compute[((i0.outer.inner8) + 1)] = max(placeholder[((i0.outer.inner8) + 1)], 0f)
compute[((i0.outer.inner8) + 2)] = max(placeholder[((i0.outer.inner8) + 2)], 0f)
compute[((i0.outer.inner8) + 3)] = max(placeholder[((i0.outer.inner8) + 3)], 0f)
compute[((i0.outer.inner8) + 4)] = max(placeholder[((i0.outer.inner8) + 4)], 0f)
compute[((i0.outer.inner8) + 5)] = max(placeholder[((i0.outer.inner8) + 5)], 0f)
compute[((i0.outer.inner8) + 6)] = max(placeholder[((i0.outer.inner8) + 6)], 0f)
compute[((i0.outer.inner8) + 7)] = max(placeholder[((i0.outer.inner8) + 7)], 0f)
}
compute[128] = max(placeholder[128], 0f)
compute[129] = max(placeholder[129], 0f)
compute[130] = max(placeholder[130], 0f)
compute[131] = max(placeholder[131], 0f)
compute[132] = max(placeholder[132], 0f)
compute[133] = max(placeholder[133], 0f)
compute[134] = max(placeholder[134], 0f)
compute[135] = max(placeholder[135], 0f)
}

Whereas it should be:

produce compute {
for (i0.outer.inner, 0, 16) {
compute[(i0.outer.inner8)] = max(placeholder[(i0.outer.inner8)], 0f)
compute[((i0.outer.inner8) + 1)] = max(placeholder[((i0.outer.inner8) + 1)], 0f)
compute[((i0.outer.inner8) + 2)] = max(placeholder[((i0.outer.inner8) + 2)], 0f)
compute[((i0.outer.inner8) + 3)] = max(placeholder[((i0.outer.inner8) + 3)], 0f)
compute[((i0.outer.inner8) + 4)] = max(placeholder[((i0.outer.inner8) + 4)], 0f)
compute[((i0.outer.inner8) + 5)] = max(placeholder[((i0.outer.inner8) + 5)], 0f)
compute[((i0.outer.inner8) + 6)] = max(placeholder[((i0.outer.inner8) + 6)], 0f)
compute[((i0.outer.inner8) + 7)] = max(placeholder[((i0.outer.inner8) + 7)], 0f)
}
compute[128] = max(placeholder[128], 0f)
compute[129] = max(placeholder[129], 0f)
}

Bounds check can return interval neg_inf, 0 for conditions such as
(1128 + iter_var*8 + 7) < 130. The bound it finds is
(122 - 128)/6. Here iter_var should not be able to take on value 0 since
with that we are already violating the condition. But since bound check
does div, which by default rounds to zero, we get zero.
This diff fixes this for const int bounds when before division it is
knowns that divisor and dividend are constants and if so it does
floordiv, instead of div. This seems ok for condition on iteration
variables, however not clear if it should do floordiv everywhere else.

Bounds check can return interval neg_inf, 0 for conditions such as
(1*128 + iter_var*8 + 7) < 130. The bound it finds is
(122 - 128)/6. Here iter_var should not be able to take on value 0 since
with that we are already violating the condition. But since bound check
does div, which by default rounds to zero, we get zero.
This diff fixes this for const int bounds when before division it is
knowns that divisor and dividend are constants and if so it does
floordiv, instead of div. This seems ok for condition on iteration
variables, however not clear if it should do floordiv everywhere else.
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
@kimishpatel
Copy link
Contributor Author

@tqchen. Can you please review this? I am not intimately familiar with entire codebase so not sure what other parts this may impact. Thanks!

bound.

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Copy link

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tqchen The issue is legit. But I don't understand the code comments here: https://github.com/dmlc/tvm/blob/f3abb3d88f3a1145a4454649b454fe0139f19bc9/src/arithmetic/bound_deducer.cc#L176-L183. Generally, why do we need a trick to consider target_is_non_neg instead of

if (sign_operand == kPositive) {
    // do nothing
} else {
    result_ -= 1;
}

?

@yinghai
Copy link

yinghai commented Sep 18, 2019

I see because you don't actual know the sign of result_ at that time.

@yinghai
Copy link

yinghai commented Sep 19, 2019

We are hitting exactly this problem: https://github.com/dmlc/tvm/blob/f3abb3d88f3a1145a4454649b454fe0139f19bc9/src/arithmetic/bound_deducer.cc#L175

https://gist.github.com/yinghai/b51d4c6322fe6406dd1f3ab54b6905b0 is an alternative fix in the same spirit of this PR. In general I'm not sure how to fix this in the user side (loop_partition.cc).

@tqchen
Copy link
Member

tqchen commented Sep 19, 2019

Thanks for the PR, I will take a look this week. As a side note, I am pushing change to enable codegen for floordiv directly, which might simplify the logic of this PR, will send note if i manage to get something in this week.

@kimishpatel
Copy link
Contributor Author

I am not sure if your changes to do codegen for floordiv will relate here. But, in this PR and in the other approach yinghai mentioned we do floordiv under specific conditions when the expression does not involve variable. If you have some expression like (n - 1)/3, I doubt we can replace this with floordiv because that changes the semantics of div. This expression in pure c will actually do div with 'round towards zero' from what I understand (correct me if I am wrong).
So just wanted to point this out just in case.

@umangyadav
Copy link
Contributor

@kimishpatel @yinghai
#4025

@tqchen
Copy link
Member

tqchen commented Nov 15, 2019

superceded by #4025 thanks @kimishpatel @yinghai

@tqchen tqchen closed this Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants