Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong results with rfactor #7890

Closed
TH3CHARLie opened this issue Oct 10, 2023 · 1 comment · Fixed by #8086
Closed

Wrong results with rfactor #7890

TH3CHARLie opened this issue Oct 10, 2023 · 1 comment · Fixed by #8086
Assignees
Labels

Comments

@TH3CHARLie
Copy link
Contributor

A repro

#include "Halide.h"
#include <iostream>
using namespace Halide;

int main() {
    Func input("input");
    Func local_sum("local_sum");
    Func blurry("blurry");
    Var x("x"), y("y");
    RVar yryf;
    input(x, y) = 2 * x + 5 * y;
    RDom r(-2, 5, -2, 5, "rdom_r");
    local_sum(x, y) = 0;
    local_sum(x, y) += input(x + r.x, y + r.y);
    blurry(x, y) = cast<int32_t>(local_sum(x, y) / 25);
    Buffer<int> buf1, buf2;
    {
        Pipeline p({blurry});
        buf1 = p.realize({128, 128});
    }
    {
        Var yo, yi, xo, xi, u;
        blurry.split(y, yo, yi, 2, TailStrategy::Auto);
        local_sum.split(x, xo, xi, 4, TailStrategy::Auto);
        local_sum.update(0).split(x, xo, xi, 1, TailStrategy::Auto);
        local_sum.update(0).rfactor(r.x, u);
        blurry.store_root();
        local_sum.compute_root();
        Pipeline p({blurry});
        buf2 = p.realize({128, 128});
    }
    for (int i = 0; i < 128; ++i) {
        for (int j = 0; j < 128; ++j) {
            if (buf1(i, j) != buf2(i, j)) {
                std::cout << "Error incorrect result at i: " << i << " j: " << j << " expected: " << buf1(i, j) << " actual: " << buf2(i, j) << std::endl;
            }
        }
    }
    return 0;
}
@abadams
Copy link
Member

abadams commented Oct 10, 2023

Looks like an rfactor bug. I think there's an implicit assumption somewhere in the rfactor code that pure dims haven't been scheduled yet. I'm seeing this sort of thing:

 for (local_sum$1.s1.x, local_sum$1.s1.x.loop_min, local_sum$1.s1.x.loop_extent) {
   for (local_sum$1.s1.y, local_sum$1.s1.y.loop_min, local_sum$1.s1.y.loop_extent) {
    for (local_sum$1.s1.x.xo, local_sum$1.s1.x.xo.loop_min, local_sum$1.s1.x.xo.loop_extent) {
     let local_sum$1.s1.x.xi.base = (local_sum$1.s1.x.xo*1) + local_sum$1.s1.x.loop_min
     for (local_sum$1.s1.x.xi, local_sum$1.s1.x.xi.loop_min, local_sum$1.s1.x.xi.loop_extent) {
      let local_sum$1.s1.x = local_sum$1.s1.x.xi.base + local_sum$1.s1.x.xi

There shouldn't be a loop over local_sum$1.s1.x - it has been split. The LetStmt is supposed to be what defines local_sum$1.s1.x

@abadams abadams added the bug label Oct 10, 2023
@abadams abadams self-assigned this Oct 10, 2023
abadams added a commit that referenced this issue Feb 9, 2024
When you rfactor an update definition, the new update definition must
use all the pure vars of the Func, even though the one you're rfactoring
may not have used them all.

We also want to preserve any scheduling already done to the pure vars,
so we want to preserve the dims list and splits list from the original
definition.

The code accounted for this by checking the dims list for any missing
pure vars and adding them at the end (just before Var::outermost()), but
this didn't account for the fact that they may no longer exist in the
dims list due to splits that didn't reuse the outer name. In these
circumstances we could end up with too many pure loops. E.g. if x has
been split into xo and xi, then the code was adding a loop for x even
though there were already loops for xo and xi, which of course produces
garbage output.

This PR instead just checks which pure vars are actually used in the
update definition up front, and then uses that to tell which ones should
be added.

Fixes #7890
abadams added a commit that referenced this issue Feb 12, 2024
When you rfactor an update definition, the new update definition must
use all the pure vars of the Func, even though the one you're rfactoring
may not have used them all.

We also want to preserve any scheduling already done to the pure vars,
so we want to preserve the dims list and splits list from the original
definition.

The code accounted for this by checking the dims list for any missing
pure vars and adding them at the end (just before Var::outermost()), but
this didn't account for the fact that they may no longer exist in the
dims list due to splits that didn't reuse the outer name. In these
circumstances we could end up with too many pure loops. E.g. if x has
been split into xo and xi, then the code was adding a loop for x even
though there were already loops for xo and xi, which of course produces
garbage output.

This PR instead just checks which pure vars are actually used in the
update definition up front, and then uses that to tell which ones should
be added.

Fixes #7890
abadams added a commit that referenced this issue Feb 19, 2024
When you rfactor an update definition, the new update definition must
use all the pure vars of the Func, even though the one you're rfactoring
may not have used them all.

We also want to preserve any scheduling already done to the pure vars,
so we want to preserve the dims list and splits list from the original
definition.

The code accounted for this by checking the dims list for any missing
pure vars and adding them at the end (just before Var::outermost()), but
this didn't account for the fact that they may no longer exist in the
dims list due to splits that didn't reuse the outer name. In these
circumstances we could end up with too many pure loops. E.g. if x has
been split into xo and xi, then the code was adding a loop for x even
though there were already loops for xo and xi, which of course produces
garbage output.

This PR instead just checks which pure vars are actually used in the
update definition up front, and then uses that to tell which ones should
be added.

Fixes #7890
steven-johnson pushed a commit that referenced this issue Feb 19, 2024
* Fix rfactor adding too many pure loops (#8086)

When you rfactor an update definition, the new update definition must
use all the pure vars of the Func, even though the one you're rfactoring
may not have used them all.

We also want to preserve any scheduling already done to the pure vars,
so we want to preserve the dims list and splits list from the original
definition.

The code accounted for this by checking the dims list for any missing
pure vars and adding them at the end (just before Var::outermost()), but
this didn't account for the fact that they may no longer exist in the
dims list due to splits that didn't reuse the outer name. In these
circumstances we could end up with too many pure loops. E.g. if x has
been split into xo and xi, then the code was adding a loop for x even
though there were already loops for xo and xi, which of course produces
garbage output.

This PR instead just checks which pure vars are actually used in the
update definition up front, and then uses that to tell which ones should
be added.

Fixes #7890

* Forward the partition methods from generator outputs (#8090)

* Fix reduce_expr_modulo of vector in Solve.cpp (#8089)

* Fix reduce_expr_modulo of vector in Solve.cpp

* Fix test
ardier pushed a commit to ardier/Halide-mutation that referenced this issue Mar 3, 2024
When you rfactor an update definition, the new update definition must
use all the pure vars of the Func, even though the one you're rfactoring
may not have used them all.

We also want to preserve any scheduling already done to the pure vars,
so we want to preserve the dims list and splits list from the original
definition.

The code accounted for this by checking the dims list for any missing
pure vars and adding them at the end (just before Var::outermost()), but
this didn't account for the fact that they may no longer exist in the
dims list due to splits that didn't reuse the outer name. In these
circumstances we could end up with too many pure loops. E.g. if x has
been split into xo and xi, then the code was adding a loop for x even
though there were already loops for xo and xi, which of course produces
garbage output.

This PR instead just checks which pure vars are actually used in the
update definition up front, and then uses that to tell which ones should
be added.

Fixes halide#7890
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants