Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BugFix][Ansor] Fixing Ansor Gradient Bug #16739

Merged
merged 3 commits into from
Apr 1, 2024

Conversation

thaisacs
Copy link
Contributor

@thaisacs thaisacs commented Mar 19, 2024

When Ansor does not find a schedule for a task in warm up, Ansor gradient gets stuck in this task, because there is no optimized schedule for this task. Hence, Ansor does not optimize any task.

Behavior before correction

bug

Behavior after correction

bug1

@cbalint13
Copy link
Contributor

Cc @comaniac , @merrymercy , @vinx13 , @jcf94

@comaniac
Copy link
Contributor

This change looks a bit hacky and unsafe as you override the cost before calculating the gradients and recover it back afterwards. Intuitively if a task has no schedules, should we just mark it as a dead task and never consider it in the rest of the process?

Comment on lines 363 to 365
for task_idx in range(len(self.tasks)):
if(self.best_costs[task_idx] == 1e10):
self.dead_tasks.add(task_idx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for task_idx in range(len(self.tasks)):
if(self.best_costs[task_idx] == 1e10):
self.dead_tasks.add(task_idx)
for task_idx, cost in enumerate(self.best_costs):
if cost == 1e10:
self.dead_tasks.add(task_idx)

@@ -358,6 +358,12 @@ def tune(
self.best_ct = self.ct
self.best_score = self.cur_score

# put task without schedule on warm up to dead state
if self.strategy == "gradient":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this logic applies to all strategies instead of just gradient, so this condition may not be necessary. Could you help confirm?

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cbalint13
Copy link
Contributor

@thaisacs

Could please git force push to trigger tests from scratch, some tests failed for no reason.
I tried "@tvm-bot rerun" for you, but one particular (the hexagon) test still does not wanting to restart.

@comaniac
Copy link
Contributor

@tvm-bot rerun

@comaniac comaniac merged commit ffa9cfd into apache:main Apr 1, 2024
18 checks passed
@comaniac
Copy link
Contributor

comaniac commented Apr 1, 2024

Thanks @thaisacs

thaisacs added a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024
* Fixing ansor gradient bug

* Changing to dead_task

* Applying reviews
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants