[BugFix][Ansor] Fixing Ansor Gradient Bug #16739

thaisacs · 2024-03-19T03:48:55Z

When Ansor does not find a schedule for a task during warm-up, the gradient strategy gets stuck on that task, because the task has no optimized schedule. As a result, Ansor does not optimize any task.

Behavior before correction

Behavior after correction

cbalint13 · 2024-03-28T15:02:37Z

Cc @comaniac , @merrymercy , @vinx13 , @jcf94

comaniac · 2024-03-28T15:23:42Z

This change looks a bit hacky and unsafe, since you override the cost before calculating the gradients and restore it afterwards. Intuitively, if a task has no schedules, should we just mark it as a dead task and never consider it in the rest of the process?
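
A minimal sketch of the "dead task" idea, written outside TVM: once a task index is in a dead set, the selection step never returns it. `pick_next_task` and the gradient values are hypothetical and only illustrate the exclusion; the real scheduler's selection logic is more involved, and the assumption that an unscheduled task ends up with a dominating gradient is made here purely for the toy example.

```python
def pick_next_task(gradients, dead_tasks):
    """Return the index of the live task with the most negative gradient.

    Hypothetical helper for illustration; returns None when every task is dead.
    """
    live = [i for i in range(len(gradients)) if i not in dead_tasks]
    if not live:
        return None
    return min(live, key=lambda i: gradients[i])


# Assumption for this toy: task 1 has no schedule, so its gradient
# dominates the others and it would be selected in every round.
gradients = [-1.0, -1e10, -3.0]
stuck_choice = pick_next_task(gradients, dead_tasks=set())
fixed_choice = pick_next_task(gradients, dead_tasks={1})
print(stuck_choice, fixed_choice)
```

With task 1 excluded, the picker moves on to the remaining tasks instead of returning the same index on every round.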

comaniac · 2024-03-29T01:06:33Z

python/tvm/auto_scheduler/task_scheduler.py

+            for task_idx in range(len(self.tasks)):
+                if(self.best_costs[task_idx] == 1e10):
+                    self.dead_tasks.add(task_idx)


Suggested change

-            for task_idx in range(len(self.tasks)):
-                if(self.best_costs[task_idx] == 1e10):
-                    self.dead_tasks.add(task_idx)
+            for task_idx, cost in enumerate(self.best_costs):
+                if cost == 1e10:
+                    self.dead_tasks.add(task_idx)
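
As a quick sanity check outside TVM, the two loops in the suggested change can be compared on a plain list. The function names below are made up for the comparison, and `1e10` is the sentinel value used in the diff. The two versions agree only when `best_costs` has exactly one entry per task, which is assumed here.

```python
def dead_by_index(best_costs):
    """Original form: iterate over indices and look the cost up."""
    dead = set()
    for task_idx in range(len(best_costs)):
        if best_costs[task_idx] == 1e10:
            dead.add(task_idx)
    return dead


def dead_by_enumerate(best_costs):
    """Suggested form: iterate over (index, cost) pairs directly."""
    dead = set()
    for task_idx, cost in enumerate(best_costs):
        if cost == 1e10:
            dead.add(task_idx)
    return dead


# Tasks 1 and 3 never got a schedule, so their cost is still the sentinel.
costs = [0.004, 1e10, 0.02, 1e10]
assert dead_by_index(costs) == dead_by_enumerate(costs) == {1, 3}
```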

comaniac · 2024-03-29T01:08:39Z

python/tvm/auto_scheduler/task_scheduler.py

@@ -358,6 +358,12 @@ def tune(
        self.best_ct = self.ct
        self.best_score = self.cur_score

+        # put task without schedule on warm up to dead state
+        if self.strategy == "gradient":


I feel this logic applies to all strategies instead of just gradient, so this condition may not be necessary. Could you help confirm?
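
To illustrate why the check need not be tied to one strategy, here is a sketch of a round-robin picker that also skips dead tasks. `next_round_robin` is a hypothetical helper written for this example, not a function taken from `task_scheduler.py`.

```python
def next_round_robin(current_idx, num_tasks, dead_tasks):
    """Return the next live task index after current_idx, wrapping around.

    Returns None when every task is dead.
    """
    for step in range(1, num_tasks + 1):
        candidate = (current_idx + step) % num_tasks
        if candidate not in dead_tasks:
            return candidate
    return None


# Task 1 is dead, so the rotation goes 0 -> 2 -> 3 -> 0.
order = []
idx = 0
for _ in range(3):
    idx = next_round_robin(idx, num_tasks=4, dead_tasks={1})
    order.append(idx)
print(order)
```

A task that never produced a schedule would otherwise keep receiving trials under any strategy, so marking it dead once after warm-up covers both cases.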

comaniac

LGTM

cbalint13 · 2024-03-29T11:06:26Z

@thaisacs

Could you please force-push to trigger the tests from scratch? Some tests failed for no apparent reason.
I tried "@tvm-bot rerun" for you, but one particular test (the hexagon one) still does not want to restart.

comaniac · 2024-03-31T03:53:15Z

@tvm-bot rerun

comaniac · 2024-04-01T07:37:43Z

Thanks @thaisacs

* Fixing ansor gradient bug
* Changing to dead_task
* Applying reviews

Fixing ansor gradient bug

fcf8e03

Changing to dead_task

7b1a19c

thaisacs force-pushed the ansor-gradient-bug branch from 3a0ea0e to 7b1a19c on March 29, 2024 01:01

comaniac reviewed Mar 29, 2024

comaniac approved these changes Mar 29, 2024

thaisacs force-pushed the ansor-gradient-bug branch from bc003bf to c472a87 on March 29, 2024 03:04

Applying reviews

6c1c39a

thaisacs force-pushed the ansor-gradient-bug branch from c472a87 to 6c1c39a on March 29, 2024 13:51

comaniac merged commit ffa9cfd into apache:main Apr 1, 2024
18 checks passed

thaisacs added a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024

[BugFix][Ansor] Fixing Ansor Gradient Bug (apache#16739)

a11b5db

* Fixing ansor gradient bug
* Changing to dead_task
* Applying reviews

ysh329 mentioned this pull request Apr 21, 2024

[Release] v0.16.0 Release Candidate Notes #16911

Closed
