Hessian Trace Estimate Calculation exits prematurely due to negative value of diff_avg  #2155

Closed
@karam-nus

Description

I am trying out automatic mixed-precision QAT using the mobilenetv2 example.

nncf/torch/quantization/hessian_trace.py#L144C3-L144C3

            diff_avg = abs(mean_avg_total_trace - avg_total_trace) / (avg_total_trace + self._diff_eps)
            if diff_avg < tolerance:
                return mean_avg_traces_per_param

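The numerator is an absolute value, so diff_avg is negative only when the denominator (avg_total_trace + self._diff_eps) is negative, i.e. when avg_total_trace is negative. If that happens in the early stages of trace estimation, the check diff_avg < tolerance passes immediately and the function returns before the Hutchinson method has run for enough iterations to converge, so the estimated traces are not correct.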

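For example, with tolerance = 1e-06, diff_avg takes the following values by iteration number. The last entry, at iteration 6, is negative and therefore satisfies diff_avg < tolerance: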
"diff_avg": {
"1": 26366.833984375,
"2": 0.1541769653558731,
"3": 0.28552359342575073,
"4": 0.783305823802948,
"5": 4.321332931518555,
"6": -0.36715415120124817
},
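
The early exit can be reproduced with plain floats. The sketch below is not NNCF code: `relative_diff` is a hypothetical helper that copies the expression quoted above, and the input values are made up so that the denominator is negative.

```python
def relative_diff(mean_avg_total_trace, avg_total_trace, diff_eps=1e-6):
    # Same expression as the quoted check: only the numerator is wrapped in abs().
    return abs(mean_avg_total_trace - avg_total_trace) / (avg_total_trace + diff_eps)


tolerance = 1e-6

# Hypothetical values, chosen so that avg_total_trace is negative.
diff_avg = relative_diff(mean_avg_total_trace=0.5, avg_total_trace=-1.2)

# A negative ratio is always below a positive tolerance, so the check passes
# even though the two estimates differ by more than 100% in magnitude.
exits_early = diff_avg < tolerance
print(diff_avg, exits_early)
```
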

A correct approach is to use the absolute value of avg_total_trace in the denominator, as PyHessian does:

if abs(np.mean(trace_vhv) - trace) / (abs(trace) + 1e-6) < tol:
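
As a rough illustration of that check inside a Hutchinson loop, here is a minimal NumPy sketch. It is neither the NNCF nor the PyHessian implementation: `hutchinson_trace`, its parameters, and the small diagonal matrix are assumptions made for the example. The matrix is deliberately chosen with a negative trace, the case that triggers the early exit when the denominator is not wrapped in `abs()`.

```python
import numpy as np


def hutchinson_trace(matvec, dim, max_iter=100, tol=1e-3, eps=1e-6, seed=0):
    """Estimate trace(H) from products H @ v with random Rademacher vectors v.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    trace_vhv = []  # per-iteration samples of v^T H v
    trace = 0.0     # running mean from the previous iteration
    for iteration in range(1, max_iter + 1):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        trace_vhv.append(float(v @ matvec(v)))
        mean_trace = float(np.mean(trace_vhv))
        # abs() on the denominator keeps the relative change non-negative,
        # so a negative running trace cannot satisfy the tolerance by sign alone.
        if abs(mean_trace - trace) / (abs(trace) + eps) < tol:
            return mean_trace, iteration
        trace = mean_trace
    return trace, max_iter


# Toy symmetric matrix with a negative trace (-3 + 1 - 2 = -4).
H = np.diag([-3.0, 1.0, -2.0])
estimate, iterations = hutchinson_trace(lambda v: H @ v, dim=3)
print(estimate, iterations)
```

For a diagonal matrix every Rademacher sample equals the exact trace, so this toy case converges as soon as two consecutive running means agree. Real Hessians need many more iterations, which is why a sign-induced early exit distorts the estimate.
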
