Model violation statistics in the low-data regime

Creating this issue at the suggestion of @coreyostrove. CC @pcwysoc.

GST reports define model violation as $N_{\sigma} = (q - k) / \sqrt{2k}$, where $q$ is a $\chi_k^2$-distributed quantity derived from the model under test and the dataset used in model fitting. Normally, $k = n_d - n_p$ is the difference between the number of independent parameters in the dataset and the number of model parameters.
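
For concreteness, here is a minimal Python sketch of the formula above. The function name `n_sigma` and the example numbers are assumptions made for illustration; this is not pyGSTi's API, and the sketch rejects the $k \le 0$ case instead of guessing at a fallback.

```python
import math


def n_sigma(q: float, n_d: int, n_p: int) -> float:
    """Return N_sigma = (q - k) / sqrt(2k), with k = n_d - n_p.

    Hypothetical helper for illustration only; it is defined only for k > 0.
    """
    k = n_d - n_p
    if k <= 0:
        raise ValueError(f"k = n_d - n_p = {k} must be positive")
    # A chi^2_k variable has mean k and variance 2k, so this is the number
    # of standard deviations by which q exceeds its expected value.
    return (q - k) / math.sqrt(2 * k)


# Made-up numbers: 150 independent data parameters, 50 model parameters.
print(n_sigma(q=120.0, n_d=150, n_p=50))
```

A value near zero is consistent with the model, while a large positive value indicates model violation.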

This method for computing model violation breaks down when $n_d < n_p$, since $k$ is then negative and $\sqrt{2k}$ is not a real number. This can happen in early iterations of long-sequence GST with unusual experiment designs. For example, we've seen it when fitting a qutrit model to a qubit experiment design, and when using extremely aggressive fiducial pair reduction in multi-qubit GST. In these situations our reports evaluate $(q - k) / \sqrt{2k}$ at $k=1$, even though $q$ is not necessarily $\chi^2_1$-distributed.

We should figure out how to report model violation in these problematic situations. One option is to simply not report model violation for GST depths that have $n_d < n_p$.
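
The "don't report" option could look roughly like the sketch below. The names `n_sigma_or_none` and `format_n_sigma`, the `"N/A"` placeholder, and the per-depth numbers are all hypothetical; how a real report would render a missing value is left open. The sketch also treats $k = 0$ as unreportable, since the formula divides by zero there.

```python
import math
from typing import Optional


def n_sigma_or_none(q: float, n_d: int, n_p: int) -> Optional[float]:
    """Return N_sigma, or None when k = n_d - n_p is not positive.

    Hypothetical helper: it declines to report rather than falling back to k = 1.
    """
    k = n_d - n_p
    if k <= 0:
        return None
    return (q - k) / math.sqrt(2 * k)


def format_n_sigma(value: Optional[float]) -> str:
    """Render a table cell, using a placeholder for unreported depths."""
    return "N/A" if value is None else f"{value:.2f}"


# Made-up (depth, q, n_d, n_p) rows; the first depth has n_d < n_p.
rows = [(1, 35.0, 20, 31), (2, 80.0, 92, 31)]
for depth, q, n_d, n_p in rows:
    print(depth, format_n_sigma(n_sigma_or_none(q, n_d, n_p)))
```

Returning `None` keeps the decision about presentation (blank cell, footnote, warning) separate from the statistic itself.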

Anyone who has thoughts on this subject should feel free to chime in!


Model violation statistics in the low-data regime #633
