HG002 results suspiciously bad

I re-aligned the [HG002 30x PCR-free WGS BAM from Baid et al.](https://console.cloud.google.com/storage/browser/brain-genomics-public/research/sequencing/grch38/bam/novaseq/wgs_pcr_free/30x) against hs37d5 using DRAGEN.

I then used the BAM with WHAM to call SVs:

```
whamg -a hs37d5.fa -f HG002.bam -x 8 \
  | vcf-sort -c \
  | uniq \
  | bcftools norm -N -m-any -O z --write-index=tbi -o HG002.wham.vcf.gz
```

I then benchmarked against the GIAB v0.6 high-confidence callset using [Witty.er](https://github.com/Illumina/witty.er):

```
docker run --rm -v $(pwd):/data -w /data wittyer \
  -i HG002.wham.vcf.gz \
  -t HG002_SVs_Tier1_v0.6.vcf.gz \
  -b HG002_SVs_Tier1_v0.6.bed \
  -o HG002.wham \
  --em SimpleCounting \
  --if PASS
```

The F1 score is 0.01 at the event level and 0.14 at the base level. I suspect I'm doing something wrong, but I can't figure out what. I've used the same process to benchmark other SV callers and it works fine. Given the author of Witty.er, is it somehow biased against WHAM :)? Is there a different comparison tool and/or callset I should be using for evaluation?
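One sanity check that might help narrow this down is tallying the FILTER and SVTYPE values in the query VCF. The Witty.er command above passes `--if PASS`, which as far as I understand limits the comparison to records carrying that filter value. Below is a minimal sketch that uses only `gzip` and `awk`. `summarize_vcf` is a hypothetical helper name for illustration, not part of WHAM or Witty.er, and it assumes a tab-delimited, gzip- or bgzip-compressed VCF.

```shell
# Tally FILTER values and SVTYPE annotations in a gzip/bgzip-compressed VCF.
# Hypothetical helper for illustration; not part of WHAM or Witty.er.
summarize_vcf() {
  gzip -cd "$1" | awk -F'\t' '
    /^#/ { next }                       # skip header lines
    {
      filter[$7]++                      # column 7 is FILTER
      svtype = "missing"                # used when INFO has no SVTYPE key
      n = split($8, info, ";")          # column 8 is INFO
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^SVTYPE=/) svtype = substr(info[i], 8)
      type[svtype]++
    }
    END {
      for (f in filter) printf "FILTER\t%s\t%d\n", f, filter[f]
      for (t in type)   printf "SVTYPE\t%s\t%d\n", t, type[t]
    }' | sort
}
```

Running `summarize_vcf HG002.wham.vcf.gz` and `summarize_vcf HG002_SVs_Tier1_v0.6.vcf.gz` side by side should show whether the two files use comparable FILTER values and SV types.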
