
Some MuTect files are improperly joined #77

@lbeltrame

When testing the pipeline after fixing the cluster problem, I noticed that GATK errored out much later on with this message:

The provided VCF file is malformed at approximately line number 4: The VCF specification does not allow for whitespace in the INFO field

Looking at the VCF file, I noticed the following:

##fileformat=VCFv4.1
## No variants; no reads aligned in region
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr10   89685461        .       G       T       .       REJECT  AC=0;AF=0.00;AN=1;DP=456        GT:AD:DP:FA     0:454,1:456:2.198e-03

As you can see, it's an empty VCF that has nevertheless been "filled" somehow. The reason is that MuTect manages to write data before erroring out (with that famous Java 7 error we discussed), and so it leaves a malformed VCF file behind.
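A minimal sketch of how a file like the one above could be flagged before it reaches GATK. It assumes the columns are tab-separated, as the VCF spec requires, and the function name `vcf_column_mismatches` is hypothetical, not something that exists in the pipeline:

```python
def vcf_column_mismatches(lines):
    """Return (line_number, n_columns, n_header_columns) for every record
    whose column count differs from that of the #CHROM header line."""
    header_cols = None
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            header_cols = len(fields)
        elif header_cols is not None and len(fields) != header_cols:
            problems.append((lineno, len(fields), header_cols))
    return problems


# The file from this report, assuming tabs between the columns.
example = [
    "##fileformat=VCFv4.1\n",
    "## No variants; no reads aligned in region\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr10\t89685461\t.\tG\tT\t.\tREJECT\tAC=0;AF=0.00;AN=1;DP=456"
    "\tGT:AD:DP:FA\t0:454,1:456:2.198e-03\n",
]
print(vcf_column_mismatches(example))  # [(4, 10, 8)]
```

The record on line 4 has 10 columns while the header declares only 8, so a check like this would catch the mismatch. It only compares column counts and is not a full VCF validator.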

I'm not sure there's an easy solution for this, given that GATK now requires Java 7 and MuTect does not yet work properly with it.

On the plus side, this gives me more motivation to implement support for other paired callers. ;)
