Extract a list of mean values from format tags (e.g., DP, GC, IGC, etc.) per position #2271

@andreasgoteson

Hi,

I have a question (or a feature request): bcftools allows the use of functions on format tags to filter data, but is it possible to just return the result of functions for each position?

For example, I have a multi-sample VCF whose FORMAT fields contain per-sample information such as the Illumina GenCall Confidence Score (IGC)*. I can successfully filter the VCF on the per-position mean IGC, e.g., bcftools view -i 'AVG(FMT/IGC)>0.7' my.vcf. This command correctly returns a filtered VCF that retains only positions where the mean IGC is above 0.7.

What I want instead is a list of the mean and standard deviation of IGC at every position, without filtering. Is that possible using bcftools?

The output is supposed to look like:
CHROM POS ID mean_IGC sd_IGC
1 1 rs123 0.8 0.05
1 2 rs234 0.5 0.2
...etc

Thanks a lot!

*The IGC is just a score ranging between 0-1 for the genotyping call confidence that is unique for each sample and position.
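
For reference, one possible workaround, sketched under assumptions rather than offered as a built-in bcftools feature: dump the per-sample values with something like bcftools query -f '%CHROM\t%POS\t%ID[\t%IGC]\n' my.vcf and compute the statistics downstream. The Python below assumes that tab-separated layout (three leading columns, then one IGC value per sample), treats "." as a missing value, and uses the sample standard deviation; the function name summarize is hypothetical.

```python
import statistics


def summarize(lines):
    """Yield (chrom, pos, id, mean, sd) for each tab-separated input line.

    Assumed input layout (one line per position):
        CHROM <tab> POS <tab> ID <tab> value_sample1 <tab> value_sample2 ...
    Missing per-sample values are expected to be written as ".".
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, pos, var_id = fields[:3]
        # Keep only non-missing values and convert them to floats.
        values = [float(v) for v in fields[3:] if v not in (".", "")]
        # The mean needs at least one value; the sample SD needs at least two.
        mean = statistics.fmean(values) if values else float("nan")
        sd = statistics.stdev(values) if len(values) > 1 else float("nan")
        yield chrom, pos, var_id, mean, sd


def format_rows(lines):
    """Yield output lines with a header matching the table requested above."""
    yield "CHROM\tPOS\tID\tmean_IGC\tsd_IGC\n"
    for chrom, pos, var_id, mean, sd in summarize(lines):
        yield f"{chrom}\t{pos}\t{var_id}\t{mean:.4g}\t{sd:.4g}\n"
```

To use this as a filter, pipe the bcftools query output into a script that calls sys.stdout.writelines(format_rows(sys.stdin)). If the population standard deviation is wanted instead, statistics.pstdev can be swapped in for statistics.stdev.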
