mergeCavemanResults #57

Closed
ramaniak opened this issue Dec 5, 2016 · 8 comments

Comments

@ramaniak

ramaniak commented Dec 5, 2016

Hello,
I did not find any documentation for this script on the main documentation page, but I came across it when searching the repository. Is this script meant to be run after the estep to combine all the variant calls?

I did try running it as follows:

mergeCavemanResults --output cavemen_muts_all_merge.vcf -s splitList -f results/%/%.muts/vcf
and got the following error:

Expected 1726 files but got 0. at ./mergeCavemanResults line 54.
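
A quick way to check whether any files exist at paths of the shape passed to -f is to count matches on disk. This is only a diagnostic sketch: it assumes each % can be treated as a wildcard, which is a simplification (how mergeCavemanResults actually expands % is not described in this thread), and the file name 0_1000.muts.vcf is a hypothetical placeholder. It does not establish the cause of the error above.

```python
import glob
import os
import tempfile


def count_matches(pattern, root="."):
    """Count paths under root that match pattern, treating each % as '*'.

    Assumption: % is handled as a plain wildcard here, purely for diagnosis.
    """
    return len(glob.glob(os.path.join(root, pattern.replace("%", "*"))))


# Build a tiny throwaway layout with one hypothetical split result file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "results", "1"))
    open(os.path.join(root, "results", "1", "0_1000.muts.vcf"), "w").close()

    as_typed = count_matches("results/%/%.muts/vcf", root)  # pattern from the command above
    with_dot = count_matches("results/%/%.muts.vcf", root)  # same pattern with '.vcf' suffix

print(as_typed, with_dot)
```

Running the same count against a real results directory shows whether a given pattern can match anything at all before the merge is attempted.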

Could you please let me know if this is meant to be run this way? If not, is there a tool you would recommend for combining the final variant calls?

thanks
Arun

@ramaniak
Author

ramaniak commented Dec 5, 2016

Hello,
I am also seeing some strange calls, such as the one below, where the same base is reported as both REF and ALT. This happens at many locations.

10	126678241	.	G	G	.	.	DP=1132;MP=1.0e+00;GP=2.3e-30;TG=GG/GGGGG;TP=7.1e-04;SG=GG/AGGGG;SP=1.0e+00	GT:FAZ:FCZ:FGZ:FTZ:RAZ:RCZ:RGZ:RTZ:PM	0|0:0:2:162:0:0:4:129:0:9.8e-01	0|0:0:5:443:0:0:13:374:0:9.8e-01
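
Records like this can be listed mechanically by comparing the REF and ALT columns of each data line. The following is a minimal sketch, assuming tab-separated VCF text; the function name is hypothetical, the second record in the example is invented for contrast, and the INFO column is shortened.

```python
import io


def same_ref_alt_records(lines):
    """Yield (chrom, pos, ref) for VCF data lines whose ALT equals REF."""
    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, _record_id, ref, alt = fields[:5]
        if ref == alt:
            yield chrom, int(pos), ref


# First record mirrors the call above (INFO shortened); the second is made up.
example = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "10\t126678241\t.\tG\tG\t.\t.\tDP=1132\n"
    "10\t126678300\t.\tG\tA\t.\t.\tDP=900\n"
)
hits = list(same_ref_alt_records(example))
print(hits)
```

Passing an open file handle instead of the in-memory example gives the full list of affected positions for a sample.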

@keiranmraine
Contributor

Hi Arun,

We'd recommend using the wrapper code, especially for your first run. It makes your life easier, and it makes it simpler to tell whether a problem comes from the data or from the way the tools were executed:

https://github.com/cancerit/cgpCaVEManWrapper

It can be run in a farm-compatible manner.

Regards,
Keiran

@ramaniak
Author

ramaniak commented Dec 6, 2016 via email

@ramaniak
Author

ramaniak commented Dec 6, 2016

Hello Keiran,
I am not sure if you would like me to continue this thread or start a new issue at cgpCaVEManWrapper. I can move it if that is preferable.

There are two flags that are stated as "required" for which I am not sure what the preferred input is:

  1. -germline-indel
  2. -unmatched-vcf

I am working with exomes. Is there a particular caller you recommend for getting germline indels? Also, I am not sure what is expected for the unmatched VCF, or what is meant by "Directory containing unmatched normal VCF files or http/ftp base". Is it possible to run without the indel calls or the unmatched VCFs? I was not able to get much information from the documentation.

thanks
Arun

@ghost

ghost commented Dec 7, 2016

@ramaniak

  1. -germline-indel: a file produced by cgpPindel, used in flagging the CaVEMan calls.
  2. -unmatched-vcf: a VCF panel of unmatched normals (again used in flagging). A description of how to produce this file can be found in the upcoming Current Protocols in Bioinformatics paper (unfortunately not yet published).

For now you can avoid these by using the -no-flagging parameter and providing the following:

  1. -germline-indel - A bed file containing the single false bed entry 1 0 1
  2. -unmatched-vcf - Any directory
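
The two placeholder inputs described above can be created with a few lines of Python. This is a sketch under assumptions: the file and directory names are hypothetical, the BED entry is written tab-separated, and whether the wrapper expects the BED file to be compressed or indexed is not stated in this thread.

```python
import os
import tempfile


def write_placeholder_inputs(workdir):
    """Create a one-line BED file and an empty directory inside workdir.

    Both names are hypothetical; only the BED content (1 0 1) comes from
    the comment above.
    """
    bed_path = os.path.join(workdir, "placeholder_germline_indel.bed")
    with open(bed_path, "w") as handle:
        handle.write("1\t0\t1\n")  # chromosome, start, end

    vcf_dir = os.path.join(workdir, "placeholder_unmatched_vcf")
    os.makedirs(vcf_dir, exist_ok=True)
    return bed_path, vcf_dir


workdir = tempfile.mkdtemp()
bed_path, vcf_dir = write_placeholder_inputs(workdir)
print(bed_path)
print(vcf_dir)
```

The returned paths would then be supplied to -germline-indel and -unmatched-vcf respectively, together with -no-flagging.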

This has highlighted an issue with caveman.pl: these files are still required even when no flagging is requested. I'll open an issue in cgpCaVEManWrapper to ensure this gets fixed.

Dave

@ramaniak
Author

ramaniak commented Dec 7, 2016 via email

@ramaniak
Author

Hello,
The wrapper script runs through the whole pipeline and generates the merged muts and snps file as expected.

Unfortunately, I am seeing a few strange calls in the final vcf file.

10      115370307       e848aa14-bd83-11e6-9994-dbef58c5c9a6    C       C       .       .       DP=863;MP=1.0e+00;GP=2.9e-17;TG=CG/CC;TP=2.9e-17;SG=CC/CT;SP=1.0e+00    GT:FAZ:FCZ:FGZ:FTZ:RAZ:RCZ:RGZ:RTZ:PM   0|1:0:60:0:0:0:103:0:0:1.0e+00  0|0:2:221:14:0:1:430:32:0:9.3e-01

As you can see in the example above, the ref and alt alleles are both 'C'. This appears in a few of my samples, and I was wondering if you knew why this might be occurring?
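
To compare the per-sample values in a record like this, the FORMAT keys can be paired with each sample column. A minimal sketch, assuming tab-separated columns in standard VCF order; the function name is hypothetical, and the meaning of each key is defined in the VCF header, which is not reproduced in this thread.

```python
def sample_fields(record):
    """Return one dict per sample column, keyed by the FORMAT column."""
    fields = record.rstrip("\n").split("\t")
    keys = fields[8].split(":")  # column 9 is FORMAT
    return [dict(zip(keys, sample.split(":"))) for sample in fields[9:]]


# The record quoted above, rebuilt with tab separators.
record = "\t".join([
    "10", "115370307", "e848aa14-bd83-11e6-9994-dbef58c5c9a6", "C", "C", ".", ".",
    "DP=863;MP=1.0e+00;GP=2.9e-17;TG=CG/CC;TP=2.9e-17;SG=CC/CT;SP=1.0e+00",
    "GT:FAZ:FCZ:FGZ:FTZ:RAZ:RCZ:RGZ:RTZ:PM",
    "0|1:0:60:0:0:0:103:0:0:1.0e+00",
    "0|0:2:221:14:0:1:430:32:0:9.3e-01",
])

samples = sample_fields(record)
for index, sample in enumerate(samples, start=1):
    print(index, sample)
```

Samples are listed in column order only; which column is the normal and which is the tumour should be read from the header line of the actual file.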

thanks
Arun

@keiranmraine
Contributor

Hi, just to follow up on your comment here. See point 1 here for an explanation.

There is an issue open for the correction of this: #63
