- Bug fix: Workflows that have no associated smufin inputs no longer fail.
- Bug fix: Workflows that use a mix of download methods, including filesystemCopy and gtdownload, no longer fail.
- The filename of the output minibam has changed. The name will now look like this:

      ${tumour or normal aliquot id}.${"tumour" or "normal"}.variantbam.${date}.bam

  For example:

      7d7205e8-d864-11e3-be46-bd5eb93a18bb.tumour.variantbam.20160721.bam

  or:

      9d7d05b8-d564-21a5-bf47-ad5ec83214zc.normal.variantbam.20160721.bam

  If you are still using the old naming convention, the upload should contain a symbolic link with the old name that points to the minibam file with the new name.
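As a minimal sketch of the naming pattern above (not the workflow's own code), the filename could be assembled as follows. The function names `minibam_name` and `link_old_name` are hypothetical, and the old filename is passed in by the caller because these notes do not spell out the old convention.

```python
import os
from datetime import date


def minibam_name(aliquot_id: str, sample_type: str, run_date: date) -> str:
    """Build a minibam filename following the new naming pattern."""
    if sample_type not in ("tumour", "normal"):
        raise ValueError(f"unexpected sample type: {sample_type!r}")
    return f"{aliquot_id}.{sample_type}.variantbam.{run_date:%Y%m%d}.bam"


def link_old_name(directory: str, old_name: str, new_name: str) -> None:
    """Create a symlink named old_name that points at new_name.

    Hypothetical helper: the old filename must be supplied by the caller.
    """
    os.symlink(new_name, os.path.join(directory, old_name))


print(minibam_name("7d7205e8-d864-11e3-be46-bd5eb93a18bb", "tumour", date(2016, 7, 21)))
```

The symlink target is given relative to the directory, so the link stays valid if the whole upload directory is moved.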
- Files can now be "downloaded" (copied, really) from a locally accessible file system. Also, BAMs and VCFs can be downloaded in different ways. As an example:

      vcfDownloadMethod=icgcStorageClient
      # To copy a file from the file system set the download method to "filesystemCopy".
      bamDownloadMethod=filesystemCopy
      # Also, you will need to include this property which is the root path to the files.
      fileSystemSourcePath = /files_for_workflow/
      # When the workflow runs, VCFs will be downloaded using the ICGC Storage Client tool, and BAMs will be copied from /files_for_workflow.
      # The normal data file will be copied from /files_for_workflow/my_normal_bam_file.bam into the /datastore/bam/normal directory.
      normal_data_file_name = my_normal_bam_file.bam

It is also possible to specify a pipeline-specific download method to override the default one. For example:
```
vcfDownloadMethod=icgcStorageClient
smufinDownloadMethod=filesystemCopy
```
This will result in all VCFs being downloaded using the ICGC Storage Client, *except* for the smufin VCFs which will be copied from a path on the filesystem (rooted at the value specified by `fileSystemSourcePath`).
Note that if `bamDownloadMethod` is not specified, BAMs will be downloaded using the method specified by `vcfDownloadMethod`.
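The fallback rules described above can be sketched roughly as follows. This is an illustration only, not the workflow's implementation: `resolve_download_method` is a hypothetical name, the INI file is assumed to have been parsed into a plain dict, and the `<pipeline>DownloadMethod` key pattern is inferred from the `smufinDownloadMethod` example.

```python
from typing import Dict, Optional


def resolve_download_method(ini: Dict[str, str], kind: str,
                            pipeline: Optional[str] = None) -> str:
    """Pick the download method for a BAM or for a given pipeline's VCFs.

    kind is "bam" or "vcf"; pipeline (e.g. "smufin") only applies to VCFs.
    """
    if kind == "vcf" and pipeline is not None:
        # Assumed key pattern, based on "smufinDownloadMethod".
        override = ini.get(f"{pipeline}DownloadMethod")
        if override:
            return override
    if kind == "bam" and ini.get("bamDownloadMethod"):
        return ini["bamDownloadMethod"]
    # BAMs without their own setting, and VCFs without a pipeline
    # override, both fall back to vcfDownloadMethod.
    return ini["vcfDownloadMethod"]


ini = {
    "vcfDownloadMethod": "icgcStorageClient",
    "smufinDownloadMethod": "filesystemCopy",
}
print(resolve_download_method(ini, "vcf", "smufin"))
print(resolve_download_method(ini, "bam"))
```

With this INI, smufin VCFs resolve to `filesystemCopy`, while BAMs and all other VCFs resolve to `icgcStorageClient`.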
- Smufin INDEL VCFs can now be included in the workflow. This was added mainly because the minibams needed to be re-generated to include smufin results. OxoG can be run with smufin inputs, but this has not been extensively tested. To include smufin files, add them like any other pipeline VCF:

      smufin_indel_index_file_name_0 = 123-456-789.smufin.20160114.somatic.indel.vcf.gz.tbi
      smufin_indel_index_object_id_0 = 123-456-789.smufin.20160114.somatic.indel.vcf.gz.tbi
      smufin_indel_data_file_name_0 = 123-456-789.smufin.20160114.somatic.indel.vcf.gz
      smufin_indel_data_object_id_0 = 123-456-789.smufin.20160114.somatic.indel.vcf.gz

  Note that the filename and object ID must match: smufin files are not in GNOS, so they have no actual GNOS object ID. For smufin, the object ID is the filename.
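Since the object ID and the filename are identical for smufin, the four INI entries can be derived from the VCF filename alone. The sketch below is illustrative: `smufin_indel_entries` is a hypothetical helper, and it assumes the index file is always the VCF name plus `.tbi` and that the trailing `_0` is a per-tumour counter, both inferred from the example above.

```python
from typing import Dict


def smufin_indel_entries(vcf_name: str, index: int = 0) -> Dict[str, str]:
    """Return the smufin INDEL INI entries for one VCF file.

    For smufin, the object ID is simply the filename.
    """
    tbi_name = vcf_name + ".tbi"  # assumed index naming
    return {
        f"smufin_indel_index_file_name_{index}": tbi_name,
        f"smufin_indel_index_object_id_{index}": tbi_name,
        f"smufin_indel_data_file_name_{index}": vcf_name,
        f"smufin_indel_data_object_id_{index}": vcf_name,
    }


for key, value in smufin_indel_entries(
        "123-456-789.smufin.20160114.somatic.indel.vcf.gz").items():
    print(f"{key} = {value}")
```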
- Bug fix: the check_minibam script was checking germline files which caused it to fail. This is only a problem when using gtdownload to download files because the other download methods download only the necessary files, but gtdownload downloads the complete fileset including germline files.
- Bug fix: move the JSON file to the failed-jobs directory if the check_minibam script fails.
- Fixed a bug that would have caused the workflow to crash when SNVs were extracted from an INDEL VCF.
- Fixes:
  - In situations with multiple tumours, the merged VCFs did not contain data from all tumours; they were only the merge across pipelines for each tumour. This has been corrected: there will now be a single set of merged VCFs which are a merge-by-type across all pipelines and across all tumours.
  - Naming of call_stats and gnos_files in the output followed an incorrect naming convention. This has been corrected.
- Changes to the OxoG docker image from Dimitri. This is supposed to fix the OxoG MAF issues. New image name is "oxog:160428".
- Bugfixes:
  - for gtdownload:
    - properly index the URLs
    - use the correct key for downloading BAMs
  - Remove possible "/datafiles/VCF/..." prefix from files_for_upload
- for gtdownload:
- VCFs can now be missing and the workflow will continue. To use this feature, add `allowMissingFiles=true` to your INI file.
- OxoG can now handle donors with multiple tumours!
NOTE: INI files generated by version 1.1.5 or earlier will NOT be compatible with this version of the workflow. Regenerate your INI files using the updated INI generator in this version of the workflow. INIs generated with this version are NOT compatible with earlier versions of the workflow.
- Now use OxoG container "oxog:160329", which contains a new version of variantbam. This new version fixes a failure that occurred when variantbam encountered the value "-1" in the NM tag while generating stats.
- Hotfix: Fixed a bug where the Sanger SNV index file was being referenced instead of the SV file. This was causing problems for Workers that used the S3 download method as two index files would get downloaded with the same name, and then the job that stats all files to ensure that they all exist would fail.
- stat all files that have been downloaded: sometimes gtdownload will not download a file but the exit code is still 0, so we must `stat` the files to make sure they were actually downloaded properly.
- gtdownload process now takes two keys: one for BAMs and one for VCFs. This is for situations where BAMs and VCFs are not hosted in the same place and require different keys for download.
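The post-download check described above might look something like the following sketch. It is not the workflow's actual job: `failed_downloads` is a hypothetical name, and treating a zero-byte file as a failed download is an added assumption on top of the plain existence check.

```python
import os
import stat
from typing import Iterable, List


def failed_downloads(paths: Iterable[str]) -> List[str]:
    """Return the paths that are missing, not regular files, or empty.

    A downloader exiting with code 0 is not trusted; each expected
    file is checked with os.stat instead.
    """
    failed = []
    for path in paths:
        try:
            info = os.stat(path)
        except OSError:
            failed.append(path)  # file does not exist or is unreadable
            continue
        # Assumption: an empty file also counts as a failed download.
        if not stat.S_ISREG(info.st_mode) or info.st_size == 0:
            failed.append(path)
    return failed
```

A caller would pass the list of files it expects in the download directory and fail the job if the returned list is not empty.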
- fixes for git_mv script: updated file should now be committed instead of overwritten.
- Fixes to git_mv script
- Use version 1.0.13 of icgc-storage-client
- Fix for issue with AWS credentials being clobbered by the launcher
- Fix git_mv to set timestamps inside loop
- Input files can now be downloaded using gtdownload or AWS CLI. To do this, add a new property to your INI file: `downloadMethod`. It can take one of three values:
  - gtdownload
  - icgcStorageClient
  - s3 (You will need to put your AWS credentials in ~/.gnos on the launcher for this to work).
- The timestamps of workflow state transitions will be injected in the JSON files before they are moved in github. They will look something like this: "transition_to_downloading-jobs_timestamp":"2016-03-18T15:00:00"
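A rough sketch of how such a timestamp could be injected is shown below. The key format follows the example above; the function name `add_transition_timestamp` and the idea that the JSON file is handled as a dict are assumptions made for illustration, not a description of the actual script.

```python
import json
from datetime import datetime


def add_transition_timestamp(job: dict, new_state: str, when: datetime) -> dict:
    """Return a copy of the job JSON with a timestamp for the new state."""
    updated = dict(job)
    key = f"transition_to_{new_state}_timestamp"
    updated[key] = when.strftime("%Y-%m-%dT%H:%M:%S")
    return updated


job = add_transition_timestamp({}, "downloading-jobs", datetime(2016, 3, 18, 15, 0, 0))
print(json.dumps(job))
```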
- Workflow now uses icgc-storage-client version 1.0.12
- Fixed a bug in git_move.py
- Changed pre-processing to replace a leading M with MT in the CHROM field. The M value was causing `bcftools norm` to fail. At least one Broad INDEL file had M (which is not valid) instead of MT, so it is now fixed at workflow run-time.
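The CHROM rewrite can be illustrated with the sketch below. It is only one possible reading of the change: `fix_chrom` is a hypothetical name, and the sketch assumes that only records whose CHROM is exactly `M` are rewritten, so that an existing `MT` is left alone. Header lines are passed through untouched.

```python
def fix_chrom(line: str) -> str:
    """Rewrite CHROM "M" to "MT" on a single VCF line."""
    if line.startswith("#"):
        return line  # header and meta lines are not modified
    chrom, sep, rest = line.partition("\t")
    if chrom == "M":  # assumption: exact match only, so "MT" is left as-is
        chrom = "MT"
    return chrom + sep + rest


records = ["#CHROM\tPOS\tID\tREF\tALT\n", "M\t100\t.\tA\tAT\n", "1\t200\t.\tG\tGC\n"]
for record in records:
    print(fix_chrom(record), end="")
```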
- Fixed the git_move.py script to move files in git better.
Main changes:
- Add `git reset --hard origin/master` to git move scripts.
- Add `-t` and `-u` to rsync command, trying to resolve the manifest-newer-than-igto issue that Jonthan reported.
Other changes:
- update ini_generator.sh script with more current default values.
- Code cleanup in INI Generator.
- Update Dockerfile (even though it's not used right now).
- Update SeqWare artifact dependency to seqware-bin-linux-x86-64-jre-8.0.45
Initial release.