ci updates #70

Merged on Jan 27, 2023 (29 commits)
Commits
b484279
ci(miniwdl-check): update python
a-frantz Dec 29, 2022
1dd1caa
fix: make shellcheck/miniwdl check happy
a-frantz Dec 29, 2022
b047532
fix(tools/cellranger): rename cpu param to ncpu
a-frantz Dec 29, 2022
c7ed2bf
fix(toos/cellranger): use new param name
a-frantz Dec 29, 2022
a028cf3
fix(tools/cellranger): use new param in runtime
a-frantz Dec 29, 2022
5298ec0
fix: import `10x-bam-to-fastqs` from master URL
a-frantz Dec 29, 2022
345838b
feat: add CI for ensuring current docker image is being pulled (#56)
a-frantz Dec 30, 2022
8fe2c54
ci: lowercase directories in valid image check
a-frantz Dec 30, 2022
755fa1c
feat(docker/cellranger): generalize Dockerfile using build-args
a-frantz Dec 30, 2022
17ad22e
Revert "feat(docker/cellranger): generalize Dockerfile using build-args"
a-frantz Dec 30, 2022
5bb6207
feat(docker/cellranger): generalize Dockerfile using build-args
a-frantz Dec 30, 2022
14764b0
feat(workflows): pull from tagged versions of external repos
a-frantz Dec 30, 2022
573e646
ci(lint-check): change import check to URL and from this repo master
a-frantz Dec 31, 2022
adddbe8
ci(lint-check): fix early exit
a-frantz Dec 31, 2022
68b3264
ci(lint-check): add check that external repos aren't pulling from master
a-frantz Jan 1, 2023
5360f9a
fix: changes to make miniwdl+shellcheck happy
a-frantz Jan 3, 2023
cff3c0f
chore: point to this branch for testing
a-frantz Jan 6, 2023
43ae103
fix: somehow 'ghcr.io/stjudecloud/fqlib:1.1.0' broke
a-frantz Jan 13, 2023
a156f11
fix: apply suggestions from @adthrasher
a-frantz Jan 13, 2023
b579d71
fix: point back to `seaseq/3.0` where possible
a-frantz Jan 13, 2023
bcd3df1
feat: bump to newest `fq`
a-frantz Jan 15, 2023
d935b72
chore: point `fq` to use docker image built in this repo (and branch)
a-frantz Jan 15, 2023
1529b01
fix(tools/bwa): bad "read_group" WDL
a-frantz Jan 15, 2023
49beec6
chore: point back to master
a-frantz Jan 24, 2023
67ec206
fix: don't quote $head_arg, as we want word splitting
a-frantz Jan 24, 2023
cbbfbdd
chore: switch all uses of util:1.0.0 to 1.2.0 and rm util/1.0.0
a-frantz Jan 24, 2023
1d7ad96
chore: mv ARG to top of file and add ENTRYPOINT
a-frantz Jan 24, 2023
f32cedc
chore: rm old cellranger image and point to new one
a-frantz Jan 24, 2023
575a6a8
Merge branch 'master' into ci
a-frantz Jan 27, 2023
48 changes: 44 additions & 4 deletions .github/workflows/lint-check.yml
@@ -18,10 +18,29 @@ jobs:
  EXITCODE=0
  for file in $(find . -name '*.wdl'); do
  >&2 echo "Checking file $file..."
- import_lines=$(awk '/import/' "$file")
- bad_lines=$(echo "$import_lines" | awk '!/https:\/\/raw.githubusercontent.com\/stjudecloud\/workflows\/master/ && !/https:\/\/raw.githubusercontent.com\/stjude\/xenocp\/master/ && !/https:\/\/raw.githubusercontent.com\/stjude\/seaseq\/master/' | grep -v '# lint-check: ignore') || true
+ import_lines=$(awk '/import/' "$file" | grep -v '# lint-check: ignore') || true
+
+ bad_lines=$(echo "$import_lines" | awk '!/https:/')
  if [ -n "$bad_lines" ]; then
- >&2 echo "Must import files from the master branch on Github."
+ >&2 echo "Must import files from a URL!"
+ >&2 echo "The following lines are bad:"
+ >&2 echo "$bad_lines"
+ >&2 echo ""
+ EXITCODE=1
+ fi
+
+ bad_lines=$(echo "$import_lines" | awk '/https:\/\/raw.githubusercontent.com\/stjudecloud\/workflows/ && !/master/') || true
+ if [ -n "$bad_lines" ]; then
+ >&2 echo "Imports from this repo must use the master branch!"
+ >&2 echo "The following lines are bad:"
+ >&2 echo "$bad_lines"
+ >&2 echo ""
+ EXITCODE=1
+ fi
+
+ bad_lines=$(echo "$import_lines" | awk '!/https:\/\/raw.githubusercontent.com\/stjudecloud\/workflows/ && /master/') || true
+ if [ -n "$bad_lines" ]; then
+ >&2 echo "Imports from external repos must use a tagged release!"
  >&2 echo "The following lines are bad:"
  >&2 echo "$bad_lines"
  >&2 echo ""
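The three import checks in this hunk can be read as a single classification per import line. The sketch below is only an illustration: `classify_import` is a hypothetical helper that does not exist in the workflow, and it mirrors the awk patterns by matching `master` anywhere on the line.

```shell
#!/bin/sh
# Hypothetical helper, not part of lint-check.yml: classify one WDL import
# line using the same three rules as the awk checks in the workflow.
classify_import() {
    line=$1
    own_repo='https://raw.githubusercontent.com/stjudecloud/workflows'

    # Rule 1: every import must come from a URL.
    case $line in
        *https:*) ;;
        *) echo "not-a-url"; return ;;
    esac

    case $line in
        *"$own_repo"*)
            # Rule 2: imports from this repo must use the master branch.
            case $line in
                *master*) echo "ok" ;;
                *) echo "own-repo-not-master" ;;
            esac
            ;;
        *)
            # Rule 3: imports from external repos must not use master.
            case $line in
                *master*) echo "external-on-master" ;;
                *) echo "ok" ;;
            esac
            ;;
    esac
}
```

Because the match is on the substring `master`, a tag or path that happens to contain that word would be classified the same way as a branch reference, both here and in the awk version.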
@@ -33,9 +52,13 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v2
- - name: Ensure SemVer'd docker images are being pulled
+ - name: Ensure current SemVer'd docker images are being pulled
  run: |
  EXITCODE=0
+ files=$(find docker/ -name Dockerfile)
+ for file in $files; do
+ echo "$file" | awk -F '/' '{print tolower($2) ":" $3}' >> valid_dockerimages.txt
+ done
  files=$(find . -name '*.wdl')
  for file in $files; do
  while IFS= read -r line; do
@@ -46,6 +69,23 @@ jobs:
  >&2 echo "In file: $file"
  EXITCODE=1
  fi
+
+ tool=$(echo "$line" | awk -F ':' '{print substr($2, 23, length($2)) ":" substr($3, 1, length($3)-1)}')
+ case `grep -Fx "$tool" valid_dockerimages.txt >/dev/null; echo $?` in
+ 0)
+ # valid docker image
+ ;;
+ 1)
+ >&2 echo "Must use a current Docker image"
+ >&2 echo "Offending line: $line"
+ >&2 echo "In file: $file"
+ EXITCODE=1
+ ;;
+ *)
+ >&2 echo "Something went wrong while checking for current Docker images!"
+ EXITCODE=2
+ ;;
+ esac
  done < <(awk '/docker: .*stjudecloud/' < "$file")
  done
  exit $EXITCODE
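The current-image check builds a `name:tag` key twice: once from each `docker/<name>/<tag>/Dockerfile` path, and once from each `docker:` runtime line using a fixed offset (`substr($2, 23, ...)`). The offset only works when field 2 is exactly a space, a quote, and `ghcr.io/stjudecloud/`. A minimal sketch of the same key derivation that matches on the registry prefix instead; both function names are hypothetical and not part of the workflow.

```shell
#!/bin/sh
# Hypothetical helpers, not part of lint-check.yml.

# "docker/util/1.2.0/Dockerfile" -> "util:1.2.0" (same awk as the workflow)
image_from_path() {
    echo "$1" | awk -F '/' '{print tolower($2) ":" $3}'
}

# "docker: 'ghcr.io/stjudecloud/util:1.2.0'" -> "util:1.2.0"
# Matches on the registry prefix rather than a fixed character offset,
# and accepts either single or double quotes around the image name.
image_from_runtime_line() {
    echo "$1" | sed -E "s#.*ghcr\.io/stjudecloud/([^'\"]+)['\"].*#\1#"
}
```

Either key can then be compared against `valid_dockerimages.txt` with `grep -Fx`, as the workflow does.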
2 changes: 1 addition & 1 deletion .github/workflows/miniwdl-check.yml
@@ -10,7 +10,7 @@ jobs:
  - name: Set up Python
  uses: actions/setup-python@v1
  with:
- python-version: '3.6'
+ python-version: '3.10'
  - name: Install miniwdl
  run: |
  python -m pip install --upgrade pip
16 changes: 0 additions & 16 deletions docker/cellranger/1.0.0/Dockerfile

This file was deleted.

24 changes: 24 additions & 0 deletions docker/cellranger/1.1.0/Dockerfile
@@ -0,0 +1,24 @@
+ # Supply a valid download link and md5sum from "https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest"
+
+ FROM ubuntu:20.04
+
+ ARG cellranger_url
+ ARG cellranger_md5
+
+ RUN apt-get update && \
+ apt-get upgrade -y && \
+ apt-get install curl -y && \
+ rm -r /var/lib/apt/lists/*
+
+ WORKDIR /opt
+
+ RUN curl -o cellranger.tar.gz \
+ ${cellranger_url} \
+ && echo "${cellranger_md5} cellranger.tar.gz" > cellranger.tar.gz.md5 \
+ && md5sum -c cellranger.tar.gz.md5 \
+ && tar -xzvf cellranger.tar.gz \
+ && mv cellranger-* cellranger
+
+ ENV PATH "/opt/cellranger:$PATH"
+
+ ENTRYPOINT [ "cellranger" ]
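The new Dockerfile downloads the tarball named by the `cellranger_url` build arg and refuses to continue unless it matches `cellranger_md5`. The verification step can be sketched outside Docker as below; `verify_md5` is a hypothetical helper written for illustration, assuming GNU `md5sum` is available.

```shell
#!/bin/sh
# Hypothetical helper: verify a downloaded file against an expected md5,
# mirroring the `md5sum -c` step in the Dockerfile.
# Returns 0 when the checksum matches, non-zero otherwise.
verify_md5() {
    file=$1
    expected=$2
    # md5sum -c reads "<hash>  <filename>" lines; "-" means read them from stdin.
    echo "${expected}  ${file}" | md5sum -c - >/dev/null 2>&1
}
```

At build time the two values would be passed with `docker build --build-arg cellranger_url=<url> --build-arg cellranger_md5=<md5>`; the actual URL and checksum come from the 10x download page and are not recorded in this diff.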
@@ -2,10 +2,10 @@ FROM rust:1.59.0 as fqlib-builder

  RUN cargo install \
  --git https://github.com/stjude-rust-labs/fq.git \
- --tag v0.8.0 \
+ --tag v0.9.1 \
  --root /opt/fqlib/

- FROM ubuntu:18.04 as builder
+ FROM ubuntu:20.04 as builder

  COPY --from=fqlib-builder /opt/fqlib/bin/fq /usr/local/bin/

6 changes: 0 additions & 6 deletions docker/util/1.0.0/Dockerfile

This file was deleted.

26 changes: 13 additions & 13 deletions tools/bwa.wdl
@@ -37,11 +37,11 @@ task bwa_aln {
  tar -C bwa -xzf ~{bwadb_tar_gz}
  PREFIX=$(basename bwa/*.ann ".ann")

- bwa aln -t ${n_cores} bwa/$PREFIX ~{fastq} > sai
+ bwa aln -t "${n_cores}" bwa/"$PREFIX" ~{fastq} > sai

  bwa samse \
- ~{"-r '" + read_group}~{true="'" false="" defined(read_group)} \
- bwa/$PREFIX sai ~{fastq} | samtools view -@ ${n_cores} -hb - > ~{output_bam}
+ ~{if read_group != "" then "-r '" else ""}~{read_group}~{if read_group != "" then "'" else ""} \
+ bwa/"$PREFIX" sai ~{fastq} | samtools view -@ "${n_cores}" -hb - > ~{output_bam}
  >>>

  runtime {
@@ -101,12 +101,12 @@ task bwa_aln_pe {
  tar -C bwa -xzf ~{bwadb_tar_gz}
  PREFIX=$(basename bwa/*.ann ".ann")

- bwa aln -t ${n_cores} bwa/$PREFIX ~{fastq1} > sai_1
- bwa aln -t ${n_cores} bwa/$PREFIX ~{fastq2} > sai_2
+ bwa aln -t "${n_cores}" bwa/"$PREFIX" ~{fastq1} > sai_1
+ bwa aln -t "${n_cores}" bwa/"$PREFIX" ~{fastq2} > sai_2

  bwa sampe \
- ~{"-r '" + read_group}~{true="'" false="" defined(read_group)} \
- bwa/$PREFIX sai_1 sai_2 ~{fastq1} ~{fastq2} | samtools view -@ ${n_cores} -hb - > ~{output_bam}
+ ~{if read_group != "" then "-r '" else ""}~{read_group}~{if read_group != "" then "'" else ""} \
+ bwa/"$PREFIX" sai_1 sai_2 ~{fastq1} ~{fastq2} | samtools view -@ "${n_cores}" -hb - > ~{output_bam}
  >>>

  runtime {
@@ -167,9 +167,9 @@ task bwa_mem {
  PREFIX=$(basename bwa/*.ann ".ann")

  bwa mem \
- -t $n_cores \
- ~{"-R '" + read_group}~{true="'" false="" defined(read_group)} \
- bwa/$PREFIX ~{fastq} | samtools view -b - > ~{output_bam}
+ -t "$n_cores" \
+ ~{if read_group != "" then "-r '" else ""}~{read_group}~{if read_group != "" then "'" else ""} \
+ bwa/"$PREFIX" ~{fastq} | samtools view -b - > ~{output_bam}
  >>>

  runtime {
@@ -213,11 +213,11 @@ task build_bwa_db {

  orig_fasta=~{reference_fasta}
  ref_fasta=$(basename "${orig_fasta%.gz}")
- gunzip -c ~{reference_fasta} > $ref_fasta || cp ~{reference_fasta} $ref_fasta
+ gunzip -c ~{reference_fasta} > "$ref_fasta" || cp ~{reference_fasta} "$ref_fasta"

- bwa index $ref_fasta
+ bwa index "$ref_fasta"

- tar -czf ~{bwadb_out_name} ${ref_fasta}*
+ tar -czf ~{bwadb_out_name} "${ref_fasta}*"
  >>>

  runtime {
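One thing worth checking in the `build_bwa_db` change: a `*` inside double quotes is not expanded by the shell, so `"${ref_fasta}*"` reaches `tar` as one literal name ending in `*` rather than as the list of index files. A minimal sketch of the difference, using hypothetical file names in a scratch directory:

```shell
#!/bin/sh
# Hypothetical demo: count how many arguments each quoting style produces.
count_args() { echo "$#"; }

demo_dir=$(mktemp -d)
ref_fasta="$demo_dir/ref.fa"
touch "$ref_fasta" "$ref_fasta.amb" "$ref_fasta.ann"

# Glob inside the quotes: no expansion, one literal argument ending in "*".
quoted_glob=$(count_args "${ref_fasta}*")

# Variable quoted, glob outside the quotes: expands to the three files.
unquoted_glob=$(count_args "${ref_fasta}"*)

echo "quoted: $quoted_glob, glob outside quotes: $unquoted_glob"
rm -r "$demo_dir"
```

Writing the argument as `"${ref_fasta}"*` keeps the protection against word splitting while still letting the glob expand. It is also worth double-checking the `bwa_mem` hunk: `bwa mem` takes the read group header through `-R`, whereas the added line passes `-r`, which is the read group option for `bwa samse` and `bwa sampe`.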
47 changes: 35 additions & 12 deletions tools/cellranger.wdl
@@ -10,8 +10,8 @@ task count {
  String id
  File transcriptome_tar_gz
  File fastqs_tar_gz
- String sample_id
- Int cpu = 8
+ String? sample_id
+ Int ncpu = 8
  Int memory_gb = 16
  String jobmode = "local"
  Int max_retries = 1
@@ -20,24 +20,37 @@ task count {

  Float fastq_size = size(fastqs_tar_gz, "GiB")
  Int disk_size = ceil((fastq_size * 2) + 10)
+ String parsed_detect_nproc = if detect_nproc then "true" else ""

  command <<<
+ set -euo pipefail
+
+ n_cores=~{ncpu}
+ if [ -n ~{parsed_detect_nproc} ]
+ then
+ n_cores=$(nproc)
+ fi
+
  mkdir transcriptome_dir
  tar zxf ~{transcriptome_tar_gz} -C transcriptome_dir --strip-components 1

  mkdir fastqs
  tar zxf ~{fastqs_tar_gz} -C fastqs

- files=(fastqs/*.fastq.gz)
- sample_id=$(basename ${files[0]} "_S1_L001_R1_001.fastq.gz")
+ if [ -z ~{sample_id} ]; then
+ files=(fastqs/*.fastq.gz)
+ # expected sample name extension comes from:
+ # https://support.illumina.com/content/dam/illumina-support/documents/documentation/software_documentation/bcl2fastq/bcl2fastq2-v2-20-software-guide-15051736-03.pdf
+ sample_id="$(basename "${files[0]}" '_S1_L001_R1_001.fastq.gz')"
+ fi

  cellranger count \
  --id ~{id} \
  --transcriptome transcriptome_dir \
  --fastqs fastqs \
- --sample ${sample_id} \
+ --sample "${sample_id}" \
  --jobmode ~{jobmode} \
- --localcores ~{cpu} \
+ --localcores "$n_cores" \
  --localmem ~{memory_gb} \
  --disable-ui

@@ -47,9 +60,9 @@ task count {
  runtime {
  memory: memory_gb + " GB"
  disk: disk_size + " GB"
- docker: "ghcr.io/stjudecloud/cellranger:1.0.0"
+ docker: "ghcr.io/stjudecloud/cellranger:1.1.0"
  maxRetries: max_retries
- cpu: cpu
+ cpu: ncpu
  }

  output {
@@ -88,10 +101,11 @@ task bamtofastq {
  File bam
  Int ncpu = 4
  Int memory_gb = 8
- Int max_retries = 1
  Boolean cellranger11 = false
  Boolean longranger20 = false
  Boolean gemcode = false
+ Boolean detect_nproc = false
+ Int max_retries = 1
  }

  Float bam_size = size(bam, "GiB")
@@ -101,17 +115,26 @@ task bamtofastq {
  else if (longranger20) then "--lr10"
  else if (gemcode) then "--gemcode"
  else ""
+ String parsed_detect_nproc = if detect_nproc then "true" else ""

  command <<<
- bamtofastq --nthreads ~{ncpu} ~{data_arg} ~{bam} fastqs
+ set -euo pipefail
+
+ n_cores=~{ncpu}
+ if [ -n ~{parsed_detect_nproc} ]
+ then
+ n_cores=$(nproc)
+ fi
+
+ bamtofastq --nthreads "$n_cores" ~{data_arg} ~{bam} fastqs
  cd fastqs/*/
- tar -zcf archive.tar.gz *.fastq.gz
+ tar -zcf archive.tar.gz ./*.fastq.gz
  >>>

  runtime {
  memory: memory_gb + " GB"
  disk: disk_size + " GB"
- docker: "ghcr.io/stjudecloud/cellranger:1.0.0"
+ docker: "ghcr.io/stjudecloud/cellranger:1.1.0"
  maxRetries: max_retries
  cpu: ncpu
  }
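The `detect_nproc` test in both `count` and `bamtofastq` interpolates `~{parsed_detect_nproc}` unquoted. When the placeholder renders as an empty string the shell sees `[ -n ]`, and a one-argument test is true whenever that argument is non-empty, so the `nproc` branch appears to be taken either way. A minimal sketch, with a shell variable standing in for the rendered WDL placeholder (an assumption made for illustration):

```shell
#!/bin/sh
# Hypothetical stand-in for the rendered WDL placeholder (empty means false).
flag=""

# Unquoted: expands to `[ -n ]`, which tests the literal string "-n" and is true.
if [ -n $flag ]; then unquoted_result="taken"; else unquoted_result="skipped"; fi

# Quoted: expands to `[ -n "" ]`, which is false, as intended.
if [ -n "$flag" ]; then quoted_result="taken"; else quoted_result="skipped"; fi

echo "unquoted: $unquoted_result, quoted: $quoted_result"
```

In the WDL command this would correspond to writing `[ -n "~{parsed_detect_nproc}" ]`, and the same quoting applies to `[ -z ~{sample_id} ]` in `count`. Separately, when `sample_id` is supplied as an input, the shell variable `sample_id` used by `--sample` does not appear to be assigned anywhere in the hunk shown, which `set -u` would turn into an error.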
2 changes: 1 addition & 1 deletion tools/estimate.wdl
@@ -51,7 +51,7 @@ END
  runtime {
  memory: "4 GB"
  disk: "4 GB"
- docker: 'ghcr.io/stjudecloud/util:1.1.0'
+ docker: 'ghcr.io/stjudecloud/util:1.2.0'
  maxRetries: max_retries
  }

2 changes: 1 addition & 1 deletion tools/fq.wdl
@@ -26,7 +26,7 @@ task fqlint {
  runtime {
  disk: disk_size + " GB"
  memory: memory_gb + " GB"
- docker: 'ghcr.io/stjudecloud/fqlib:1.0.1'
+ docker: 'ghcr.io/stjudecloud/fqlib:1.2.0'
  maxRetries: max_retries
  }

4 changes: 2 additions & 2 deletions tools/md5sum.wdl
@@ -24,7 +24,7 @@ task compute_checksum {
  runtime {
  disk: disk_size + " GB"
  memory: memory_gb + " GB"
- docker: 'ghcr.io/stjudecloud/util:1.0.0'
+ docker: 'ghcr.io/stjudecloud/util:1.2.0'
  maxRetries: max_retries
  }

@@ -59,7 +59,7 @@ task check_checksum {

  runtime {
  disk: disk_size + " GB"
- docker: 'ghcr.io/stjudecloud/util:1.0.0'
+ docker: 'ghcr.io/stjudecloud/util:1.2.0'
  maxRetries: max_retries
  }

2 changes: 1 addition & 1 deletion tools/ngsderive.wdl
@@ -183,7 +183,7 @@ task junction_annotation {
  -o ~{prefix}.junction_summary.txt \
  ~{bam}

- mv $(basename ~{bam}.junctions.tsv) ~{prefix}.junctions.tsv
+ mv "$(basename ~{bam}.junctions.tsv)" "~{prefix}.junctions.tsv"
  gzip ~{prefix}.junctions.tsv
  }
