
ci updates #70

Merged
merged 29 commits into from
Jan 27, 2023
Conversation

a-frantz
Member

@a-frantz a-frantz commented Dec 29, 2022

This PR started to address 2 issues:

  1. the CI on master is currently failing.
  2. back in 2021 I opened issue #56 (Add CI for checking all tasks pull current docker images) and wanted to see it finally addressed. #56 is concerned with tasks pulling images that are "out of date" and no longer being built by the CI. Presumably there's always a good reason we stop building any particular image, so we only want to pull images which are "current". Note this doesn't mean "latest", as for some images we maintain multiple versions. Most significantly this applies to the util images, where each version bump comes with more dependencies (and a larger size). So we want to maintain multiple versions and treat them all as valid options.

From there, I also updated the CI to enforce URL WDL imports. This is really just for consistency, because from my testing it looks like relative imports are handled appropriately in all the runners I checked. I'd be fine dropping this specific change.

Part of the reason the CI was failing on master is that we pull from SEAseq master, which currently has syntax miniwdl check doesn't like. I don't like our CI being dependent on the state of another repo, so @adthrasher and I decided it would be better to require that external repos be imported via tagged releases instead of master.
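A rule like that could be sketched as a small shell check (hypothetical, not the actual lint workflow; the sample import lines below are stand-ins for real workflow files):

```shell
# Hypothetical CI check: every WDL import must be a URL, and imports from
# external repos must not reference the master branch.
imports='import "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/workflows/mapping.wdl" as seaseq_map
import "../tools/util.wdl" as util
import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/tasks/seaseq_util.wdl" as seaseq_util'

violations=0
while IFS= read -r line; do
  case "$line" in
    'import "https://'*) ;;   # URL import: OK so far, still check for master below
    import*) echo "relative import: $line"; violations=$((violations + 1)); continue ;;
    *) continue ;;            # not an import line
  esac
  case "$line" in
    */master/*) echo "unpinned import (master): $line"; violations=$((violations + 1)) ;;
  esac
done <<EOF
$imports
EOF
echo "violations: $violations"
```

Run over a real checkout, the loop would read import lines gathered from every `*.wdl` file instead of the hardcoded sample.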

I then went through and addressed all the issues miniwdl+shellcheck complained about (and found some legitimate bugs in the process).
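That sweep can be sketched as a small script (hypothetical; the real checks run in the GitHub Actions workflows). It uses an empty temp directory here so it runs anywhere; a real run would point at the repository root:

```shell
# Run `miniwdl check` on every WDL file and shellcheck on every shell
# script, skipping any linter that is not installed instead of failing.
workdir=$(mktemp -d)   # stand-in for the repository root
status=0
if command -v miniwdl >/dev/null 2>&1; then
  for f in $(find "$workdir" -name '*.wdl'); do
    miniwdl check "$f" || status=1
  done
fi
if command -v shellcheck >/dev/null 2>&1; then
  for f in $(find "$workdir" -name '*.sh'); do
    shellcheck "$f" || status=1
  done
fi
rm -rf "$workdir"
echo "lint status: $status"
```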

@a-frantz a-frantz self-assigned this Dec 29, 2022
@a-frantz
Member Author

The miniwdl-check CI is going to continue to fail until Modupe fixes the SEAseq repo. But once those coercions are addressed and she replaces the relative imports in her repo, it will start passing (and this repo should be more stable and less dependent on the state of external repos).

.github/workflows/lint-check.yml (2 resolved threads)

WORKDIR /opt

ARG cellranger_url
Member

My inclination would be to have the ARG declarations directly under the FROM statement. What do you think?

Member Author

From my testing (I didn't read the docs to confirm), the ARG only applies to the RUN command immediately following it. At least, the build was failing when I had it as you suggested.

Member

OK, that wasn't my understanding, but I don't think it is important enough to hold this up.
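For reference, the Dockerfile ARG scoping rules may explain the disagreement above: an ARG declared before FROM is only in scope for FROM instructions and must be re-declared inside the build stage to be visible there, while an ARG declared after FROM stays in scope for the rest of that stage, not just the next RUN. A minimal sketch (hypothetical base image; `cellranger_url` is the variable from the Dockerfile under review):

```dockerfile
# In scope only for FROM lines; not visible inside the build stage
ARG cellranger_url
FROM ubuntu:22.04
# Re-declared: now in scope for all remaining instructions in this stage
ARG cellranger_url
WORKDIR /opt
RUN wget -O cellranger.tar.gz "$cellranger_url" \
    && tar -xzf cellranger.tar.gz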

Member Author

@adthrasher I haven't pushed a cellranger:1.1.0 image. How do we want to handle which version of CellRanger to use? 1.0.0 uses cellranger 7.0.1. The link we provide in the new Dockerfile only supplies whatever the latest version is. Manually push with 7.0.1 because that's what's been tested? Bump to latest?

Member
@claymcleod claymcleod Jan 24, 2023

I agree with Andrew Thrasher. I just moved these to the top and tried the docker build, and it worked fine for me.

Also, I don't think we want to be actually packaging cell ranger and distributing it via GitHub packages. @adthrasher have we looked at the licensing at all around this?

Member

Yeah, we should probably get their license reviewed. The package is set to internal for now. Even that may not technically be allowable based on the terms of the 10x license.

tools/util.wdl (resolved thread)
- import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/workflows/mapping.wdl" as seaseq_map
- import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/tasks/seaseq_util.wdl" as seaseq_util
+ import "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/workflows/mapping.wdl" as seaseq_map
+ import "https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/tasks/seaseq_util.wdl" as seaseq_util
Member

I think we should enforce a single version for each external package. I don't recall the differences between 2.3 and 3.0 of SEAseq, but I would probably just pin all three imports to 2.3 if that's what is needed.

Member Author

That makes sense. I have no strong opinion either way, will change.

Member Author

Do you want this enforced by CI? Or just enforced during PRs? I think it's fairly obvious during review because all the imports are together.

If we want to enforce by CI, should it be per file or repo wide? As in can one workflow use seaseq/3.0 and another use seaseq/2.3?
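If we did enforce it repo-wide, the check could be as small as this (hypothetical; `stjude/seaseq` used as the example external package, with a hardcoded import list standing in for URLs scraped from the workflows):

```shell
# Extract the ref pinned in each seaseq import URL and count distinct values.
imports='https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/workflows/mapping.wdl
https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/tasks/seaseq_util.wdl'

versions=$(printf '%s\n' "$imports" \
  | sed -n 's|.*/stjude/seaseq/\([^/]*\)/.*|\1|p' \
  | sort -u)
count=$(printf '%s\n' "$versions" | grep -c .)
if [ "$count" -gt 1 ]; then
  echo "seaseq is pinned to $count different versions:"
  printf '%s\n' "$versions"
fi
```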

Member Author

Update: I made this change, and it turns out "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/tasks/seaseq_util.wdl" doesn't exist. So having them all on the same version won't work; we have to revert that change... How important to you is it that they all pull from the same version?

Member

Ah, yes. Modupe renamed "util.wdl" to "seaseq_util.wdl" at some point. So we may need to keep the mixed versions for now.

Member

What's the issue with just using https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/workflows/mapping.wdl?

Member Author

v3.0 of SEAseq has some coercions that are considered fine by the spec and dxWDL (which is what the SEAseq team uses as their runner), but cause errors in miniwdl. I'd say this is a case of miniwdl misinterpreting the WDL spec, but regardless, it causes our CI to fail. Rolling back to "seaseq/2.3/workflows/workflows/mapping.wdl" fixes the problem of failing CI.

workflows/chipseq/chipseq-standard.wdl (outdated, resolved thread)
@a-frantz
Member Author

I discovered this morning that "docker://ghcr.io/stjudecloud/fqlib:1.1.0" has been broken (but working in Cromwell until yesterday? Not sure what was going on there; I haven't investigated). I opted to switch the docker import from the version we maintain to the version BioContainers maintains, and I plan to entirely drop the fqlib Dockerfile from this repository. It's already maintained by Michael, and I don't see any good reason to duplicate that work (and introduce errors) here.

This made me rethink our initial decision to build and maintain all these docker images that already exist in larger repositories. Are we sure we want to be doing this? I'm thinking we should switch as many of our docker images as we can to officially maintained repos. There are a few of our Docker images for which no alternatives exist, but I don't see the rationale for maintaining tools like samtools and picard when those are available from larger orgs.

What was our original rationale for building everything we use manually (or through CI)? @claymcleod

@a-frantz
Member Author

Turns out that there's something weird with how singularity and quay.io interact, meaning we can't use images maintained by BioContainers (so we do have a good reason to build our own fqlib image). I was planning on switching most of our images to those maintained by BioContainers, but that option has been ruled out. I'm sure there are non-Quay alternatives out there for a lot of our tools, so I'd still like to consider dropping a lot of the images we build (I just haven't researched the specific alternatives yet).

@adthrasher
Member

Can you add some additional detail on this? I'm curious exactly what is happening here.

@a-frantz
Member Author

$ singularity exec --containall docker://quay.io/biocontainers/fq:0.9.1 pwd
FATAL:   Unable to handle docker://quay.io/biocontainers/fq:0.9.1 uri: failed to get checksum for docker://quay.io/biocontainers/fq:0.9.1: Error reading manifest 0.9.1 in quay.io/biocontainers/fq: manifest unknown: manifest unknown
$ singularity exec --containall docker://quay.io/biocontainers/samtools:latest pwd
FATAL:   Unable to handle docker://quay.io/biocontainers/samtools:latest uri: failed to get checksum for docker://quay.io/biocontainers/samtools:latest: Error reading manifest latest in quay.io/biocontainers/samtools: manifest unknown: manifest unknown

When I saw that error I did some googling, and apparently this is a known issue: Quay.io isn't supported in older versions of singularity. The ticket says it's been fixed, and the version of singularity on the cluster (v3.7.1-1.el7) is newer than the issue, so I'm not sure exactly what's happening. But every image from BioContainers I tried failed.

@adthrasher
Member

Interesting, in the first couple I tried, it worked as expected.

(base) singularity exec --containall docker://quay.io/biocontainers/bioconductor-rsamtools@sha256:0b5b5ad4aba860e0edd1a0a588f21b9e77ed5e8f7b8685a23a7134caa0840b92  pwd
INFO:    Using cached SIF image
WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
/home/athrashe

I was able to run it on the head node and a compute node. My tests with Cromwell also succeeded.

I think you may need to check the tags. I see quay.io/biocontainers/fq:0.9.1--h9ee0642_0, but not without the suffix.

@a-frantz
Member Author

Ah, OK. So the tags I expected just aren't there (and apparently there's no default tag if you leave one out). So nothing's wrong; I just made some bad assumptions.

Looks like we still can consider offloading our image maintenance to BioContainers where applicable.
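For anyone hitting the same wall: the full tags can be listed before pulling (e.g. with `skopeo list-tags docker://quay.io/biocontainers/fq`). A sketch of matching a bare version to the suffixed BioContainers tag, using a hardcoded sample tag list so it runs offline (only the `0.9.1--h9ee0642_0` entry is from the thread; the others are made up for illustration):

```shell
# BioContainers tags carry a build suffix (e.g. 0.9.1--h9ee0642_0); a bare
# 0.9.1 tag does not exist. Given a tag list, pick the build for a version.
tags='0.8.0--h9ee0642_0
0.9.1--h9ee0642_0
0.10.0--h9ee0642_0'
version='0.9.1'
full_tag=$(printf '%s\n' "$tags" | grep "^${version}--" | head -n 1)
echo "docker://quay.io/biocontainers/fq:${full_tag}"
```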

@a-frantz
Member Author

I'd like to get this merged ASAP because I just created 3 new branches for other changes based off master, which has failing CI. I think the discussion about whether we want to be maintaining all these images can wait for another PR.

I've tested QC and it works as expected. Are there any other changes or testing we want to see before this gets merged? @adthrasher @claymcleod

@claymcleod
Member

What was our original rationale for building everything we use manually (or through CI)? @claymcleod

When we first started the workflows repo, there simply weren't that many official docker containers created. I'm in support of using official Docker containers if they exist.

@a-frantz
Member Author

Great. I think this PR has grown large enough, so I'll open a new PR to address this specific issue.

@adthrasher
Member

adthrasher commented Jan 24, 2023

For completeness, @a-frantz and I spoke earlier after he encountered the Docker hub rate limit issue. We can switch to official images if they're hosted in GitHub container registry, quay.io, and possibly others. Docker hub hosted images will continue to be problematic due to pull rate limiting. We could use those as well, but we would need to pay for an appropriate number of Docker subscriptions and ensure authentication is part of our process documentation.

tools/fq.wdl (outdated, resolved thread)

workflows/chipseq/chipseq-standard.wdl (outdated, resolved thread)
workflows/rnaseq/rnaseq-standard-fastq.wdl (resolved thread)
workflows/rnaseq/rnaseq-standard.wdl (resolved thread)


@claymcleod
Member

Also, are we sure the cellranger image works? I just tried getting inside the Docker container and just running the command and got this:

$ docker run -it cellranger:latest
root@2d6a0b69114f:/opt# cellranger
qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such file or directory
root@2d6a0b69114f:/opt#

@adthrasher
Member

Ah yes, I built the image on my mac (apple silicon) without a platform argument. Looking at the image, it's arm64.

        "Architecture": "arm64",

@adthrasher
Member

Rebuilding on a DX node:

dnanexus@job-GP7zy709xbP3Xpk605K4q84g:~/workflows/docker/cellranger/1.1.0$ docker run -it cellranger:latest
root@1b344e701c87:/opt# cellranger
cellranger cellranger-7.1.0
Process 10x Genomics Gene Expression, Feature Barcode, and Immune Profiling data

USAGE:
    cellranger <SUBCOMMAND>

OPTIONS:
    -h, --help       Print help information
    -V, --version    Print version information

SUBCOMMANDS:
    count               Count gene expression (targeted or whole-transcriptome) and/or feature barcode reads from a single sample and GEM well
    multi               Analyze multiplexed data or combined gene expression/immune profiling/feature barcode data
    multi-template      Output a multi config CSV template
    vdj                 Assembles single-cell VDJ receptor sequences from 10x Immune Profiling libraries
    aggr                Aggregate data from multiple Cell Ranger runs
    reanalyze           Re-run secondary analysis (dimensionality reduction, clustering, etc)
    targeted-compare    Analyze targeted enrichment performance by comparing a targeted sample to its cognate parent WTA sample (used as input for targeted gene expression)
    targeted-depth      Estimate targeted read depth values (mean reads per cell) for a specified input parent WTA sample and a target panel CSV file
    mkvdjref            Prepare a reference for use with CellRanger VDJ
    mkfastq             Run Illumina demultiplexer on sample sheets that contain 10x-specific sample index sets
    testrun             Execute the 'count' pipeline on a small test dataset
    mat2csv             Convert a gene count matrix to CSV format
    mkref               Prepare a reference for use with 10x analysis software. Requires a GTF and FASTA
    mkgtf               Filter a GTF file by attribute prior to creating a 10x reference
    upload              Upload analysis logs to 10x Genomics support
    sitecheck           Collect linux system configuration information
    help                Print this message or the help of the given subcommand(s)

@a-frantz
Member Author

@adthrasher we still need to push a cellranger:1.1.0 image and switch to using that version in our tasks. Do you think there would be any problems using v7.1.0 instead of the version you packaged in cellranger:1.0.0? I guess the licensing concerns also need to be addressed. Will those hold up merging this PR?

If so, I'd rather just drop the cellranger changes from this PR and move them to the PR I'll make later with other Docker changes.

@adthrasher
Member

I am fine with bumping to 7.1.0. The image is currently set to internal, which should be OK and enable us to use this workflow internally. That is a bit of a departure from our normal operations, but we may have no choice.

@a-frantz a-frantz merged commit 808db2f into master Jan 27, 2023
@a-frantz a-frantz deleted the ci branch January 27, 2023 22:28
a-frantz added a commit that referenced this pull request Jan 27, 2023
* master:
  fix: missing import statements
  fix(scrnaseq-standard): missing import statement
  ci updates (#70)
  fix(conf/lsf): bind singularity workdir to cromwell workdir
  chore: remove old unused `file_prefix` task
  feat(tools/fastq_screen): cleanup reference files

# Conflicts:
#	docker/fqlib/1.2.0/Dockerfile
#	tools/fq.wdl
#	workflows/chipseq/chipseq-standard.wdl
#	workflows/rnaseq/rnaseq-standard-fastq.wdl
#	workflows/rnaseq/rnaseq-standard.wdl
#	workflows/scrnaseq/10x-bam-to-fastqs.wdl
#	workflows/scrnaseq/scrnaseq-standard.wdl
a-frantz added a commit that referenced this pull request Jan 27, 2023
* master:
  fix: missing import statements
  fix(scrnaseq-standard): missing import statement
  ci updates (#70)
  fix(conf/lsf): bind singularity workdir to cromwell workdir
  chore: remove old unused `file_prefix` task
a-frantz added a commit that referenced this pull request Jan 27, 2023
* master:
  fix: missing import statements
  fix(scrnaseq-standard): missing import statement
  ci updates (#70)
  fix(conf/lsf): bind singularity workdir to cromwell workdir
  chore: remove old unused `file_prefix` task
  feat(tools/fastq_screen): cleanup reference files
  feat: output md5 checksum for all alignment workflows (#69)