-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci updates #70
ci updates #70
Conversation
This reverts commit 755fa1c.
(keep old cellranger Dockerfile)
allows external repos to pull from any branch/tag
The miniwdl-check CI is going to continue to fail till Modupe fixes the SeaSeq repo. But once those coercions are addressed and she replaces the relative imports in her repo that will start passing (and this repo should be more stable/less dependent on the state of external repos) |
docker/cellranger/1.1.0/Dockerfile
Outdated
|
||
WORKDIR /opt | ||
|
||
ARG cellranger_url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My inclination would be to have the ARG
declarations directly under the FROM
statement. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my testing (didn't read docs to confirm), the ARG
only applies to the RUN
command immediately following. At least the build was failing when I had it as you suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that wasn't my understanding, but I don't think it is important enough to hold this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adthrasher I haven't pushed a cellranger:1.1.0
image. How do we want to handle which version of CellRanger to use? 1.0.0
uses cellranger 7.0.1
. The link we provide in the new Dockerfile only supplies whatever the latest version is. Manually push with 7.0.1
because that's what's been tested? Bump to latest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Andrew Thrasher. I just moved these to the top and tried the docker build, and it worked fine for me.
Also, I don't think we want to be actually packaging cell ranger and distributing it via GitHub packages. @adthrasher have we looked at the licensing at all around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should probably get their license reviewed. The package is set to internal for now. Even that may not technically be allowable based on the terms of the 10x license.
import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/workflows/mapping.wdl" as seaseq_map | ||
import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/tasks/seaseq_util.wdl" as seaseq_util | ||
import "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/workflows/mapping.wdl" as seaseq_map | ||
import "https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/tasks/seaseq_util.wdl" as seaseq_util |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should enforce a single version for each external package. I don't recall the differences between 2.3 and 3.0 of SEAseq, but I would probably just pin all three imports to 2.3 if that's what is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. I have no strong opinion either way, will change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want this enforced by CI? Or just enforced during PRs? I think it's fairly obvious during review because all the imports are together.
If we want to enforce by CI, should it be per file or repo wide? As in can one workflow use seaseq/3.0
and another use seaseq/2.3
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update, made this change and turns out "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/tasks/seaseq_util.wdl" doesn't exist. So having them all on the same version won't work. We have to revert that change... How important to you is it that they all pull from the same version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. Modupe renamed "util.wdl" to "seaseq_util.wdl" at some point. So we may need to keep the mixed versions for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the issue with just using https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/workflows/mapping.wdl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v3.0
of SEAseq has some coercions that are considered fine by the spec and dxWDL
(which is what the SEAseq team uses as their runner), but cause errors in miniwdl
. I'd say this is a case of miniwdl
misinterpreting the WDL spec, but regardless, it causes our CI to fail. Rolling back to "seaseq/2.3/workflows/workflows/mapping.wdl" fixes the problem of failing CI.
haven't investigated, just switching to (hopefully) more stable `quay`
I discovered this morning that "docker://ghcr.io/stjudecloud/fqlib:1.1.0" has been broken (but working in Cromwell until yesterday? Not sure what was going on there, haven't done an investigation). I opted to switch the docker import from the version we maintain to the version "BioContainers" maintains, and plan to entirely drop the This made me rethink our initial decision to build and maintain all these docker images that already exist in larger repositories. Are we sure we want to be doing this? I'm thinking we should switch as many of our docker images to officially maintained repos as we can. There are a few of our Docker images that alternatives don't exist for, but I don't see the rationale for maintaining tools like What was our original rationale for building everything we use manually (or through CI)? @claymcleod |
Turns out that there's something weird with how singularity and |
Can you add some additional detail on this? I'm curious exactly what is happening here. |
When I saw that error I did some googling, and apparently this is a known issue that Quay.io isn't supported in older version of singularity. The ticket says it's been fixed, and the version of singularity on the cluster is newer than the issue (v3.7.1-1.el7). So I'm not sure exactly what's happening. But every image from BioContainers I tried failed. |
Interesting, in the first couple I tried, it worked as expected.
I was able to run it on the head node and a compute node. My tests with Cromwell also succeeded. I think you may need to check the tags. I see |
Ah ok. So the tags I expected just aren't there (and apparently there's no default tag if you leave one out). So nothing's wrong, I just made some bad assumptions. Looks like we still can consider offloading our image maintenance to BioContainers where applicable. |
I'd like to get this merged ASAP because I just created 3 new branches for other changes based off I've tested QC and it works as expected. Is there any other changes/testing we want to see before this gets merged? @adthrasher @claymcleod |
When we first started the workflows repo, there simply weren't that many official docker containers created. I'm in support of using official Docker containers if they exist. |
Great. I think this PR has grown large enough, so I'll open a new PR to address this specific issue. |
For completeness, @a-frantz and I spoke earlier after he encountered the Docker hub rate limit issue. We can switch to official images if they're hosted in GitHub container registry, quay.io, and possibly others. Docker hub hosted images will continue to be problematic due to pull rate limiting. We could use those as well, but we would need to pay for an appropriate number of Docker subscriptions and ensure authentication is part of our process documentation. |
import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/workflows/mapping.wdl" as seaseq_map | ||
import "https://raw.githubusercontent.com/stjude/seaseq/master/workflows/tasks/seaseq_util.wdl" as seaseq_util | ||
import "https://raw.githubusercontent.com/stjude/seaseq/2.3/workflows/workflows/mapping.wdl" as seaseq_map | ||
import "https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/tasks/seaseq_util.wdl" as seaseq_util |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the issue with just using https://raw.githubusercontent.com/stjude/seaseq/3.0/workflows/workflows/mapping.wdl
?
docker/cellranger/1.1.0/Dockerfile
Outdated
|
||
WORKDIR /opt | ||
|
||
ARG cellranger_url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Andrew Thrasher. I just moved these to the top and tried the docker build, and it worked fine for me.
Also, I don't think we want to be actually packaging cell ranger and distributing it via GitHub packages. @adthrasher have we looked at the licensing at all around this?
Also, are we sure the cellranger image works? I just tried getting inside the Docker container and just running the command and got this:
|
Ah yes, I built the image on my mac (apple silicon) without a platform argument. Looking at the image, it's
|
Rebuilding on a DX node:
|
@adthrasher we still need to push a If so, I'd rather just drop the cellranger changes from this PR and move them to the PR I'll make later with other Docker changes. |
I am fine bumping to use 7.1.0. The image is currently set to |
* master: fix: missing import statements fix(scrnaseq-standard): missing import statement ci updates (#70) fix(conf/lsf): bind singularity workdir to cromwell workdir chore: remove old unused `file_prefix` task feat(tools/fastq_screen): cleanup reference files # Conflicts: # docker/fqlib/1.2.0/Dockerfile # tools/fq.wdl # workflows/chipseq/chipseq-standard.wdl # workflows/rnaseq/rnaseq-standard-fastq.wdl # workflows/rnaseq/rnaseq-standard.wdl # workflows/scrnaseq/10x-bam-to-fastqs.wdl # workflows/scrnaseq/scrnaseq-standard.wdl
* master: fix: missing import statements fix(scrnaseq-standard): missing import statement ci updates (#70) fix(conf/lsf): bind singularity workdir to cromwell workdir chore: remove old unused `file_prefix` task
* master: fix: missing import statements fix(scrnaseq-standard): missing import statement ci updates (#70) fix(conf/lsf): bind singularity workdir to cromwell workdir chore: remove old unused `file_prefix` task feat(tools/fastq_screen): cleanup reference files feat: output md5 checksum for all alignment workflows (#69)
This PR started to address 2 issues:
master
is currently failing.util
images, where each version bump comes with more dependencies (and a larger size). So we want to maintain multiple versions and treat them all as valid options.From there, I also updated the CI to enforce URL WDL imports. This is really just for consistency, because from my testing it looks like relative imports are handled appropriately in all the runners I checked. I'd be fine dropping this specific change.
Part of the reason the CI was failing on
master
, is because we pull from SEAseqmaster
, which currently has syntaxminiwdl check
doesn't like. I don't like our CI being dependent on the state of another repo. So me and @adthrasher decided it would be better if we enforce external repos to use tagged releases instead ofmaster
.I then went through and addressed all the issues
miniwdl
+shellcheck
complained about (and found some legitimate bugs in the process).