@lebovic lebovic commented Dec 16, 2021

Adds:

Modifies:

Integration tests: passing
Unit tests: passing
Manual tests: passing

bcai2 and others added 10 commits December 8, 2021 12:48
Adds support for inputs that are provided as AWS S3 URIs.

Parallelization is disabled for any job that uses an S3 URI input, since this would require redownloading the file from S3, splicing locally, and reuploading the spliced inputs to S3.

Note: Uploading is skipped since the input file will already be in S3. However, the status is still updated to TRANSFERRING_FROM_CLIENT and TRANSFERRING_TO_CLIENT, like the status path for a locally uploaded file.
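A minimal sketch of the gating described above, assuming inputs arrive as plain strings and that an S3 input is recognized by its `s3://` scheme. The names `input_is_in_s3` and `inputs_are_in_s3` follow the naming used in this PR, but the bodies are illustrative only, and `can_parallelize` is a hypothetical helper that does not appear in the PR.

```python
from typing import Sequence
from urllib.parse import urlparse


def input_is_in_s3(path: str) -> bool:
    """Return True if the input looks like an S3 URI (s3://bucket/key)."""
    parsed = urlparse(path)
    has_bucket = bool(parsed.netloc)
    has_key = bool(parsed.path.lstrip("/"))
    return parsed.scheme == "s3" and has_bucket and has_key


def inputs_are_in_s3(paths: Sequence[str]) -> bool:
    """Return True if any input of the job is provided as an S3 URI."""
    return any(input_is_in_s3(p) for p in paths)


def can_parallelize(paths: Sequence[str], tool_supports_parallelism: bool) -> bool:
    # Hypothetical helper: parallelization is switched off as soon as one
    # input lives in S3, because splitting it would mean downloading the
    # file, splitting it locally, and uploading the pieces again.
    return tool_supports_parallelism and not inputs_are_in_s3(paths)
```

With this shape, a job with inputs `["a.fastq", "s3://bucket/b.fastq"]` would run serially even for a tool that otherwise supports parallel execution.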
Adds a check on the size of an input provided as an S3 URI, as a safeguard. The limit is the same as the size of a user-uploaded file (4.5 GB).
Modifies:
- Refactors propagation of S3 tags for inputs in tools/tool.py and api/query.py.
- Enforces uniform naming scheme of variables/methods (inputs_are_in_s3(), input_is_in_s3).
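The size safeguard for S3 inputs could look roughly like the sketch below. It assumes 4.5 GB means decimal gigabytes (the PR does not say GB or GiB) and takes the size lookup as a callable so the example stays self-contained. `check_s3_input_size`, `InputTooLargeError`, and `MAX_INPUT_BYTES` are hypothetical names, not identifiers from this repository.

```python
from typing import Callable

# Assumption: 4.5 GB is read as decimal gigabytes (4.5 * 10^9 bytes).
MAX_INPUT_BYTES = int(4.5 * 1000**3)


class InputTooLargeError(ValueError):
    """Hypothetical error raised when an input exceeds the size limit."""


def check_s3_input_size(
    uri: str,
    get_size: Callable[[str], int],
    limit: int = MAX_INPUT_BYTES,
) -> int:
    """Return the object's size in bytes, or raise if it is over the limit.

    `get_size` is any function that maps an S3 URI to a byte count. With
    boto3, that could be built on the `ContentLength` field returned by
    `head_object`, which avoids downloading the object.
    """
    size = get_size(uri)
    if size > limit:
        raise InputTooLargeError(
            f"{uri} is {size} bytes; the limit is {limit} bytes"
        )
    return size
```

Passing the lookup in as a parameter also makes the check easy to unit test without network access.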
Adds:
- More STAR args
- Multiple levels of tool_arg handling (whitelist, dangerlist, blacklist)
- Error on unknown or blacklisted args
- Reduce complexity (validation and parallelization for now) if a dangerous argument is passed
Modifies:
- Test tool per-file limit from 4.5 GB (default) to 256 GB
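The tiered argument handling above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the three sets hold placeholder argument names, `check_tool_args` and `ArgCheck` are hypothetical, and "reduce complexity" is read here as switching off validation and parallelization.

```python
from dataclasses import dataclass
from typing import Iterable

# Placeholder argument names; the real per-tool lists are not shown in this PR.
WHITELIST = frozenset({"--safe-arg"})
DANGERLIST = frozenset({"--risky-arg"})
BLACKLIST = frozenset({"--forbidden-arg"})


@dataclass(frozen=True)
class ArgCheck:
    """Which optional pipeline steps remain enabled for a job."""
    validate: bool
    parallelize: bool


def check_tool_args(arg_names: Iterable[str]) -> ArgCheck:
    names = list(arg_names)
    for name in names:
        # Blacklisted and unknown arguments are both rejected outright.
        if name in BLACKLIST:
            raise ValueError(f"blacklisted argument: {name}")
        if name not in WHITELIST and name not in DANGERLIST:
            raise ValueError(f"unknown argument: {name}")
    # A dangerous argument is allowed, but the job runs in a simpler mode.
    dangerous = any(name in DANGERLIST for name in names)
    return ArgCheck(validate=not dangerous, parallelize=not dangerous)
```

Rejecting unknown arguments by default means a new tool flag has to be placed in one of the three lists before users can pass it.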
… sizes (#63)

Modifies:
- Which value is used to limit per-tool upload sizes
Modifies:
- Actually download files
@lebovic lebovic changed the title v0.7.37 v0.7.x Dec 16, 2021
@lebovic lebovic marked this pull request as ready for review December 17, 2021 19:12
lebovic and others added 2 commits December 20, 2021 12:29
…across the board (#68)

Adds:
- Output file compression across the board, except if parallelism is actively being used.
- Explicit `parallelize=True` flag.
- 32 GB of pure EBS disk for shi7 and an increased limit of 5 GB per file. This can be trivially increased.

Modifies:
- Output file format for many tools.

Removes:
- cutadapt (no longer in use)
- shi7 parallelization (no longer needed)
- shogun parallelization (no longer needed)

Associated design doc: https://docs.google.com/document/d/1vn26gPgLHvSqDREpLXoFp8LV_5wDhqWRGXkCBuKTknM/edit#
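As a rough illustration of the compression rule in this commit, the sketch below compresses an output file unless the job was run with `parallelize=True`. The gzip format, the `.gz` suffix, and the removal of the uncompressed file are assumptions; the PR does not state them. `finalize_output` is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path


def finalize_output(path: Path, parallelize: bool = False) -> Path:
    """Compress an output file unless the job is running in parallel."""
    if parallelize:
        # Outputs of a parallel run are left uncompressed.
        return path
    compressed = path.with_name(path.name + ".gz")
    # Stream the copy so large outputs are never held fully in memory.
    with path.open("rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()  # Assumption: only the compressed copy is kept.
    return compressed
```

Making `parallelize` an explicit keyword with a `False` default mirrors the explicit flag added in this commit: compression is the default path, and skipping it is opt-in.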
Sets the default database for bowtie2 to GRCh38_noalt_as (the no-alt GRCh38 analysis set), found at https://benlangmead.github.io/aws-indexes/bowtie.

Co-authored-by: Noah Lebovic <noah@lebovic.com>
@lebovic lebovic changed the title v0.7.x v0.7.39 Dec 20, 2021
@lebovic lebovic merged commit 2ce77f4 into main Dec 20, 2021
2 participants