v0.7.39 #62
Adds support for inputs that are provided as AWS S3 URIs. Parallelization is disabled for any job that uses an S3 URI input, since this would require redownloading the file from S3, splicing locally, and reuploading the spliced inputs to S3. Note: Uploading is skipped since the input file will already be in S3. However, the status is still updated to TRANSFERRING_FROM_CLIENT and TRANSFERRING_TO_CLIENT, like the status path for a locally uploaded file.
Adds a size check on any input provided as an S3 URI, as a safeguard. The limit is the same as the size limit for a user-uploaded file (4.5 GB).
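A minimal sketch of such a size safeguard, not the project's actual code. The names `parse_s3_uri`, `check_s3_input_size`, and `MAX_INPUT_BYTES` are hypothetical, and the sketch assumes 4.5 GB is counted in binary units. The object size lookup is passed in as a callable so the example runs without AWS access.

```python
from urllib.parse import urlparse

# Assumption: the 4.5 GB limit is measured in binary gigabytes.
MAX_INPUT_BYTES = int(4.5 * 1024**3)


def parse_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key); reject anything else."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    return parsed.netloc, key


def check_s3_input_size(uri, get_size, limit=MAX_INPUT_BYTES):
    """Return the object size in bytes, or raise if it exceeds the limit.

    get_size is a hypothetical callable (bucket, key) -> size in bytes.
    """
    bucket, key = parse_s3_uri(uri)
    size = get_size(bucket, key)
    if size > limit:
        raise ValueError(
            f"{uri} is {size} bytes, above the {limit}-byte input limit"
        )
    return size
```

With boto3, `get_size` could plausibly wrap `head_object(Bucket=bucket, Key=key)["ContentLength"]`, which reads the object's size without downloading it.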
Adds:
- More STAR args
- Add multiple levels of tool_arg handling (whitelist, dangerlist, blacklist)
- Error on unknown or blacklisted args
- Reduce complexity (validation and parallelization for now) if a dangerous argument is passed
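The three-level handling described above could be sketched as follows. This is an illustration under stated assumptions: the function name `classify_tool_args` is hypothetical, the assignment of STAR flags to each list is a placeholder rather than the project's real configuration, and "reduce complexity" is read here as switching off validation and parallelization.

```python
# Placeholder lists: which flags belong where is an assumption for illustration.
WHITELIST = {"--outSAMtype", "--readFilesCommand"}
DANGERLIST = {"--outFilterMultimapNmax"}
BLACKLIST = {"--runThreadN", "--genomeDir"}


def classify_tool_args(tool_args):
    """Check user-supplied argument names against the three lists.

    Raises ValueError on blacklisted or unknown arguments. Returns a dict
    saying whether validation and parallelization should stay enabled.
    """
    dangerous = False
    for name in tool_args:
        if name in BLACKLIST:
            raise ValueError(f"blacklisted argument: {name}")
        if name in DANGERLIST:
            dangerous = True
        elif name not in WHITELIST:
            raise ValueError(f"unknown argument: {name}")
    # Assumed reading: a dangerous argument turns off both features.
    return {"validate": not dangerous, "parallelize": not dangerous}
```

Rejecting unknown arguments by default means a new flag has to be placed on one of the lists deliberately before any job can use it.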
Modifies:
- Test tool per-file limit from 4.5 GB (default) to 256 GB

… sizes (#63)
Modifies:
- Which value is used to limit per-tool upload sizes

Modifies:
- Actually download files

Modifies:
- The STAR per-file limit
…across the board (#68)
Adds:
- Output file compression across the board, except if parallelism is actively being used.
- Explicit `parallelize=True` flag.
- 32 GB of pure EBS disk for shi7 and an increased limit of 5 GB per file. This can be trivially increased.
Modifies:
- Output file format for many tools.
Removes:
- cutadapt (no longer in use)
- shi7 parallelization (no longer needed)
- shogun parallelization (no longer needed)
Associated design doc: https://docs.google.com/document/d/1vn26gPgLHvSqDREpLXoFp8LV_5wDhqWRGXkCBuKTknM/edit#
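One way the compression rule and the explicit flag might interact is sketched below. The function names `parallelism_active` and `package_outputs` are hypothetical, and bundling all outputs into a single `.tar.gz` is an assumption; the real output layout is described in the design doc, not here.

```python
import tarfile
from pathlib import Path


def parallelism_active(parallelize, input_uri):
    """Parallelism needs the explicit flag and is off for S3 URI inputs."""
    return bool(parallelize) and not input_uri.startswith("s3://")


def package_outputs(output_paths, archive_path, parallel_active):
    """Compress outputs into one .tar.gz unless parallelism is in use.

    Returns the list of paths that should be handed back to the caller.
    """
    if parallel_active:
        # Assumed behavior: leave outputs uncompressed for this job.
        return [str(p) for p in output_paths]
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in output_paths:
            # Store each file under its base name, without directories.
            tar.add(path, arcname=Path(path).name)
    return [str(archive_path)]
```

Keeping the parallelism decision in one small function makes it easy to see that an S3 URI input overrides `parallelize=True`.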
Sets the default database for bowtie2 to GRCh38_noalt_as (the no-alt GRCh38 analysis set), found at https://benlangmead.github.io/aws-indexes/bowtie.
Co-authored-by: Noah Lebovic <noah@lebovic.com>
Adds:
- .tar.gzs across the board (Standardize output file interface and enable output file compression across the board #68)
- parallelize=True flag (Standardize output file interface and enable output file compression across the board #68)
Modifies:
- bowtie2DB to GRCh38 #67)
Integration tests: passing
Unit tests: passing
Manual tests: passing