Add gene coverage columns during ingest workflow #36
Commits on Apr 16, 2024
Commit 9839bd1
Add genome_coverage and indicator (True/blank) variable for E_coverage
This uses the Nextclade "coverage" column as "genome_coverage" and the Nextclade "failedCdses" column to check whether E coverage is present.
fixup: use 1 instead of true
Commit 2a9eee4
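The commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the column names "coverage", "failedCdses", and "genome_coverage" come from the message, while the helper name `coverage_columns`, the "E_indicator" output key, and the assumption that "failedCdses" is a comma-separated list of CDS names are illustrative.

```python
def coverage_columns(row: dict, gene: str = "E") -> dict:
    """Return genome_coverage plus a 1/blank indicator for one gene.

    `row` is one record from a Nextclade TSV, read as a dict of strings.
    """
    # Assumption: failedCdses holds comma-separated CDS names, empty if none failed.
    failed = {
        name.strip()
        for name in row.get("failedCdses", "").split(",")
        if name.strip()
    }
    return {
        # Reuse the Nextclade "coverage" value as the genome-wide coverage.
        "genome_coverage": row.get("coverage", ""),
        # "1" when the gene translated successfully, blank when it failed.
        f"{gene}_indicator": "" if gene in failed else "1",
    }
```

A row whose "failedCdses" value lists "E" would get a blank indicator; any other row would get "1".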
Commit 12947fa
Commit d22016f
Output Nextclade gene translations to FASTA files
This can be one gene or a set of genes; the translations can then be used to calculate gene_coverage columns.
Commit ddf0fb3
Only have final files in the "results" directory
Move intermediate files to the "data" folder
Commit 1a6d1ef
Add script to calculate gene coverage from Nextclade translated amino acid FASTA file
Commit 35bbb83
Adds the following rules for gene coverage:
* calculate_gene_coverage: calls a Python script that takes a Nextclade CDS translation FASTA and calculates (valid AA)/(total length). The resulting value is rounded to 3 significant figures.
* aggregate_gene_coverage_by_gene: combines the gene_coverage files by gene (e.g. ["E", "NS1"]) across all serotypes (e.g. denv1-4).
* appends_gene_coverage_columns: adds each gene_coverage column (e.g. "E_coverage", "NS1_coverage") to the final metadata.
Commit e0d2a77
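The (valid AA)/(total length) calculation described for calculate_gene_coverage could look roughly like the sketch below. It is not the script from this pull request: the function names are hypothetical, and treating "X" and "-" as the only invalid characters is an assumption about how missing or gapped residues appear in the translation FASTA.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gene_coverage(sequence: str, invalid: str = "X-") -> float:
    """Return (valid AA)/(total length), rounded to 3 significant figures."""
    if not sequence:
        return 0.0
    # Assumption: any character outside `invalid` counts as a valid amino acid.
    valid = sum(1 for aa in sequence.upper() if aa not in invalid)
    # The "g" format rounds to significant figures rather than decimal places.
    return float(f"{valid / len(sequence):.3g}")
```

Each record from `read_fasta` can be passed to `gene_coverage` to produce one row of a per-gene coverage table, which the aggregate and append rules would then combine with the metadata.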
Co-authored-by: Jover Lee <joverlee521@gmail.com>
Commit 685e218
fixup: drop the E_indicator column
#36 (comment) Since we are not using the E_indicator column, drop it. We have separate steps to calculate the E_coverage column.
Commit c861942
Commit e722c76
Commit 7b94670
fixup: move hard-coded columns to a shared workflow variable or config params so they don't get out of sync between rules
Commit 1e7cde8
Use serotype/gene/files in directory structure
Encode serotype and gene as part of the directory structure where possible.
Commit f6a620d
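One way to picture a serotype/gene directory layout is the small sketch below. The "data" root follows the earlier commit about intermediate files; the file name "gene_coverage.tsv" and both helper names are hypothetical and only illustrate the idea of recovering serotype and gene from the path itself.

```python
from pathlib import PurePosixPath


def coverage_path(serotype: str, gene: str, root: str = "data") -> PurePosixPath:
    """Build a path of the form <root>/<serotype>/<gene>/gene_coverage.tsv."""
    # Hypothetical file name; only the serotype/gene nesting reflects the commit.
    return PurePosixPath(root) / serotype / gene / "gene_coverage.tsv"


def parse_coverage_path(path) -> tuple:
    """Recover (serotype, gene) from a path built by coverage_path."""
    parts = PurePosixPath(path).parts
    # The last component is the file name; the two before it are serotype and gene.
    return parts[-3], parts[-2]
```

With this kind of layout, a rule that aggregates by gene can select all files sharing a gene directory name across serotypes without parsing file names.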
Commits on Apr 19, 2024
Use a one-to-one mapping of Nextclade input to output columns
As suggested by #36 (comment). The merge ID should be the first item in the map.
Commit db300a5
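A one-to-one mapping of this kind might be sketched as a single ordered dictionary from which both the input and output column lists are derived, so the two cannot drift apart between rules. The mapping contents and helper names below are assumptions for illustration, not the configuration used in this pull request; only the "coverage" to "genome_coverage" pairing and the "merge ID first" convention come from the commit messages.

```python
# Hypothetical mapping: Nextclade TSV column -> metadata column.
# By convention, the first item is the merge ID used to join onto the metadata.
NEXTCLADE_FIELD_MAP = {
    "seqName": "seqName",
    "clade": "clade",
    "coverage": "genome_coverage",
}


def split_field_map(field_map: dict) -> tuple:
    """Return (input columns, output columns, merge ID) from one mapping."""
    # Dicts preserve insertion order, so the first item stays first.
    inputs = list(field_map.keys())
    outputs = list(field_map.values())
    return inputs, outputs, outputs[0]


def rename_row(row: dict, field_map: dict) -> dict:
    """Keep only the mapped columns of a row, renamed to their output names."""
    return {new: row.get(old, "") for old, new in field_map.items()}
```

Because both lists come from the same dictionary, adding or renaming a column is a one-line change in one place.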