identify unique genes
sophie22 committed May 7, 2022
1 parent d453684 commit 613aec6
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions genes_coverage.py
@@ -21,3 +21,9 @@
sambamba_df["ExonLength"] = sambamba_df["EndPosition"] - sambamba_df["StartPosition"]
# Calculate number of bases above 30x coverage
sambamba_df["AboveThreshold"] = sambamba_df[coverage_column] / 100 * sambamba_df["ExonLength"]

# Identify unique genes
panel_genes = sambamba_df["GeneSymbol;Accession"].unique().tolist()
# Split 'GeneSymbol;Accession' into separate columns
sambamba_df[["GeneSymbol", "Accession"]] = sambamba_df[
    "GeneSymbol;Accession"].str.split(";", n=1, expand=True)
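
The two added steps can be tried on a small stand-in frame. This is only a sketch: `toy_df` and its gene/accession values are made up for illustration and are not taken from the real Sambamba output, which carries more columns. It passes the split limit as the keyword `n=1`, since recent pandas releases no longer accept it positionally.

```python
import pandas as pd

# Hypothetical stand-in for sambamba_df: one row per exon, with the
# combined 'GeneSymbol;Accession' column used in genes_coverage.py.
toy_df = pd.DataFrame({
    "GeneSymbol;Accession": [
        "BRCA1;NM_007294.3",
        "BRCA1;NM_007294.3",
        "TP53;NM_000546.5",
    ],
})

# Identify unique genes (order of first appearance is preserved)
panel_genes = toy_df["GeneSymbol;Accession"].unique().tolist()

# Split on the first ';' only, expanding the two parts into new columns
toy_df[["GeneSymbol", "Accession"]] = toy_df[
    "GeneSymbol;Accession"].str.split(";", n=1, expand=True)

print(panel_genes)
print(toy_df[["GeneSymbol", "Accession"]])
```

Note that `panel_genes` holds the combined `GeneSymbol;Accession` strings, so a gene listed under two different accessions would appear twice in it.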
