 You are a new bioinformatics programmer 🤓 in the Genomics Division at BioResLabs INC 🏢. Your role is to study the link between mutations and cancer 🧬. The senior bioinformatician needs your help analyzing mutations in breast cancer tumors 💻. She asks you to **write a python script** called `utils.py` which contains functions needed for the analysis.
 
+**Note**: Your code must not rely on any packages except for [Numpy](https://numpy.org/), [Pandas](https://pandas.pydata.org/), and their dependencies.
+
 _The following tasks describe the functions that should be included in `utils.py`._
 
 ### Task 1: A Universal Gene ID Converter
@@ -123,7 +125,7 @@ The senior bioinformatician has hypothesized that [single nucleotide variants (S
 2. **Arguments**:
    - `cancer`: A string containing the tumor DNA sequence
    - `normal`: A string containing the normal tissue DNA sequence
-3. **Returns**: A `DataFrame` which contains the following columns:
+3. **Returns**: A Pandas `DataFrame` which contains the following columns:
    - `position`: gives the position of an alteration within the input sequence
    - `cancer`: gives the cancer base at that position
    - `normal`: gives the normal base at that position
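
The contract in this hunk (two sequence strings in, a Pandas `DataFrame` with `position`, `cancer`, and `normal` columns out) could be met by something like the sketch below. The hunk does not show the required function name, so `find_snvs` is a hypothetical placeholder. The sketch also assumes that both sequences have the same length and that positions are 0-based; neither detail is confirmed by the visible lines.

```python
import pandas as pd


def find_snvs(cancer: str, normal: str) -> pd.DataFrame:
    """Compare two DNA strings base by base (hypothetical function name)."""
    # Assumption: sequences are the same length and positions are 0-based.
    rows = [
        {"position": i, "cancer": c, "normal": n}
        for i, (c, n) in enumerate(zip(cancer, normal))
        if c != n
    ]
    # Naming the columns explicitly keeps the schema when nothing differs.
    return pd.DataFrame(rows, columns=["position", "cancer", "normal"])
```

Passing `columns=` explicitly means an identical pair of sequences still yields an empty frame with the three expected columns rather than a frame with no columns at all.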
@@ -323,7 +325,7 @@ Thus far, you have built functions to identify variants and convert between DNA,
 2. **Arguments**:
    - `cancer`: A string containing the tumor DNA sequence
    - `normal`: A string containing the normal tissue DNA sequence
-3. **Returns**: A `DataFrame` which contains the following columns:
+3. **Returns**: A Pandas `DataFrame` which contains the following columns:
    - `codon_number`: gives the position of an altered codon within the input sequence
    - `cancer`: gives the cancer amino acid at that position
    - `normal`: gives the normal amino acid at that position
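
A codon-level comparison matching the columns in this hunk might look like the sketch below. Several points are assumptions, because the hunk does not state them: the function name `find_codon_changes` is hypothetical, the reading frame is taken to start at the first base, `codon_number` is 0-based, input is uppercase DNA containing only A, C, G, and T, STOP is written as `*`, and only codons whose translated amino acid differs are reported. The table is the standard genetic code, built with only the standard library so the Numpy/Pandas-only constraint still holds.

```python
from itertools import product

import pandas as pd

# Standard genetic code in the conventional T, C, A, G ordering; "*" marks STOP.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_codon_changes(cancer: str, normal: str) -> pd.DataFrame:
    """Report codons whose amino acid differs (hypothetical function name)."""
    rows = []
    # Assumption: frame starts at index 0; trailing partial codons are ignored.
    usable = min(len(cancer), len(normal)) // 3 * 3
    for start in range(0, usable, 3):
        cancer_aa = CODON_TABLE[cancer[start:start + 3]]
        normal_aa = CODON_TABLE[normal[start:start + 3]]
        if cancer_aa != normal_aa:
            rows.append(
                {
                    "codon_number": start // 3,  # assumption: 0-based
                    "cancer": cancer_aa,
                    "normal": normal_aa,
                }
            )
    return pd.DataFrame(rows, columns=["codon_number", "cancer", "normal"])
```

Comparing translated amino acids rather than raw codons means synonymous substitutions are skipped; if the task expects every altered codon to be listed, the comparison would need to be made on the codon strings instead.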
@@ -396,11 +398,11 @@ In some cases, SNVs can lead to a premature STOP codon. This is called a ["nonse
 
 1. **Name**: Needs to be a function called `find_nonsense()`
 2. **Arguments**:
-   - `sequences`: a `DataFrame` containing three columns:
+   - `sequences`: a Pandas `DataFrame` containing three columns:
      - `gene_id`: The ID of the gene (can be either Ensembl or Entrez)
      - `cancer`: The sequence of the gene in the cancer sample
      - `normal`: The sequence of the gene in the normal sample
-3. **Returns**: a `DataFrame` with one entry per nonsense mutation, containing the following columns:
+3. **Returns**: a Pandas `DataFrame` with one entry per nonsense mutation, containing the following columns:
    - `gene_id`: The gene ID originally provided by the user for this gene
    - `gene_symbol`: The symbol of the supplied gene
    - `codon_number`: gives the position of an altered codon within the input sequence
@@ -421,6 +423,8 @@ In some cases, SNVs can lead to a premature STOP codon. This is called a ["nonse
-1. Your code must not depend on any packages outside of base python v3.10.4
+1. Your code must not depend on any packages outside of base python v3.10.4, Pandas, and Numpy.
 2. To test your code locally, run `pytest` from the command line
 3. To lint your code locally, run `flake8 .` from the command line
 4. If you are feeling uncomfortable working with the BRN Skill Assessment platform, please consider going back to the python-based tutorial and completing it. If you are still getting stuck, please check the [Getting help](#getting-help) section.