Merge pull request #1 from bedapub/feature/add-tyrosine-support

Feature/add tyrosine support
bedapub · Oct 22, 2024 · 2642eed
2 parents 234f2a5 + 1efba41
commit 2642eed
Showing 30 changed files with 1,202 additions and 569 deletions.
diff --git a/README.md b/README.md
@@ -1,15 +1,8 @@
-# Kinex
+# Kinex - Kinome Exploration Tool
 
-Alexandra Valeanu and Jitao David Zhang with the input and help of many colleagues
+**Kinex** is a Python package for inferring causal kinases from phosphoproteomics data.
 
-**kinex** is a workflow implemented in a Python package with the same name. Kinex infers causal kinases from phosphoproteomics data.
-
-## Table of Contents
-
-- [Main Features](#main-features)
-- [Requirements](#requirements)
-- [Installation](#installation)
-- [Documentation](#documentation)
+Paper: Kinex infers causal kinases from phosphoproteomics data. https://doi.org/10.1101/2023.11.23.568445
 
 ## Main Features
 
@@ -24,47 +17,61 @@ Alexandra Valeanu and Jitao David Zhang with the input and help of many colleagu
 
 ## Installation
 
-Here are few ways to install **kinex**
-
 ### From Conda
 
-1. Create and activate your conda environment
-
 ```
+# Create and activate your conda environment
 conda create --name kinex
 conda activate kinex
-```
-
-2. Install kinex package
 
-```
+# Install kinex package
 conda install -c bioconda kinex
 ```
 
-
 ### From source
 
-1. Create and activate a python 3.11 conda environment 
-
 ```
+# Create and activate a python 3.11 conda environment 
 conda create --name kinex
 conda activate kinex
 conda install python=3.11
-```
-
-2. Download the package:
 
-```
+# Download the package:
 git clone git@github.com:bedapub/kinex.git
 cd kinex
+
+# Install the package
+pip install .
 ```
 
-3. Install the package
+## Quick start
 
+1. Import package and create Kinex object
 ```
-pip install .
+from kinex import Kinex
+import pandas as pd
+
+# Read scoring matrices from Zenodo
+scoring_matrix_ser_thr = pd.read_csv("https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1", compression="gzip")
+scoring_matrix_tyr = pd.read_csv("https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1", compression="gzip")
+
+# Create Kinex object
+kinex = Kinex(scoring_matrix_ser_thr, scoring_matrix_tyr)
+```
+2. Score a sequence
+```
+sequence = "FVKQKAY*QSPQKQ"
+res = kinex.get_score(sequence)
+```
+
+3. Enrichment analysis
+```
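+# input_sites: pandas DataFrame of phospho-sequences and their fold changes (see the documentation)
+enrich = kinex.get_enrichment(input_sites, fc_threshold=1.5, phospho_priming=False, favorability=True, method="max")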
+
+enrich.ser_thr.plot()
+enrich.tyr.plot()
 ```
 
-## [Documentation](https://kinex.readthedocs.io/en/latest/)
+## Documentation
 
-You can find detailed [documentation](https://kinex.readthedocs.io/en/latest/) describing every feature of the package with examples and tutorials [here](https://kinex.readthedocs.io/en/latest/).
+You can find detailed documentation describing every feature of the package with examples and tutorials [here](https://kinex.readthedocs.io/en/latest/).
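The Quick start above builds `Kinex` from two scoring matrices (Ser/Thr and Tyr) and scores a sequence in which `*` follows the phosphorylated residue. As a minimal sketch of how such a sequence could be assigned to one of the two matrices, the snippet below inspects the residue before the asterisk. The helper names `phospho_acceptor` and `site_class` are hypothetical and not part of the kinex API; how Kinex itself dispatches between the matrices is not shown in this diff.

```python
# Illustrative only: these helpers are hypothetical and not part of kinex.


def phospho_acceptor(sequence: str) -> str:
    """Return the upper-cased residue immediately before the first '*'."""
    idx = sequence.find("*")
    if idx <= 0:
        raise ValueError(f"no residue marked with '*' in {sequence!r}")
    return sequence[idx - 1].upper()


def site_class(sequence: str) -> str:
    """Classify a phospho-sequence as 'ser_thr' or 'tyr' by its acceptor residue."""
    residue = phospho_acceptor(sequence)
    if residue in ("S", "T"):
        return "ser_thr"
    if residue == "Y":
        return "tyr"
    raise ValueError(f"unsupported phospho-acceptor {residue!r} in {sequence!r}")


if __name__ == "__main__":
    print(site_class("FVKQKAY*QSPQKQ"))    # tyr
    print(site_class("KLEEKQKs*DAEEDGV"))  # ser_thr
```

The two example sequences are taken from the README and the enrichment docs in this PR; lower-case acceptors such as `s*` are upper-cased before the check.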
diff --git a/docs/chapters/features/comparison.rst b/docs/chapters/features/comparison.rst
@@ -5,13 +5,13 @@ Drug comparison
 
 .. code:: python
 
-    >>> from kinex import Comparison
+    from kinex import Comparison
 
 2. Initialize a Comparison object
 
 .. code:: python
 
-    >>> comp = Comparison()
+    comp = Comparison()
 
 
 Compare multiple experiments with each other
@@ -21,7 +21,7 @@ Compare multiple experiments with each other
 
 .. code:: python
 
-    >>> data_path = "path/to/your/tables"
+    data_path = "path/to/your/tables"
 
 
 .. note:: 
@@ -40,7 +40,7 @@ Compare multiple experiments with each other
 
 .. code:: python
 
-    >>> fig = comp.get_comparison(data_path=data_path, method='mds')
+    fig = comp.get_comparison(data_path=data_path, method='mds')
 
 .. note:: 
 
@@ -50,7 +50,7 @@ Compare multiple experiments with each other
 
 .. code:: python
 
-    >>> fig.show()
+    fig.show()
 
 .. raw:: html
     :file: ../../figures/comparison_multiple_drugs.html
@@ -72,7 +72,7 @@ Compare an experiment to the existing collection of drug profiles
 
 .. code:: python
 
-    >>> input_data = pd.read_csv('tables/table1.csv', index_col=0)
+    input_data = pd.read_csv('tables/table1.csv', index_col=0)
 
 .. note::
 
@@ -83,13 +83,7 @@ Compare an experiment to the existing collection of drug profiles
         dominant_enrichment_value_log2  dominant_p_value_log10_abs  
                              0.868162                    0.821932  
                             -0.785398                    0.707911  
-                            -0.934463                    0.901927  
-                            -1.369094                    0.000000  
-                            -1.474303                    0.000000  
-                                ...                         ...  
-                            -2.914661                    2.022525  
-                            -2.490535                    1.691968  
-                            -2.920072                    0.000000  
+                                ...                         ...    
                             -1.551978                    0.795959  
                             -2.986266                    1.521982  
 
@@ -99,7 +93,7 @@ Compare an experiment to the existing collection of drug profiles
 
 .. code:: python
 
-    >>> fig = comp.get_comparison(input_data=input_data, method='tsne')
+    fig = comp.get_comparison(input_data=input_data, method='tsne')
 
 .. note:: 
 
@@ -118,7 +112,7 @@ Compare an experiment to the existing collection of drug profiles
 
 .. code:: python
 
-    >>> fig.show()
+    fig.show()
 
 .. raw:: html
     :file: ../../figures/comparison_input.html
@@ -131,28 +125,28 @@ Save the plot in a desired format
 
 .. code:: python
     
-    >>> fig.write_html("path/to/file.html")
+    fig.write_html("path/to/file.html")
 
 - ``.svg``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.svg")
+    fig.write_image("images/fig1.svg")
 
 - ``.pdf``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.pdf")
+    fig.write_image("images/fig1.pdf")
 
 - ``.png``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.png")
+    fig.write_image("images/fig1.png")
 
 - ``.jpeg``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.jpeg")
+    fig.write_image("images/fig1.jpeg")
diff --git a/docs/chapters/features/enrichment.rst b/docs/chapters/features/enrichment.rst
@@ -9,8 +9,11 @@ Kinases inference analysis
 
 .. code:: python
 
-    >>> input_sites = pd.read_csv('path/to/your/input_sites.csv')
-    >>> input_sites
+    input_sites = pd.read_csv('path/to/your/input_sites.csv')
+    input_sites
+
+.. code-block:: text
+
                   Sequence  Fold Change: a/a' KO Clone A vs WT
     0     KLEEKQKs*DAEEDGV                          -88.159789
     1     EEDGVTGs*QDEEDSK                          -88.159789
@@ -34,55 +37,35 @@ Kinases inference analysis
 
 .. code:: python
 
-    >>> enrich = kinex.get_enrichment(input_sites, fc_threshold=1.5, phospho_priming=False, favorability=True, method="max")
-    >>> enrich
-    Total number of upregulated phospho-sequences is: 63
-    Total number of downregulated phospho-sequences is: 86
-    Total number of unregulated phospho-sequences is: 309
-    enrichment.Enrichment
+    enrich = kinex.get_enrichment(input_sites, fc_threshold=1.5, phospho_priming=False, favorability=True, method="max")
 
 3. Access the total number of up-regulated, down-regulated, and un-regulated phospho-sequences
 
 .. code:: python
 
-    >>> enrich.total_upregulated
-    63
-    int
-    >>> enrich.total_downregulated
-    86
-    int
-    >>> enrich.total_unregulated
-    309
-    int
+    print("Total upregulated Ser/Thr phospho-sequences:", enrich.ser_thr.total_upregulated)
+    print("Total downregulated Ser/Thr phospho-sequences:", enrich.ser_thr.total_downregulated)
+    print("Total unregulated Ser/Thr phospho-sequences:", enrich.ser_thr.total_unregulated)
 
-4. Check the sites that were marked as failed
+.. code-block:: text
 
-.. code:: python
+    Total upregulated Ser/Thr phospho-sequences: 63
+    Total downregulated Ser/Thr phospho-sequences: 86
+    Total unregulated Ser/Thr phospho-sequences: 309
 
-    >>> enrich.failed_sites
-    ['EKIGEGTyGVVYKGR', 'KPSIVTKyVESDDEK', 'LGQRIYQyIQSRFYR', 'INPGYDDyADSDEDQ', 'ADNDITPyLVSRFYR', 'RGEPNVSyICSRYYR']
-    list
-
-5. Check the regulation of each phospho-sequence, and get the top 15 kinases most likely to target each phospho-sequence
+4. Check the sites that were marked as failed
 
 .. code:: python
 
-    >>> enrich.input_sites
-                  Sequence  Fold Change: a/a' KO Clone A vs WT     regulation top15_kinases
-    0     KLEEKQKs*DAEEDGV                          -88.159789  downregulated GRK7,IKKA,CAMK2B,CK2A1,CK2A2,GRK6,LATS2,GRK1,C... 
-    1     EEDGVTGs*QDEEDSK                          -88.159789  downregulated DNAPK,CAMK2G,ATM,ATR,GRK5,GRK1,SMG1,CAMK2B,GRK... 
-    ..                 ...                                 ...            ...   
-    462   AKEESEEs*DEDMGFG                           19.421218    upregulated BMPR1A,TGFBR1,BMPR1B,ALK2,CK1G2,CK2A2,ACVR2A,G...   
-    463   RNGPRDAs*PPGSEPE                           63.187703    upregulated SRPK2,SRPK1,SRPK3,HIPK4,CLK2,CLK3,HIPK2,KIS,GR... 
+    enrich.failed_sites
 
-    [464 rows x 4 columns]
-    pandas.DataFrame
-
-6. Show enrichment table
+5. Show enrichment table
 
 .. code:: python
 
-    >>> enrich.enrichment_table
+    enrich.ser_thr.enrichment_table
+
+.. code-block:: text
 
             upregulated  downregulated  ... dominant_enrichment_value_log2 dominant_p_value_log10_abs
     kinase                                                                      
@@ -95,16 +78,16 @@ Kinases inference analysis
     [303 rows x 19 columns]
     pandas.DataFrame
 
-7. Vulcano plot of Enrichment Odds Ratio (EOR) and p-value
+6. Volcano plot of Enrichment Odds Ratio (EOR) and p-value
 
 .. note::
 
     Kinases are represented with colours corresponding to their class. 
 
 .. code:: python
 
-    >>> fig = enrich.plot(use_adjusted_pval=False)
-    >>> fig.show()
+    fig = enrich.ser_thr.plot(use_adjusted_pval=False)
+    fig.show()
 
 
 .. raw:: html
@@ -117,38 +100,34 @@ Kinases inference analysis
     `https://plotly.com/python/creating-and-updating-figures <https://plotly.com/python/creating-and-updating-figures>`_
 
 
-
-
-
-8. Save the figure in a desired format
-
+7. Save the figure in a desired format
 
 - ``.html``
 
 .. code:: python
     
-    >>> fig.write_html("path/to/file.html")
+    fig.write_html("path/to/file.html")
 
 - ``.svg``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.svg")
+    fig.write_image("images/fig1.svg")
 
 - ``.pdf``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.pdf")
+    fig.write_image("images/fig1.pdf")
 
 - ``.png``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.png", scale=10)
+    fig.write_image("images/fig1.png", scale=10)
 
 - ``.jpeg``
 
 .. code:: python
 
-    >>> fig.write_image("images/fig1.jpeg", scale=10)
+    fig.write_image("images/fig1.jpeg", scale=10)
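The enrichment docs above report counts of up-regulated, down-regulated, and un-regulated phospho-sequences for `fc_threshold=1.5`. A minimal sketch of one way such a binning could work is shown below, assuming each fold change is compared directly against plus or minus the threshold. This rule and the function `classify_regulation` are assumptions made for illustration, not the kinex implementation, which may transform the fold changes first.

```python
# Illustrative only: classify_regulation is hypothetical and not part of kinex.
from collections import Counter


def classify_regulation(fold_change: float, fc_threshold: float = 1.5) -> str:
    """Bin a fold change by comparing it against +/- the absolute threshold."""
    threshold = abs(fc_threshold)
    if fold_change >= threshold:
        return "upregulated"
    if fold_change <= -threshold:
        return "downregulated"
    return "unregulated"


# The first three rows come from the example table in the enrichment docs;
# the last row is a made-up value added to show the unregulated case.
sites = {
    "KLEEKQKs*DAEEDGV": -88.159789,
    "AKEESEEs*DEDMGFG": 19.421218,
    "RNGPRDAs*PPGSEPE": 63.187703,
    "AAAAAAAs*AAAAAAA": 0.2,  # hypothetical
}

regulation = {seq: classify_regulation(fc) for seq, fc in sites.items()}
counts = Counter(regulation.values())

if __name__ == "__main__":
    for label in ("upregulated", "downregulated", "unregulated"):
        print(f"Total {label} phospho-sequences: {counts[label]}")
```

Under this assumed rule, a larger `fc_threshold` moves more sites into the unregulated bin, which would shrink the foreground sets used for enrichment.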