GADO Command line
The command line version of GADO can be used to prioritize genes based on the HPO terms of a patient. It yields the same results as our online version available at: http://genenetwork.nl/gado
When using our method please cite: https://www.nature.com/articles/s41467-019-10649-4/
We recommend using this version:
- Prediction matrix (Extract before use)
- Prediction info file
- Genes
- HPO ontology file
Other versions are available here: https://github.com/molgenis/systemsgenetics/wiki/GADO-Command-line-datasets
The input file with HPO terms per case is tab-separated. The first column is the sample ID and the subsequent columns contain the HPO terms for that case. The number of columns may differ between cases, depending on how many HPO terms are supplied for each.
Example:
case1 HP:0001644 HP:0004764
case2 HP:0001644
case3 HP:0001644 HP:0001882 HP:0002037 HP:0031123 HP:0001987
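A small pre-flight check can catch formatting mistakes in this file before GADO is run. The sketch below is not part of GADO: the function name `parse_case_hpo` is hypothetical, and the check that every term matches `HP:` followed by seven digits is an assumption based on the standard HPO identifier format, not on GADO's own validation.

```python
import re

# Standard HPO identifier shape: "HP:" followed by seven digits.
HPO_ID = re.compile(r"HP:\d{7}")


def parse_case_hpo(lines):
    """Parse tab-separated case lines into {case_id: [hpo_terms]}.

    Hypothetical helper for checking an input file; raises ValueError
    on a case without terms or on a malformed HPO identifier.
    """
    cases = {}
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        case_id, *terms = line.split("\t")
        if not terms:
            raise ValueError(f"line {line_number}: case {case_id!r} has no HPO terms")
        for term in terms:
            if not HPO_ID.fullmatch(term):
                raise ValueError(f"line {line_number}: malformed HPO term {term!r}")
        cases[case_id] = terms
    return cases


example = [
    "case1\tHP:0001644\tHP:0004764\n",
    "case2\tHP:0001644\n",
]
cases = parse_case_hpo(example)
```

To check a real file, pass an open file handle, for example `parse_case_hpo(open("hpo.txt"))`.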
First, the HPO terms of each case are checked. For terms whose GeneNetwork predictions are not reliable, alternative parent terms are suggested. For details of this process please see: https://www.biorxiv.org/content/10.1101/375766v4
This step will create a file with 6 columns:
Column | Description |
---|---|
Sample | The case ID. If a case has multiple HPO terms, they are now spread over multiple lines, one term per line |
SelectedHpo | The HPO term to be used in the prioritization |
SelectedHpoDescription | Description of the HPO term |
OriginalHpo | If an alternative HPO term was selected, then here the original term is listed |
OriginalHpoDescription | Description of the original HPO term |
ExcludeFromPrioritisation | This column is empty by default. Set it manually to a non-empty value (for example x) to exclude a term from the prioritization. |
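Excluding a term can also be scripted rather than done by hand. The following is a minimal sketch, assuming the processed file is tab-separated and starts with a header line that uses the column names from the table above; both are assumptions about the file layout, and `exclude_terms` is a hypothetical helper, not part of GADO.

```python
def exclude_terms(processed_lines, to_exclude, marker="x"):
    """Mark (sample, selected HPO term) pairs as excluded.

    processed_lines: lines of the PROCESS output, header first (assumed layout).
    to_exclude: set of (sample_id, hpo_term) tuples.
    Returns the rewritten lines without trailing newlines.
    """
    header = processed_lines[0].rstrip("\r\n").split("\t")
    sample_col = header.index("Sample")
    term_col = header.index("SelectedHpo")
    exclude_col = header.index("ExcludeFromPrioritisation")

    result = ["\t".join(header)]
    for line in processed_lines[1:]:
        fields = line.rstrip("\r\n").split("\t")
        # Pad short rows so the exclusion column always exists.
        fields += [""] * (len(header) - len(fields))
        if (fields[sample_col], fields[term_col]) in to_exclude:
            fields[exclude_col] = marker
        result.append("\t".join(fields))
    return result


header_line = (
    "Sample\tSelectedHpo\tSelectedHpoDescription\t"
    "OriginalHpo\tOriginalHpoDescription\tExcludeFromPrioritisation\n"
)
# Descriptions are placeholders, not real HPO labels.
rows = [
    header_line,
    "case1\tHP:0001644\tdescription A\t\t\t\n",
    "case1\tHP:0004764\tdescription B\t\t\t\n",
]
updated = exclude_terms(rows, {("case1", "HP:0004764")})
```

Writing `"\n".join(updated) + "\n"` back to disk gives a file that can then be passed to the prioritization step.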
Example command:
java -jar GADO.jar \
--mode PROCESS \
--output hpoProcessed.txt \
--caseHpo hpo.txt \
--hpoOntology hp.obo \
--hpoPredictionsInfo hpo_predictions_info_01_02_2018.txt
The prioritization step uses the output file of the HPO term processing step. It ranks all genes in GeneNetwork based on the prioritization Z-scores of the selected HPO terms. It is generally safe to accept all suggested alternative terms, in which case this second step can be run directly on the output of the first step.
Example command:
java -jar GADO.jar \
--mode PRIORITIZE \
--output ./result/ \
--caseHpoProcessed hpoProcessed.txt \
--genes hpo_predictions_genes_01_02_2018.txt \
--hpoPredictions hpo_predictions_sigOnly_spiked_01_02_2018
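When both steps are run from a script, the two invocations can be assembled programmatically. The sketch below only builds argument lists from the options shown in the example commands; the function names `gado_commands` and `run_gado` are hypothetical, and it assumes `java` is on the `PATH` and that all files are in the working directory.

```python
import subprocess


def gado_commands(jar, case_hpo, ontology, predictions_info,
                  genes, predictions, processed, out_dir):
    """Return the PROCESS and PRIORITIZE argument lists, in run order."""
    process = [
        "java", "-jar", jar,
        "--mode", "PROCESS",
        "--output", processed,
        "--caseHpo", case_hpo,
        "--hpoOntology", ontology,
        "--hpoPredictionsInfo", predictions_info,
    ]
    prioritize = [
        "java", "-jar", jar,
        "--mode", "PRIORITIZE",
        "--output", out_dir,
        "--caseHpoProcessed", processed,
        "--genes", genes,
        "--hpoPredictions", predictions,
    ]
    return process, prioritize


def run_gado(**paths):
    """Run both steps; stop if the first one fails (check=True raises)."""
    for command in gado_commands(**paths):
        subprocess.run(command, check=True)


# File names taken from the example commands above.
process_cmd, prioritize_cmd = gado_commands(
    jar="GADO.jar",
    case_hpo="hpo.txt",
    ontology="hp.obo",
    predictions_info="hpo_predictions_info_01_02_2018.txt",
    genes="hpo_predictions_genes_01_02_2018.txt",
    predictions="hpo_predictions_sigOnly_spiked_01_02_2018",
    processed="hpoProcessed.txt",
    out_dir="./result/",
)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.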
The final output is a single file per case with the ranking of all the genes in the prediction matrix. These results can be used to rank the genes that harbor candidate variants in a case.
Column | Description |
---|---|
Ensg | Ensembl gene ID |
Hgnc | Gene symbol |
Rank | The overall rank of the gene |
Zscore | The combined prioritization Z-score over the supplied HPO terms for this case. This score is used for the ranking |
HP:###### | One or more columns with the prioritization Z-scores for each of the HPO terms supplied for this case |
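A typical follow-up is to look up where the genes with candidate variants fall in this ranking. The sketch below assumes the per-case result file is tab-separated with a header line that uses the column names from the table above; `rank_candidates` is a hypothetical helper, and the gene identifiers and scores in the example are made up for illustration.

```python
def rank_candidates(result_lines, candidate_genes):
    """Return (rank, ensg, hgnc, zscore) for candidate genes, best rank first.

    candidate_genes may contain Ensembl gene IDs, gene symbols, or both.
    """
    header = result_lines[0].rstrip("\r\n").split("\t")
    ensg_col = header.index("Ensg")
    hgnc_col = header.index("Hgnc")
    rank_col = header.index("Rank")
    z_col = header.index("Zscore")

    hits = []
    for line in result_lines[1:]:
        fields = line.rstrip("\r\n").split("\t")
        if fields[ensg_col] in candidate_genes or fields[hgnc_col] in candidate_genes:
            hits.append((
                int(fields[rank_col]),
                fields[ensg_col],
                fields[hgnc_col],
                float(fields[z_col]),
            ))
    return sorted(hits)


# Made-up rows; real files contain every gene in the prediction matrix.
result = [
    "Ensg\tHgnc\tRank\tZscore\tHP:0001644\n",
    "ENSG_A\tGENE_A\t1\t5.2\t5.2\n",
    "ENSG_B\tGENE_B\t2\t4.1\t4.1\n",
    "ENSG_C\tGENE_C\t3\t3.0\t3.0\n",
]
top = rank_candidates(result, {"GENE_C", "ENSG_A"})
```

Here `top` lists the two candidate genes ordered by their overall rank.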
Argument | Short | Description |
---|---|---|
--mode | -m | One of the following modes: PROCESS (processes the HPO terms of cases and suggests parent terms if needed) or PRIORITIZE (uses the output of PROCESS to prioritize genes) |
--caseHpo | -ch | HPO terms per case, one line per case. The first column is the case ID, followed by tab-separated HPO terms |
--caseHpoProcessed | -chp | The output of mode PROCESS. Type x in the last column to exclude a term. |
--output | -o | The output path. For mode PRIORITIZE supply a folder for the output files per case. |
--genes | -g | File with gene info. Column 1: gene name (Ensembl gene ID); column 2: HGNC symbol |
--hpoOntology | -ho | HPO ontology file, .obo file |
--hpoPredictions | -hp | HPO prediction matrix in binary format (without .dat) |
--hpoPredictionsInfo | -hpi | HPO predictions info |
The source code of GADO commandline can be found here: https://github.com/molgenis/systemsgenetics/tree/master/GadoCommandline
The GeneNetwork code to make a prediction matrix can be found here: https://github.com/molgenis/systemsgenetics/tree/master/GeneNetworkBackend