v1 Build Details

Tiffany J. Callahan edited this page Oct 30, 2023 · 4 revisions

Release: v1.0.0 (pre-release)

All data for this release are available through the following links:

Data Sources

Data Download Date: November 30, 2018 - Data Source Details


Ontologies

Downloaded Resource Information: Ontologies

Classes

Downloaded Resource Information: Classes

Instances

Downloaded Resource Information: Instances



Knowledge Representation

We worked with a PhD-level biologist to develop a knowledge representation (see Figure 1 below) that modeled mechanisms underlying human disease. To do this, we manually mapped all possible combinations of the following six node types:

  1. Human Diseases
  2. Human Phenotypes
  3. Human Genes
  4. Gene Ontology concepts
  5. Reactome Pathways
  6. Chemicals
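
To give a sense of how many node-type pairings the manual mapping had to consider, the following is a minimal sketch that enumerates them. It is illustrative only: the page does not state whether same-type pairs (for example, gene to gene) were included, so both counts are shown, and the variable names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

# The six node types listed above.
NODE_TYPES = [
    "Human Diseases",
    "Human Phenotypes",
    "Human Genes",
    "Gene Ontology concepts",
    "Reactome Pathways",
    "Chemicals",
]

# Unordered pairs of two different node types (e.g. gene-disease).
distinct_pairs = list(combinations(NODE_TYPES, 2))

# Assumption: same-type pairs (e.g. gene-gene) may also be of interest.
pairs_with_self = list(combinations_with_replacement(NODE_TYPES, 2))

print(len(distinct_pairs))   # 15
print(len(pairs_with_self))  # 21
```

Each pairing is a candidate edge type; which of them ended up in the final representation is determined by Figure 1, not by this enumeration.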

As shown in Figure 1, the Basic Formal Ontology and the Relation Ontology were then used to create edges between the node types. The downloaded resource information used to generate these edges can be accessed here.

Figure 1. Knowledge Representation

As shown in this figure, the following edge-types were created:



Knowledge Graph

The knowledge graph represented above was built using the following steps:

  • Merge Ontologies: Merge ontologies using the OWL Tools API

  • Express New Ontology Concept Annotations: Create new ontology annotations by asserting a relation between the instance and an instance of the ontology class. For example, to assert the following relation:

    Morphine --> is substance that treats --> Migraine

    We would need to create two axioms:

    • isSubstanceThatTreats(Morphine, x1)
    • instanceOf(x1, Migraine)

    While the instance of the HP class (x1 in the example above) could be treated as an anonymous node in the knowledge graph, we generate a new Internationalized Resource Identifier (IRI) for each newly generated instance.

  • Deductively Close Knowledge Graph: The knowledge graph is deductively closed using ELK, an OWL 2 EL reasoner, via Protégé v5.1.1. ELK is able to classify instances and supports inference over class hierarchies and object properties, including disjointness, intersection, and existential quantification.

  • Generate Edge List: Before the edge list is exported, nodes that are not biologically meaningful, or that would otherwise degrade the performance of machine learning algorithms (including the algorithm used to generate embeddings), are removed.
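
The instance-based annotation pattern from the second step above (Morphine, is substance that treats, Migraine) can be sketched as follows. This is a minimal illustration using plain tuples rather than an OWL library; the function name, the `BASE_IRI` namespace, and the use of a UUID to mint the identifier are all assumptions, not the project's actual implementation.

```python
import uuid

# Hypothetical namespace for newly minted instance identifiers.
BASE_IRI = "https://example.org/instance/"
RDF_TYPE = "rdf:type"  # stands in for the instanceOf axiom


def assert_via_instance(subject, relation, ontology_class):
    """Return the two triples that link `subject` to a new instance of `ontology_class`."""
    # Mint a fresh identifier instead of using an anonymous node.
    instance_iri = BASE_IRI + uuid.uuid4().hex
    return [
        (subject, relation, instance_iri),         # isSubstanceThatTreats(Morphine, x1)
        (instance_iri, RDF_TYPE, ontology_class),  # instanceOf(x1, Migraine)
    ]


triples = assert_via_instance("Morphine", "isSubstanceThatTreats", "Migraine")
for triple in triples:
    print(triple)
```

Giving each instance its own identifier means it appears as an ordinary, addressable node in the exported edge list, which is the behaviour the step describes.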



Molecular Mechanism Embeddings

A modified version of the DeepWalk algorithm was implemented to generate molecular mechanism embeddings from the biomedical knowledge graph. A t-SNE plot of the dimensionality-reduced mechanism embeddings is shown in Figure 2 below. For this release, the hyperparameters were set to 512 dimensions, 100 walks, a walk length of 20, and a window of 10.

Figure 2. t-SNE Plot of Molecular Mechanisms
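
To make the walk-related hyperparameters concrete, here is a minimal sketch of the random-walk stage of standard DeepWalk. It is not the modified version used for this release, whose changes are not described on this page. It assumes that "100 walks" means 100 walks started from each node; the function name and the toy graph are hypothetical. The remaining hyperparameters (512 dimensions, window of 10) would apply to the skip-gram model trained on these walks, which is not shown.

```python
import random


def generate_walks(adjacency, num_walks, walk_length, seed=0):
    """Generate uniform random walks: `num_walks` walks starting at every node."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a new order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph (hypothetical), stored as adjacency lists.
toy_graph = {
    "Morphine": ["x1"],
    "x1": ["Morphine", "Migraine"],
    "Migraine": ["x1"],
}

walks = generate_walks(toy_graph, num_walks=100, walk_length=20)
print(len(walks))     # 300 walks: 100 per node, 3 nodes
print(len(walks[0]))  # 20 nodes per walk
```

Each walk is treated like a sentence of node identifiers, so nodes that co-occur within the window end up with similar embedding vectors.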



Generated Output





This project is licensed under Apache License 2.0 - see the LICENSE.md file for details. If you intend to use any of the information on this Wiki, please provide the appropriate attribution by citing this repository:

@software{callahan_tj_2019_3830982,
  author       = {Callahan, TJ and
                  William A. Baumgartner Jr and
                  Ignacio J. Tripodi and
                  Adrianne L. Stefanski and
                  Jordan M. Wyrwa and
                  Lawrence Hunter},
  title        = {PheKnowLator},
  month        = mar,
  year         = 2019,
  note         = {{Newer version of the v1.0.0 release that includes output data generated by this code.}},
  publisher    = {Zenodo},
  version      = {v.1.0.0},
  doi          = {10.5281/zenodo.3830982},
  url          = {https://doi.org/10.5281/zenodo.3830982}
}