Building a KG of Human Disease Molecular Mechanisms
A deeper understanding of the molecular drivers of disease, within the context of clinical phenotypes, is currently limited by our ability to integrate and analyze clinical and molecular knowledge at scale. One approach to overcoming this limitation is to use Semantic Web technology to resolve knowledge integration problems between clinical concepts and biomedical knowledge. The Semantic Web requires compliance with standards set forth by the World Wide Web Consortium. The technologies and languages used by the Semantic Web facilitate the integration of heterogeneous data using explicit semantics. For example, knowledge graphs (KGs), which are constructed from biological ontologies and represent entities as nodes and the relations between them as edges (e.g. TBX21--has phenotype--nasal polyps), meaningfully integrate heterogeneous biological data to facilitate the understanding of complex biological mechanisms. When applied to knowledge graphs, deep learning has great potential to identify and model mechanisms underlying complex disease.
Using the biomedical knowledge sources described above, we worked with a PhD-level biologist to develop a knowledge representation that models mechanisms underlying human disease. To do this, we manually mapped all possible combinations of the six node types (i.e. diseases, phenotypes, genes, GO concepts, pathways, and chemicals) to create the knowledge graph edges.
As shown above, the Basic Formal Ontology and the Relation Ontology were used when creating these edges. The resulting knowledge representation was then used to generate a knowledge graph using the following steps:
- Merge Ontologies: Merge ontologies using the OWL Tools API
- Express New Ontology Concept Annotations: Create new ontology annotations by asserting a relation between the instance and an instance of the ontology class. For example, to assert that the instance morphine (MESH:D009020) is substance that treats (RO:0002606) the HP class hemiplegic migraines (HP:0002076), we would need to create two axioms: isSubstanceThatTreats(morphine, x1) and instanceOf(x1, hemiplegic migraine). While the instance of the HP class hemiplegic migraines could be treated as an anonymous node in the knowledge graph, we instead generate a new Internationalized Resource Identifier (IRI) for each newly generated instance.
- Deductively Close Knowledge Graph: The knowledge graph is deductively closed using the OWL 2 EL reasoner ELK (via Protégé v5.1.1). ELK is able to classify instances and supports inference over class hierarchies and object properties, as well as over disjointness, intersection, and existential quantification.
- Generate Edge List: The final step before exporting the edge list is to remove any nodes that are not biologically meaningful or would otherwise reduce the performance of machine learning algorithms and the algorithm used to generate embeddings.
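The annotation and edge-list steps above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the `https://example.org/kg/instance/` IRI base, the use of `rdf:type` to stand in for instanceOf, and the prefix-based exclusion rule are all assumptions made for illustration. The actual criteria for deciding which nodes are "not biologically meaningful" are not specified here.

```python
from itertools import count

# Assumption: instanceOf is written as rdf:type in the exported triples.
RDF_TYPE = "rdf:type"
# Hypothetical IRI base for newly minted instances.
INSTANCE_BASE = "https://example.org/kg/instance/"


def annotation_triples(subject, relation, target_class, counter):
    """Build the two axioms for one annotation.

    Returns relation(subject, x1) and instanceOf(x1, target_class),
    where x1 is a freshly minted instance IRI rather than an anonymous node.
    """
    instance_iri = f"{INSTANCE_BASE}{next(counter)}"
    return [
        (subject, relation, instance_iri),
        (instance_iri, RDF_TYPE, target_class),
    ]


def filter_edges(triples, excluded_prefixes):
    """Drop every triple whose subject or object starts with an excluded prefix.

    The prefix rule is a placeholder for whatever exclusion criteria are
    actually applied before the edge list is exported.
    """
    prefixes = tuple(excluded_prefixes)
    return [
        (s, p, o)
        for (s, p, o) in triples
        if not s.startswith(prefixes) and not o.startswith(prefixes)
    ]


# Example: morphine (MESH:D009020) is substance that treats (RO:0002606)
# an instance of the HP class HP:0002076.
counter = count(1)
triples = annotation_triples("MESH:D009020", "RO:0002606", "HP:0002076", counter)

# Hypothetical upper-level ontology edge that the filter should remove.
triples.append(("BFO:0000001", "rdfs:subClassOf", "owl:Thing"))

edges = filter_edges(triples, excluded_prefixes=["BFO:"])
```

Minting an explicit IRI for each instance, as the text describes, keeps every node in the exported edge list addressable, which an anonymous node would not be.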