Graph Neural Networks for Gene-Disease Prioritization

Ingrid Heuer; Ariel Chernomoretz

Open Conference Systems, DDAYS LAC 2024 Main Conference

Building: Cero Infinito
Room: Posters hall
Date: 2024-12-10 04:30 PM – 06:30 PM
Last modified: 2024-11-19

Abstract

Understanding the genetic basis of diseases is a difficult problem, shaped by complex biological interactions and vast, heterogeneous data. Graph-based models offer an effective way to represent these relationships and can provide valuable insight into gene-disease associations. In this work, we applied Graph Neural Networks (GNNs) to the problem of gene-disease prioritization.

We curated and integrated a heterogeneous, multilayer graph that models several kinds of biological interactions: protein-protein interactions, complex formation, biological pathways, disease-disease associations, and known gene-disease associations. A structural analysis of this disease network, combining concepts from network science and natural language processing, revealed informative patterns that make critical information easier to extract and could help medical professionals navigate ambiguous data more effectively.
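As a rough illustration of what a heterogeneous, multilayer graph of this kind can look like in code, the sketch below stores one adjacency layer per interaction type. All node identifiers (`gene:G1`, `disease:D1`, and so on), the edge-type labels, and the helper names are hypothetical placeholders chosen for this example; they are not taken from the curated dataset, and treating every edge as undirected is an assumption.

```python
from collections import defaultdict

# Hypothetical toy edges: (source node, edge type, target node).
# Node IDs and edge-type labels are placeholders, not real data.
EDGES = [
    ("gene:G1", "ppi", "gene:G2"),
    ("gene:G1", "in_complex", "complex:C1"),
    ("gene:G2", "in_complex", "complex:C1"),
    ("gene:G2", "in_pathway", "pathway:P1"),
    ("disease:D1", "disease_disease", "disease:D2"),
    ("gene:G1", "gene_disease", "disease:D1"),
]


def build_layers(edges):
    """Group edges into one undirected adjacency map per edge type."""
    layers = defaultdict(lambda: defaultdict(set))
    for src, edge_type, dst in edges:
        layers[edge_type][src].add(dst)
        layers[edge_type][dst].add(src)
    return layers


def neighbors(layers, node, edge_type):
    """Return the sorted neighbors of `node` within a single layer."""
    return sorted(layers[edge_type].get(node, ()))


layers = build_layers(EDGES)
print(neighbors(layers, "complex:C1", "in_complex"))
```

Keeping each interaction type in its own layer makes it straightforward to analyze one layer in isolation or to hand typed edges to a GNN library that supports heterogeneous graphs.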

We also developed a GNN-based model to predict gene-disease associations and trained it on this integrated graph. To optimize the GNN architecture, we implemented a framework based on the Metropolis-Hastings algorithm, which performs a random walk in a high-dimensional parameter space; this search yielded a high-performance model. Finally, we examined negative sampling, an issue often overlooked in GNN research, and its impact on both model training and evaluation. Our results emphasize the importance of proper negative sampling for avoiding misleading conclusions.
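The Metropolis-Hastings search over architectures can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the framework used in this work: the search space (`hidden_dim`, `num_layers`, `dropout`), the temperature, and the `toy_score` function are all hypothetical. In a real run, the score would come from training a GNN with the candidate configuration and measuring a validation metric.

```python
import math
import random

# Hypothetical search space; the actual parameters are not listed in the abstract.
SEARCH_SPACE = {
    "hidden_dim": [16, 32, 64, 128],
    "num_layers": [1, 2, 3, 4],
    "dropout": [0.0, 0.1, 0.3, 0.5],
}


def toy_score(config):
    """Stand-in for a validation metric (assumption: higher is better)."""
    return (
        1.0
        - 0.1 * abs(math.log2(config["hidden_dim"]) - 6)
        - 0.1 * abs(config["num_layers"] - 2)
        - 0.5 * abs(config["dropout"] - 0.1)
    )


def propose(config, rng):
    """Symmetric proposal: shift one parameter by one grid position.

    A step that would leave the grid returns the current configuration,
    which keeps the proposal distribution symmetric.
    """
    key = rng.choice(sorted(SEARCH_SPACE))
    values = SEARCH_SPACE[key]
    idx = values.index(config[key]) + rng.choice([-1, 1])
    candidate = dict(config)
    if 0 <= idx < len(values):
        candidate[key] = values[idx]
    return candidate


def metropolis_hastings_search(score_fn, n_steps=500, temperature=0.05, seed=0):
    """Random walk targeting a density proportional to exp(score / temperature)."""
    rng = random.Random(seed)
    current = {k: rng.choice(v) for k, v in sorted(SEARCH_SPACE.items())}
    current_score = score_fn(current)
    best, best_score = dict(current), current_score
    for _ in range(n_steps):
        candidate = propose(current, rng)
        candidate_score = score_fn(candidate)
        # With a symmetric proposal, the acceptance ratio is the target ratio.
        log_ratio = (candidate_score - current_score) / temperature
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = dict(current), current_score
    return best, best_score


best_config, best_value = metropolis_hastings_search(toy_score)
print(best_config, best_value)
```

Because worse candidates are still accepted with nonzero probability, the walk can escape local optima; lowering `temperature` makes the search greedier, while raising it favors exploration.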