Large scale genotype- and phenotype-driven machine learning in Von Hippel-Lindau disease

Andreea Chiorean, Kirsten M Farncombe, Sean Delong, Veronica Andric, Safa Ansar, Clarissa Chan, Kaitlin Clark, Arpad M Danos, Yizhuo Gao, Rachel H Giles, Anna Goldenberg, Payal Jani, Kilannin Krysiak, Lynzey Kujan, Samantha Macpherson, Eamonn R Maher, Liam G McCoy, Yasser Salama, Jason Saliba, Lana Sheta, Malachi Griffith, Obi L Griffith, Lauren Erdman, Arun Ramani, Raymond H Kim

Research output: Contribution to journal, Article, peer-review


Von Hippel-Lindau (VHL) disease is a hereditary cancer syndrome in which individuals are predisposed to tumor development in the brain, adrenal gland, kidney, and other organs. It is caused by pathogenic variants in the VHL tumor suppressor gene. Standardized disease information has been difficult to collect because of the rarity and diversity of VHL patients. More than 4100 unique articles published up to October 2019 were screened for germline genotype-phenotype data. Patient data were translated into standardized descriptions using Human Genome Variation Society gene variant nomenclature and Human Phenotype Ontology terms, and were manually curated into an open-access knowledgebase called Clinical Interpretation of Variants in Cancer. In total, 634 unique VHL variants, 2882 patients, and 1991 families from 427 papers were captured. We identified relationship trends between phenotype and genotype data using classic statistical methods and unsupervised learning with spectral clustering. Our analyses reveal earlier onset of pheochromocytoma/paraganglioma and retinal angiomas, phenotype co-occurrences, and genotype-phenotype correlations, including hotspots. These analyses confirm existing VHL associations and can be used to identify new patterns and associations in VHL disease. Our database serves as an aggregate knowledge translation tool to facilitate sharing information about the pathogenicity of VHL variants.

Original language: English
Pages (from-to): 1268-1285
Number of pages: 18
Journal: Human Mutation
Issue number: 9
Early online date: 27 Apr 2022
Publication status: Published - Sept 2022


  • Adrenal Gland Neoplasms/diagnosis
  • Genotype
  • Humans
  • Machine Learning
  • Phenotype
  • Von Hippel-Lindau Tumor Suppressor Protein/genetics
  • von Hippel-Lindau Disease/complications

