Genetics & Molecular Biology
1 December 2025

AI and Proteomics Join Forces to Uncover Rare Genetic Variants

Source Publication: Genetic Epidemiology

Primary Authors: Gillies, Mbatchou, Habegger et al.

Geneticists have long faced a difficult challenge: distinguishing harmless quirks in our DNA from rare variants that genuinely damage our health. A new study proposes a solution that combines Large Language Models (LLMs), deep learning algorithms typically used for text, with real-world biological data.

The research team used newly available proteomics data, which measure protein levels, covering 2,898 proteins across 46,665 individuals. By using this dataset to evaluate and refine their LLM predictors, they built a model for assessing the impact of changes in coding sequences. According to the study, this 'proteomics-guided' approach outperforms conventional tools such as PolyPhen2 and SIFT, as well as newer machine learning models such as AlphaMissense.

To test the approach, the team evaluated the model against 241 known positive-control gene-trait pairs. The refined model recapitulated 36.5 per cent of these associations, more than any of the alternatives considered. When applied to ten example traits from the UK Biobank, the new method also uncovered 177 gene-trait associations, again surpassing existing approaches. The authors conclude that integrating summary statistics from large-scale human proteomics can refine how coding variants are classified, improving our ability to model human biology and discover genetic causes of disease.
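
The positive-control benchmark described above comes down to a simple recovery rate: the share of known gene-trait pairs that a method rediscovers. For scale, 36.5 per cent of 241 pairs is roughly 88. The sketch below illustrates only that calculation and is not the authors' pipeline. The function name `recall_against_controls` and the toy gene and trait labels are hypothetical placeholders, not data from the study.

```python
def recall_against_controls(discovered, positive_controls):
    """Return the fraction of known gene-trait pairs that a method rediscovers.

    Both arguments are iterables of (gene, trait) tuples. This is a
    hypothetical helper for illustration, not code from the study.
    """
    controls = set(positive_controls)
    if not controls:
        raise ValueError("positive_controls must not be empty")
    recovered = controls & set(discovered)
    return len(recovered) / len(controls)


# Toy placeholder data (NOT from the paper): four known pairs, of which
# the method under test rediscovers two, plus one unrelated finding.
positive_controls = [
    ("GENE_A", "trait_1"),
    ("GENE_B", "trait_1"),
    ("GENE_C", "trait_2"),
    ("GENE_D", "trait_3"),
]
discovered = [
    ("GENE_A", "trait_1"),
    ("GENE_C", "trait_2"),
    ("GENE_X", "trait_9"),  # a finding outside the control set is ignored
]

recall = recall_against_controls(discovered, positive_controls)
print(f"Recovered {recall:.1%} of positive controls")

# Scale check for the figure in the brief: 36.5% of 241 pairs.
approx_pairs = round(0.365 * 241)
print(f"36.5% of 241 is about {approx_pairs} pairs")
```

Comparing methods on this measure means running each one against the same control set and ranking the resulting fractions, which is how a statement such as "more than any of the alternatives considered" would be read.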

Cite this Article (Harvard Style)

Gillies et al. (2025). 'AI and Proteomics Join Forces to Uncover Rare Genetic Variants'. Genetic Epidemiology. Available at: https://doi.org/10.1002/gepi.70023

Source Transparency

This intelligence brief was synthesised by The Synaptic Report's autonomous pipeline. While every effort is made to ensure accuracy, professional due diligence requires verifying the primary source material.

Genetics, Artificial Intelligence, Proteomics