Variational autoencoders learn transferrable representations of metabolomics data.

Title: Variational autoencoders learn transferrable representations of metabolomics data.
Publication Type: Journal Article
Year of Publication: 2022
Authors: Gomari DP, Schweickart A, Cerchietti L, Paietta E, Fernandez H, Al-Amin H, Suhre K, Krumsiek J
Journal: Commun Biol
Volume: 5
Issue: 1
Pagination: 645
Date Published: 2022 Jun 30
ISSN: 2399-3642
Keywords: Diabetes Mellitus, Type 2; Humans; Metabolomics; Principal Component Analysis
Abstract

Dimensionality reduction approaches are commonly used for the deconvolution of high-dimensional metabolomics datasets into underlying core metabolic processes. However, current state-of-the-art methods are widely incapable of detecting nonlinearities in metabolomics data. Variational Autoencoders (VAEs) are a deep learning method designed to learn nonlinear latent representations which generalize to unseen data. Here, we trained a VAE on a large-scale metabolomics population cohort of human blood samples consisting of over 4500 individuals. We analyzed the pathway composition of the latent space using a global feature importance score, which demonstrated that latent dimensions represent distinct cellular processes. To demonstrate model generalizability, we generated latent representations of unseen metabolomics datasets on type 2 diabetes, acute myeloid leukemia, and schizophrenia and found significant correlations with clinical patient groups. Notably, the VAE representations showed stronger effects than latent dimensions derived by linear and non-linear principal component analysis. Taken together, we demonstrate that the VAE is a powerful method that learns biologically meaningful, nonlinear, and transferrable latent representations of metabolomics data.

DOI: 10.1038/s42003-022-03579-3
Alternate Journal: Commun Biol
PubMed ID: 35773471
PubMed Central ID: PMC9246987
Grant List:
/ WT_ / Wellcome Trust / United Kingdom
U19 AG063744 / AG / NIA NIH HHS / United States
U24 CA196172 / CA / NCI NIH HHS / United States
/ DH_ / Department of Health / United Kingdom
UG1 CA189859 / CA / NCI NIH HHS / United States
/ MRC_ / Medical Research Council / United Kingdom
U10 CA180820 / CA / NCI NIH HHS / United States