High-resolution in situ hybridization (ISH) images of the brain capture spatial gene expression at cellular resolution. These spatial profiles are key to understanding brain organization at the molecular level. Previously, manual qualitative scoring and informatics pipelines have been applied to ISH images to determine expression intensity and pattern. To better capture the complex patterns of gene expression in the human cerebral cortex, we applied a machine learning approach. We propose gene re-identification as a contrastive learning task to compute representations of ISH images. We train our model on an ISH dataset of ~1,000 genes obtained from postmortem samples from 42 individuals. This model reaches a gene re-identification rate of 38.3%, a 13x improvement over random chance. We find that the learned embeddings predict expression intensity and pattern. To test generalization, we generated embeddings in a second dataset that assayed the expression of 78 genes in 53 individuals. In this set of images, 60.2% of genes are re-identified, suggesting the model is robust. Importantly, this dataset assayed expression in individuals diagnosed with schizophrenia. Gene- and donor-specific embeddings from the model predict schizophrenia diagnosis at levels similar to those reached with demographic information. Mutations in the most discriminative gene, Sodium Voltage-Gated Channel Beta Subunit 4 (SCN4B), may help explain cardiovascular associations with schizophrenia and its treatment. We have publicly released our source code, embeddings, and models to spur further application to spatial transcriptomics. In summary, we propose and evaluate gene re-identification as a machine learning task to represent ISH gene expression images.
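
As a rough illustration of how a gene re-identification rate could be scored from image embeddings, the sketch below counts an image as re-identified when its nearest other image in embedding space shows the same gene. The abstract does not state the exact retrieval rule, so the top-1 nearest-neighbour criterion, the Euclidean distance, the function name `reidentification_rate`, and the toy embeddings and gene labels are all assumptions made for illustration, not the published implementation.

```python
import math


def reidentification_rate(embeddings, gene_labels):
    """Fraction of images whose nearest *other* image shows the same gene.

    Assumption: top-1 nearest neighbour under Euclidean distance. The
    source abstract does not specify the retrieval rule or the metric.
    """
    if len(embeddings) != len(gene_labels):
        raise ValueError("need exactly one gene label per embedding")
    if len(embeddings) < 2:
        raise ValueError("need at least two images to retrieve a neighbour")

    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_dist = None, math.inf
        for j, candidate in enumerate(embeddings):
            if i == j:
                continue  # an image may not retrieve itself
            dist = math.dist(query, candidate)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if gene_labels[best_j] == gene_labels[i]:
            hits += 1
    return hits / len(embeddings)


# Hypothetical 2-D embeddings for five images; gene names are placeholders.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
toy_genes = ["GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C"]

# Four of the five toy images retrieve a same-gene neighbour.
rate = reidentification_rate(toy_embeddings, toy_genes)
```

In this toy set the single GENE_C image has no same-gene partner, so it can never be re-identified. Under this scoring rule, genes represented by only one image place a ceiling on the achievable rate.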

Original publication




Journal article


PLoS One

Publication Date