Over the past few years, single-cell genomics has revolutionized biology and medicine to the point that established knowledge of some biological systems and diseases has had to be revised. Classical methods that measure RNA, protein, DNA and other cell components in populations of cells report an average over many different cells, masking effects at the level of individual cells that are often crucial. In contrast, single-cell (sc) technologies capture these differences. The development of multimodal methods that couple leading-edge genomics, epigenomics and proteomics procedures at the sc level is the essence of the 37TrillionCells Initiative.
The 37TrillionCells Initiative is an ambitious, innovative and far-reaching research project in which scientists from various institutions collaborate to chart human gene expression programs at single-cell resolution, in order to better understand the biology of disease and develop more effective treatments. In addition to normal developmental processes, the focus is on cancer, neurological/neurodegenerative disorders, cardiometabolic diseases, and COVID-19-related immunological conditions. 37TrillionCells integrates multiomics, high-end imaging and machine learning to decipher gene expression programs longitudinally at single-cell resolution, in order to identify all cell types in specific tissues (or tissue models such as organoids and iPSCs) and track the developmental trajectory of cells from health to disease. The information gathered will serve to develop data-driven models that detect the onset of disease before tissue damage occurs and symptoms begin. Early detection has the enormous advantage of allowing treatment to start before the damage caused by the disease becomes too extensive. This strategy, named "cell-based interceptive medicine" (Rajewsky et al., 2020, Nature), promises to improve healthcare in the coming decade.
The main objectives of 37TrillionCells are fivefold.
Firstly, we are implementing a multi-disciplinary, multi-site research consortium that will operate, according to commonly adopted standard operating procedures (SOPs), leading-edge technologies for scDNA-seq (genome), scRNA-seq (transcriptome), scATAC-seq (epigenome), scProteomics (proteome) and multimodal methods such as scMulti-omics (epigenome and transcriptome) and CITE-seq (proteome and transcriptome), in order to longitudinally chart gene expression programs in numerous cells from health to disease. Each site of the consortium will lead the implementation (SOPs), transfer and leveraging of one or more selected procedures.
Secondly, we are charting human gene expression programs longitudinally at single-cell resolution in a range of biological systems, including normal developmental processes, cancer, neurological/neurodegenerative disorders, cardiometabolic diseases, and COVID-19. To do so, the 37TrillionCells Initiative applies strict ethical rules when recruiting subjects for specific studies supervised by clinics at participating institutions (for example the post-COVID clinic at the IRCM, the oncology and neurology clinics at the CRCHUM and CHU de Québec, and others).
Thirdly, we are developing novel sc technologies to map gene expression programs more finely, particularly scPHD-seq (proteome and transcriptome) and scPHAGE-ATAC (proteome and epigenome), which harness the power of phage display to improve profiling of the proteome in conjunction with scRNA-seq or scATAC-seq. scProteomics using mass spectrometry is also being developed. This work is important because proteins are more often directly involved in establishing disease phenotypes, and their levels do not always correlate with mRNA expression.
Fourthly, we are advancing the development of integrative computational methods, including machine learning/artificial intelligence tools, to map gene expression programs more accurately and comprehensively.
Fifthly, we are fostering a novel mode of partnership between basic scientists, clinicians and industrial partners, based on an open-science modus operandi (see below).
The COVID-19 pandemic has taught us that international collaboration and an open-science modus operandi accelerate research. The infrastructure we are developing for 37TrillionCells favours an open-science ecosystem in which results, protocols and reagents are freely and rapidly shared among laboratories. We believe that collaborations and partnerships are essential because surveying 37.2 trillion cells in health and disease cannot be achieved by one group, one institution, or even one country alone.
The cell is the fundamental unit of life. Understanding human biology and disease requires a comprehensive analysis of the gene expression programs of the 37.2 trillion cells that make up the human body. Developing more effective diagnostics and therapeutics requires the ability to track the trajectories of individual cells (gene programs and physical locations) from health to disease. The 37TrillionCells Initiative aims to contribute essential data to meet these important challenges.