Increasing the usability of big data for Alzheimer's research
On October 11, 2016, the first manuscript describing a treasure trove of genomic data contributed by members of the Accelerating Medicines Partnership for Alzheimer’s Disease (AMP AD) Target Discovery and Preclinical Validation Consortium was published in Nature Scientific Data. The publication of the datasets and their description is part of an NIH-wide effort to bring together big data and experts from diverse disciplines to better understand dementia, as well as other chronic conditions.
The datasets, made available via the AMP AD Knowledge Portal, comprise whole-genome genotype and gene expression patterns for 2,655 individuals, including people with and without dementia. Together they contain more than 842 million data points, along with clinical information. The Nature Scientific Data report provides a detailed description of the subjects, samples, data generation, and quality control, as well as instructions on how to access the datasets. This documentation will enable researchers to use the data for replication research or for new analyses to study disease mechanisms and to discover new therapeutic targets for AD and related dementias.
The Consortium members involved in this data distribution include the Mayo Clinic in Jacksonville, the University of Florida, the Institute for Systems Biology, and Sage Bionetworks; the study’s lead researchers are Drs. Mariet Allen and Minerva M. Carrasquillo, both at the Mayo Clinic’s Florida campus.
Why are data descriptor manuscripts important?
Data descriptors are a fairly new type of publication. They provide detailed descriptions of research datasets, including the methods used to collect the data and details on the quality control. They promote broad reuse of valuable datasets and provide attribution to the researchers who designed the study and generated the data.
A unique collaborative effort
The AMP AD Target Discovery and Preclinical Validation Consortium is a large-scale team science effort that brings together six multi-institutional, multidisciplinary teams within a precompetitive, public-private partnership research framework. The teams are applying cutting-edge systems and network biology approaches to integrate multidimensional human “omic” (genomic, proteomic, and metabolomic) data from more than 2,000 human brains at all stages of the disease with clinical and pathological data. These efforts are paired with experimental validation studies in a variety of cell-based and animal models.
Various types of omics datasets from human brain and blood samples, as well as data from cell-based and animal models, are being released through the AMP AD Knowledge Portal and made available to all qualified researchers. No publication embargo is imposed on the reuse of data after they have been made available through the Portal.
If you’re involved in research on genetics or systems biology or if you’re studying disease mechanisms of AD and related dementias, I encourage you to go to the Knowledge Portal and see which of the many datasets and analytical results may be useful for your studies or to find out how you can contribute data and analyses. Let us know what you think by commenting below.