In the culmination of a multiyear project to identify the chemical modifications of DNA and its associated proteins that regulate gene expression, members of the Roadmap Epigenomics Consortium today (February 18) published their analysis of 111 human epigenomes in Nature. The National Institutes of Health (NIH)-funded team compared histone and DNA methylation, DNA accessibility, and other marks such as histone acetylation among the 111 epigenomes as well as 16 previously released annotated epigenomes from the Encyclopedia of DNA Elements (ENCODE) project. The analysis was accompanied by articles examining the patterns of chemical markers and chromatin structure in stem cells, Alzheimer’s disease, and cancer.
“It is definitely a milestone,” said Kristian Helin of the University of Copenhagen, who was not involved in the research. “It should mostly be credited for the enormous amount of work . . . that hopefully will serve as a very good guide for epigenome studies in the future.”
“The human epigenome is this collection of . . . chemical modifications on the DNA itself and on the packaging that holds DNA together,” explained study coauthor Manolis Kellis of MIT during a press conference. “All our cells have a copy of the same book, but they’re all reading different chapters, bookmarking different pages, and highlighting different paragraphs and words.” These chemical bookmarks, such as methylation and acetylation, help control which genes are transcribed into RNA and expressed in a given cell type, thus aiding the maintenance of a particular cell’s identity.
Although members of the ENCODE project had already performed some epigenome mapping as part of their effort to annotate the human genome, the Roadmap group has expanded this work into previously uncharted tissues in organs such as the brain and the heart. Much of the data collected—from more than 2,800 experiments that examined 150 billion genome fragments—was already publicly available; now, the analyzed and annotated versions of the data will also be accessible through an online database.
“At times, we need a control or reference, a baseline, and now we can just go here and download this data, and use that as a baseline for our experiments, and that’s important,” said Manel Esteller of the Bellvitge Biomedical Research Institute in Spain, who was not involved in the study.
Each of the Roadmap reference epigenomes annotates the placement of a core set of marks associated with particular functions, such as modifications indicative of gene promoter regions, actively expressed genes, repressed genes, and inactive heterochromatin regions. Many epigenomes also contain additional information, such as RNA transcript sequences, to highlight genes that are actively expressed in a given cell type.
By comparing the annotated epigenomes, the researchers were able to begin to “compare different tissues and cell types to each other at the molecular level and to understand what makes them different,” said Kellis. This allowed them to determine, for example, that the epigenomes of cells derived from embryonic stem cells more closely resembled those of their cells of origin than those of the mature tissues they became. Additionally, the epigenetic marks in the gene enhancer regions of the neurons of Alzheimer’s patients more closely matched immune system enhancer patterns than typical neural patterns.
While the NIH-led Roadmap group does not plan to produce more epigenomic data, the studies published today represent the first major contribution to the International Human Epigenome Consortium’s eventual goal of generating 1,000 reference epigenomes. “They will continue to add new cell types, new marks, and deposit that type of data in public databases,” said study coauthor Lisa Chadwick of NIH’s National Institute of Environmental Health Sciences, referring to the international consortium. Already, Esteller said he looks forward to updated reference epigenomes with annotations for new marks and correlations with non-coding RNA, while Helin noted that technological advances toward single-cell epigenome mapping could provide a greater degree of precision.
In the meantime, Kellis and his colleagues have unlocked the predictive power of their data set by applying it to related unmapped tissues. “As new marks are profiled, their correlation structure with existing marks, and the relationship between closely related cell types to each other, will allow us to actually predict the missing marks,” Kellis told reporters. “There’s a phase transition that happens when you have such a large number of data sets that are already mapped.”