Integrative mapping of the dog epigenome: reference annotation for comparative intertissue and cross-species studies
Keun Hong Son1,2,3†, Mark Borris Aldonza1,2,3†, A-reum Nam1,2,3†, Kang-Hoon Lee1,3, Jeong-Woon Lee1,2,3, Kyung-Ju Shin1,3, Keunsoo Kang4, and Je-Yoel Cho1,2,3*
1 Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
2 Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
3 BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
4 Department of Microbiology, College of Natural Sciences, Dankook University, Cheonan, Korea
† These authors contributed equally to this work as co-first authors: newhong@snu.ac.kr, borris@snu.ac.kr and arbjlvz@snu.ac.kr
* Corresponding author: jeycho@snu.ac.kr
Dogs have become a valuable model for exploring multifaceted diseases and biology relevant to human health. Despite large-scale dog genome projects producing high-quality draft references, a comprehensive annotation of functional elements is still lacking. We addressed this through integrative next-generation sequencing of transcriptomes paired with profiling of five histone marks and the DNA methylome across 11 tissue types. By defining distinct chromatin states, super-enhancer landscapes, and methylome landscapes, we deciphered the dog’s epigenetic code and showed that these regions are associated with a wide range of biological functions and with cell/tissue identity. In addition, we confirmed that phenotype-associated variants are enriched in tissue-specific regulatory regions and that the tissue of origin of the variants can therefore be traced. Ultimately, we delineated conserved and dynamic epigenomic changes at tissue- and species-specific resolution. Our study provides an epigenomic blueprint of the dog that can be used for comparative biology and medical research.
Paper: Science Advances
Data repository: Accession: GSE203107 (Updated: 230711)
Data browser: UCSC_trackhub (currently being optimized because of an SSL certificate problem on the server)
A transcript contig is a region of a transcript that is covered by RNA-seq reads. Contigs therefore make it possible to study both known and novel sites of transcription without depending on genome annotations. To comprehensively profile the transcriptome of genic regions in the dog genome, we performed RNA-seq experiments in 11 dog tissues with two biological replicates each. Strand-specific contig regions were defined from the RNA-seq data of all 11 tissues using the approach described by Djebali et al.
Download: RNA_Contigs
- A comparative encyclopedia of DNA elements in the mouse genome, Nature, 2014
- Landscape of transcription in human cells, Nature, 2012
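As a rough illustration of what a strand-specific contig is, the sketch below merges overlapping covered intervals on the same chromosome and strand into contiguous regions. It is a minimal sketch only and does not attempt to reproduce the approach of Djebali et al.; the function name `merge_into_contigs`, the `max_gap` parameter, and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def merge_into_contigs(intervals, max_gap=0):
    """Merge covered intervals into strand-specific contigs.

    intervals: iterable of (chrom, start, end, strand), half-open coordinates.
    max_gap:   hypothetical parameter; intervals on the same strand that are
               separated by at most this many bases are joined.
    """
    # Group intervals so that the two strands are never merged together.
    by_key = defaultdict(list)
    for chrom, start, end, strand in intervals:
        by_key[(chrom, strand)].append((start, end))

    contigs = []
    for (chrom, strand), spans in sorted(by_key.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + max_gap:
                # Overlapping or adjacent: extend the current contig.
                cur_end = max(cur_end, end)
            else:
                contigs.append((chrom, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        contigs.append((chrom, cur_start, cur_end, strand))
    return contigs


# Made-up covered regions, for illustration only.
covered = [
    ("chr1", 100, 200, "+"),
    ("chr1", 150, 300, "+"),
    ("chr1", 250, 400, "-"),
    ("chr1", 500, 600, "+"),
]
contigs = merge_into_contigs(covered)
```

Keeping the strand in the grouping key is what makes the contigs strand-specific: the `-` interval overlaps the `+` intervals in coordinates but stays a separate contig.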
To deconvolute this transcript abundance, we functionally characterized the inter-tissue coding transcriptomes of the dog. We classified the pool of expressed genes according to their tissue-specificity.
Download: Categorized genes according to tissue specificity
- Tissue-based map of the human proteome, Science, 2015
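To make the idea of tissue-specificity categories concrete, the sketch below assigns a gene to a coarse category from its expression across tissues. The category names, the `min_expr` and `fold` thresholds, and the tissue names are illustrative assumptions, not the criteria or values used in this study.

```python
def classify_gene(expr, min_expr=1.0, fold=5.0):
    """Assign one gene to a coarse tissue-specificity category.

    expr:     dict mapping tissue name -> expression value (e.g. TPM).
    min_expr: assumed detection threshold (illustrative, not from the study).
    fold:     assumed enrichment fold over the next-highest tissue (illustrative).
    Returns (category, tissue), where tissue is None unless one tissue dominates.
    """
    ranked = sorted(expr.items(), key=lambda kv: kv[1], reverse=True)
    top_tissue, top_value = ranked[0]

    if top_value < min_expr:
        return "not detected", None

    second_value = ranked[1][1] if len(ranked) > 1 else 0.0
    if top_value >= fold * second_value:
        return "tissue enriched", top_tissue

    if all(value >= min_expr for value in expr.values()):
        return "expressed in all", None

    return "mixed", None


# Placeholder tissue names and values, for illustration only.
examples = {
    "geneA": {"liver": 120.0, "kidney": 4.0, "lung": 2.0},
    "geneB": {"liver": 10.0, "kidney": 8.0, "lung": 12.0},
    "geneC": {"liver": 0.2, "kidney": 0.1, "lung": 0.0},
    "geneD": {"liver": 10.0, "kidney": 6.0, "lung": 0.1},
}
categories = {gene: classify_gene(expr) for gene, expr in examples.items()}
```

The order of the checks matters: detection is tested first so that genes with uniformly low signal are not mislabelled as enriched in whichever tissue happens to rank highest.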
Based on the expression of 12,551 protein-coding orthologs shared by human, mouse, and dog, we estimated expression divergence among these species across nine matched tissues (ENCODE data and our dataset).
Download: Conserved gene lists
- A comparative encyclopedia of DNA elements in the mouse genome, Nature, 2014
- Comparative transcriptomics in human and mouse, Nature Reviews Genetics, 2017
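One common way to quantify expression divergence between two samples is one minus the Spearman rank correlation of ortholog expression values. The sketch below assumes that measure purely for illustration; the divergence metric actually used in this study is not stated here, and the function names and example values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def expression_divergence(expr_a, expr_b):
    """Assumed metric: 1 - Spearman correlation over matched orthologs."""
    return 1.0 - pearson(average_ranks(expr_a), average_ranks(expr_b))


# Made-up expression values for four orthologs in one matched tissue.
dog_tissue = [5.0, 1.0, 9.0, 3.0]
human_tissue = [6.0, 2.0, 8.0, 1.0]
divergence = expression_divergence(dog_tissue, human_tissue)
```

Both lists must be ordered by the same ortholog pairs; a value near 0 indicates conserved expression ranks and larger values indicate greater divergence.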
To advance the functional annotation of the dog genome, we produced integrated, genome-wide maps of a 13-state chromatin model, informed by histone modifications, in 11 dog tissues. The model is built on a core set of five histone H3 modification marks (H3K4me3, H3K4me1, H3K27ac, H3K27me3, and H3K9me3), which are well known to be deposited at particular genomic regions and to carry specific molecular signal associations (e.g., promoters, enhancers, heterochromatin, and Polycomb repressive domains).
Download: Chromatin states canFam3.1
Converted from canFam3.1 to other assemblies using LiftOver: canFam4, canFam5, canFam6
- ChromHMM: automating chromatin-state discovery and characterization, Nature Methods, 2012
- Integrative analysis of 111 reference human epigenomes, Nature, 2015
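A chromatin-state segmentation is typically distributed as BED-like intervals labelled with a state. The sketch below summarizes what fraction of the annotated bases each state covers. It assumes a four-column, tab-separated layout (chromosome, start, end, state label); the actual layout of the downloadable files and the state labels used here (`E1`, `E5`, `E13`) are assumptions.

```python
from collections import Counter


def state_coverage(bed_lines):
    """Return {state: fraction of annotated bases} from BED-like lines.

    Assumes tab-separated columns: chrom, start, end, state label.
    """
    covered = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, state = line.split("\t")[:4]
        covered[state] += int(end) - int(start)

    total = sum(covered.values())
    if total == 0:
        return {}
    return {state: bases / total for state, bases in covered.items()}


# Made-up segmentation with hypothetical state labels.
example_lines = [
    "track name=example",
    "chr1\t0\t600\tE13",
    "chr1\t600\t800\tE1",
    "chr1\t800\t1000\tE5",
]
fractions = state_coverage(example_lines)
```

To apply this to a real file, the lines could be read with `open(path)` and passed in directly, since the function only iterates over strings.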
To further probe tissue identity and function based on H3K27ac signals, a strong indicator of active promoter and enhancer states, we characterized super-enhancer (SE) landscapes in the dog genome across multiple tissues.
Download: Super enhancer
- Master Transcription Factors and Mediator Establish Super-Enhancers at Key Cell Identity Genes, Cell, 2013
- Selective Inhibition of Tumor Oncogenes by Disruption of Super-enhancers, Cell, 2013
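Super-enhancers are commonly separated from typical enhancers by ranking enhancers by signal and cutting where the ranked curve turns sharply upward. The sketch below illustrates that rank-ordering idea under stated assumptions: both axes are scaled to the range 0 to 1, and the cutoff is taken at the point lying farthest below the diagonal, which approximates where the curve has a slope of 1. It is not the pipeline used in this study, and the enhancer identifiers and signal values are made up.

```python
def call_super_enhancers(signal_by_enhancer):
    """Return (cutoff, super_enhancer_ids) from a dict of id -> signal.

    Enhancers are ranked by ascending signal; the cutoff is the signal at the
    ranked point farthest below the diagonal after scaling both axes to [0, 1].
    Enhancers with signal strictly above the cutoff are reported.
    """
    items = sorted(signal_by_enhancer.items(), key=lambda kv: kv[1])
    n = len(items)
    if n < 2:
        return None, []

    lowest, highest = items[0][1], items[-1][1]
    if highest == lowest:
        # A flat curve has no inflection point to cut at.
        return None, []

    def distance_below_diagonal(i):
        x = i / (n - 1)                                  # scaled rank
        y = (items[i][1] - lowest) / (highest - lowest)  # scaled signal
        return x - y

    best_index = max(range(n), key=distance_below_diagonal)
    cutoff = items[best_index][1]
    supers = [name for name, signal in items if signal > cutoff]
    return cutoff, supers


# Made-up H3K27ac signal values for ten hypothetical enhancers.
signals = {
    "e1": 1, "e2": 2, "e3": 3, "e4": 4, "e5": 5,
    "e6": 6, "e7": 8, "e8": 12, "e9": 40, "e10": 100,
}
cutoff, super_enhancers = call_super_enhancers(signals)
```

In this toy input most enhancers have low signal and two stand far above the rest, so only those two fall above the cutoff.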
Methylation of cytosines in DNA is a prototypic, stable, nearly universal mechanism of the mammalian epigenome. In domestic dogs, DNA methylation studies have been performed but still lack epigenome-scale resolution. So far, public resources of functionally annotated dog genomes (e.g., BarkBase and DoGA) do not include methylome data. To profile the global DNA methylome landscape of the dog, we performed genome-wide MBD-seq experiments on 11 somatic tissues. In these experiments, captured and enriched genomic DNA fragments covering a CpG are used to assay the total amount of methylation for a locus about the size of the fragments; fragment size therefore dictates the resolution of association signals.
Download: Location of CMRs and tsDMRs
- MethylAction: detecting differentially methylated regions that distinguish biological subtypes, Nucleic Acids Res., 2016