Target gene sources
|
VARAdb
|
VARAdb is a comprehensive human variation annotation database that aims to provide a large number of variations and annotate their potential roles with extensive regulatory information.
The current version of VARAdb catalogs a total of 577,283,813 variations and provides annotations including motif changes, risk SNPs, LD SNPs, eQTLs, clinical variant-drug-gene pairs, sequence conservation, somatic mutations, etc.
|
http://www.licpathway.net/VARAdb/index.php
|
download
|
download |
GeneHancer
|
GeneHancer is a database of genome-wide enhancer-to-gene and promoter-to-gene associations, embedded in GeneCards. It integrates a total of 434,000 reported enhancers from four genome-wide databases: the Encyclopedia of DNA Elements (ENCODE), the Ensembl regulatory build, the Functional Annotation of the Mammalian Genome (FANTOM) project and the VISTA Enhancer Browser.
|
http://www.genecards.org/
|
Not available
|
download |
JEME
|
JEME is a method for determining the target genes of transcriptional enhancers in specific cells and tissues. It combines global trends across many samples with sample-specific information, and considers the joint effect of multiple enhancers.
In the original paper, the authors reconstructed the enhancer–target networks in 935 samples of human primary cells, tissues and cell lines, which they describe as by far the largest set of enhancer–target networks.
|
http://yiplab.cse.cuhk.edu.hk/jeme/
|
download
|
download |
Regulatory elements sources
|
dbSUPER
|
dbSUPER is the first integrated and interactive database of super-enhancers. It contains 82,234 super-enhancers in 102 human and 25 mouse tissue/cell types.
|
http://www.asntech.org/dbsuper/
|
download
|
download |
EnhancerAtlas
|
EnhancerAtlas provides enhancer annotation in nine species: human (hg19), mouse (mm9), fly (dm3), worm (ce10), zebrafish (danRer10), rat (rn5), yeast (sacCer3), chicken (galGal4), and boar (susScr3). The consensus enhancers were predicted from multiple high-throughput experimental datasets (e.g. histone modification, CAGE, GRO-seq, transcription factor binding and DHS). The updated database currently contains 13,494,603 enhancers for 586 tissue/cell types.
|
http://enhanceratlas.org/
|
download
|
download |
SEA
|
SEA stores the predicted super-enhancers and enhancers of 11 species, expands the number of cell/tissue/disease types from 134 to 246, and includes experimentally identified and confirmed super-enhancers. It lists the functional compositional organization of each super-enhancer through peak calling based on Hi-C data, and reports the cell-type/tissue/disease specificity of each super-enhancer as a quantitative entropy value.
|
http://sea.edbc.org/
|
download
|
download |
SEdb
|
Super-enhancers are large clusters of enhancers with a higher degree of enrichment for TFs, higher levels of transcription and stronger cell-type specificity. We downloaded 331,146 super-enhancers from SEdb, identified from H3K27ac ChIP-seq samples.
|
http://www.licpathway.net/sedb/
|
download
|
download |
ChromHMM
|
ChromHMM is software for learning and characterizing chromatin states. It can integrate multiple chromatin datasets, such as ChIP-seq data for various histone modifications, to discover de novo the major recurring combinatorial and spatial patterns of marks.
Chromatin states were predicted using the core 25-state ChromHMM model trained on the imputed data for 12 marks, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K27ac, H4K20me1, H3K79me2, H3K36me3, H3K9me3, H3K27me3, H2A.Z, and DNase I hypersensitive sites (DHSs), across all 127 reference epigenomes.
|
http://compbio.mit.edu/ChromHMM/
|
download
|
download |
Chromatin interaction sources
|
ENCODE
|
Physical interactions between distal regulatory elements have a key role in regulating gene expression, but the extent to which these interactions vary between cell types and contribute to cell-type-specific gene expression remains unclear.
This study mapped cohesin-mediated chromatin loops, using chromatin interaction analysis by paired-end tag sequencing (ChIA-PET), and analysed gene expression in 24 diverse human cell types, including core ENCODE cell lines.
|
Landscape of cohesin-mediated chromatin loops in the human genome
|
download
|
download |
4DGenome
|
Records in 4DGenome are compiled through comprehensive literature curation of experimentally-derived and computationally-predicted interactions. The current release contains 4,433,071 experimentally-derived and 3,605,176 computationally-predicted interactions in 5 organisms.
|
https://4dgenome.research.chop.edu/
|
download
|
download |
OncoBase
|
OncoBase employed EpiTensor to obtain 25,222,085 high-resolution (~200 bp) chromatin interactions, comprising 2,847,794 promoter-promoter, 5,691,699 enhancer-promoter and 16,682,592 enhancer-enhancer interactions.
|
http://www.oncobase.biols.ac.cn
|
download
|
download |
Chromatin accessibility sources
|
ATACdb
|
ATACdb documents a total of 52,078,883 regions from over 1,400 chromatin accessibility ATAC-seq samples. These samples were manually curated from more than 2,200 chromatin accessibility samples associated with ATAC-seq data in the NCBI GEO/SRA databases.
|
http://www.licpathway.net/ATACdb/
|
download
|
download |
Cistrome
|
Open chromatin, which can be identified using ATAC-seq and DNase-seq, is reported to be enriched for regulatory elements and to harbor variations that regulate distal genes, contributing to heterogeneity. We cataloged the accessible regions of 99 ATAC-seq samples from Cistrome.
|
http://cistrome.org/db/
|
Not available
|
download |
TCGA_ATAC
|
Open chromatin, which can be identified using ATAC-seq and DNase-seq, is reported to be enriched for regulatory elements and to harbor variations that regulate distal genes, contributing to heterogeneity. We cataloged the accessible regions of ATAC-seq samples across 23 cancer types from The Cancer Genome Atlas (TCGA).
|
https://portal.gdc.cancer.gov/
|
download
|
download |
ENCODE_DHS
|
Open chromatin, which can be identified using ATAC-seq and DNase-seq, is reported to be enriched for regulatory elements and to harbor variations that regulate distal genes, contributing to heterogeneity. We downloaded these data from VARAdb, which cataloged the accessible regions of 243 DNase-seq samples from ENCODE.
|
https://www.encodeproject.org/
|
download
|
download |
Epigenetic regulation sources
|
TF (transcription factor)
|
We downloaded these data from VARAdb, which collected a total of 7,734 TF ChIP-seq samples from ENCODE, ReMap, Cistrome, ChIP-Atlas and GTRD, from which 761 TFs were obtained.
|
http://www.licpathway.net/VARAdb/
|
download
|
download |
HM_ENCODE
|
From ENCODE and Roadmap, we obtained histone modifications (H3K36me3, H3K4me1, H3K4me3, H3K79me2, H4K20me1 and H3K9ac) from 686 ChIP-seq samples.
|
https://www.encodeproject.org/
|
download
|
download |
HM_Roadmap
|
From ENCODE and Roadmap, we obtained histone modifications (H3K36me3, H3K4me1, H3K4me3, H3K79me2, H4K20me1 and H3K9ac) from 686 ChIP-seq samples.
|
http://www.roadmapepigenomics.org/
|
download
|
download |
Genetic variants and eQTL sources
|
OncoBase
|
OncoBase collected somatic mutations from four databases: 1,823,257 somatic mutations in 36 cancer types from TCGA, 77,462,290 somatic mutations in 84 cancer projects from ICGC, 21,392,393 somatic mutations from COSMIC, and 345,849 clinical variants from ClinVar. In total, it contains 81,385,242 somatic mutations in 68 cancer types from more than 120 cancer projects.
|
http://www.oncobase.biols.ac.cn
|
download
|
download |
Gene4Denovo
|
Gene4Denovo integrated 670,082 de novo mutations (DNMs), including 73,856 coding DNMs from 58,011 individuals, across 28 types of phenotypes.
|
http://www.genemed.tech/gene4denovo/
|
download
|
download |
dbSNP
|
We obtained variants from dbSNP v155, which contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions, along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations and clinical mutations.
|
https://www.ncbi.nlm.nih.gov/snp/
|
download
|
download |
gnomAD
|
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects, and making summary data available for the wider scientific community.
|
https://gnomad.broadinstitute.org/
|
download
|
download |
GTEx_eQTL
|
The GTEx (Genotype-Tissue Expression) Project identified genetic variants that influence how genes are turned on and off in human tissues and organs. Genetic variants that influence gene expression in this way are called expression quantitative trait loci (eQTLs).
We collected a total of 71,478,479 significant eQTLs (FDR ≤ 0.05) in 49 human tissues from GTEx version 8.
|
https://gtexportal.org/
|
download
|
download |
GWAS_Catalog
|
Genome-wide association studies provide a large amount of data associating genetic variants with diseases and phenotypes. We collected 272,608 risk SNPs associated with diseases, traits or phenotypes from the NHGRI-EBI Catalog of human genome-wide association studies (GWAS Catalog) v1.0.
|
https://www.ebi.ac.uk/gwas/
|
download
|
download |
GWASdb
|
Genome-wide association studies provide a large amount of data associating genetic variants with diseases and phenotypes. We collected 314,239 risk SNPs associated with diseases, traits or phenotypes from GWASdb v2.0.
|
http://jjwanglab.org/gwasdb
|
download
|
download |
PancanQTL
|
PancanQTL aims to comprehensively provide cis-eQTLs (SNPs that affect local gene expression) and trans-eQTLs (SNPs that affect distant gene expression) in 33 cancer types from The Cancer Genome Atlas (TCGA).
We collected 5,837,775 significant (FDR ≤ 0.05) eQTL-gene pairs in 33 cancer types from the PancanQTL database.
|
http://gong_lab.hzau.edu.cn/PancanQTL/
|
download
|
download |
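
The FDR ≤ 0.05 cutoff quoted for the GTEx_eQTL and PancanQTL entries above can be illustrated with a minimal Python sketch. The column names (`snp`, `gene`, `tissue`, `fdr`), the record values and the helper `significant_eqtls` are hypothetical and chosen only for illustration; they do not reflect the actual file layout of either source.

```python
import csv
import io

FDR_CUTOFF = 0.05  # threshold quoted in the eQTL entries above


def significant_eqtls(rows, cutoff=FDR_CUTOFF):
    """Keep eQTL records whose FDR is at or below the cutoff."""
    return [row for row in rows if float(row["fdr"]) <= cutoff]


# Hypothetical tab-separated records; identifiers and column names are
# illustrative assumptions, not real GTEx or PancanQTL data.
EXAMPLE_TSV = (
    "snp\tgene\ttissue\tfdr\n"
    "rs0000001\tGENE_A\tLiver\t0.01\n"
    "rs0000002\tGENE_B\tLiver\t0.20\n"
    "rs0000003\tGENE_C\tLung\t0.05\n"
)

records = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
kept = significant_eqtls(records)
print([row["snp"] for row in kept])
```

The comparison is inclusive (`<=`), matching the "FDR ≤ 0.05" wording, so a record sitting exactly at the cutoff is retained.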