Identification and visualization of functional differential m6A methylation genes (FDmMGenes) and single base DmM sites. We also developed the Funm6AViewer webserver which is freely available from http://funm6aviewer.rnamd.com/ .
Funm6AViewer depends on GenomicFeatures, Guitar, trackViewer, DESeq2, STRINGdb, TxDb.Hsapiens.UCSC.hg19.knownGene, org.Hs.eg.db R packages and please make sure they are installed before installing Funm6AViewer. An R version >= 3.6 is suggested.
Install the required packages
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(c("GenomicFeatures", "GenomicAlignments", "Rsamtools", "Guitar", "trackViewer", "DESeq2", "apeglm", "STRINGdb",
"TxDb.Hsapiens.UCSC.hg19.knownGene", "org.Hs.eg.db"), version = "3.10")
Note that if your are using an R version lower than 3.6, please install the corresponding Bioconductor version and if you installed a different version of Bioconductor packages rather than version 3.10, you should check the STRINGdb
supported version and assign it to version
parameter of funm6aviewer
function. For example, by default, the version
of funm6aviewer
is set as version = "10" which is corresponding to STRINGdb version = 10 and Bioconductor version = 3.10; if you installed Bioconductor version = 3.11, the corresponding STRINGdb version should be "11", then the version
should be set as version = "11".
Install Funm6AViewer
if (!requireNamespace("devtools", quietly = TRUE))
install.packages("devtools")
devtools::install_github("NWPU-903PR/Funm6AViewer")
To test the installation, please run the following toy example:
library(Funm6AViewer)
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
bamreadsgr <- system.file("extdata", "bamgrlist_toy.RData", package="Funm6AViewer")
load(bamreadsgr)
re <- coverageplot(dminfo = dminfo, grlist = grlist, intrested_gene = "MYC")
Funm6AViewer adopted FunDMDeep-m6A to idenify functional DmM genes which required 4 PPI networks. Then to run Funm6AViewer, users need to firstly download this data from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip
funm6aviewer
takes single base DmM sites information, gene DE information as input and output:
-
Functional DmM genes (FDmMGenes);
-
DmM sites distribution on RNA;
-
Counts of DmM sites on different RNA regions;
-
DmM sites and reads coverage on interested gene;
-
Function enrichment of FDmMGenes;
-
Context specific function of interested genes;
-
DmMGene's MSB score along with DE score;
-
Network of FDmMGene's MSB neighbors.
Following is an example to achieve these using funm6aviewer
:
library(Funm6AViewer)
Get input data:
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
deinfo <- system.file("extdata", "DEinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
deinfo <- read.delim(deinfo, header = TRUE, stringsAsFactors = FALSE)
dminfo
contains the position annotation and log2 foldchange of DmM sites. It can be extracted from the result of DMDeep-m6A package using summarydmdeepm6A
(see 9.2 for more details). Alternatively, users can use any other method to make it as the following formate:
head(dminfo)
## chr chromStart chromEnd name score strand log2fd
## 1 chr1 155160832 155160833 4582 0.9137714 - 2.7950754
## 2 chr1 171505224 171505225 23215 0.9431386 + 0.7589479
## 3 chr1 241767682 241767683 23596 0.8125095 - 1.0680801
## 4 chr1 243418399 243418400 9859 0.9652586 - 1.7805035
## 5 chr1 8073372 8073373 54206 0.8477583 - 2.5678589
## 6 chr1 8073689 8073690 54206 0.8089832 - 1.1375522
The 'name' column can be entrez gene ID or gene symbol.
deinfo
contains the differential expresion p-value and fdr for genes. It can be made using makegrreadsfrombam
and getdeinfo
(see 9.2 for more details), or users can use any other method to make it as the following formate:
head(deinfo)
## name pval padj log2FoldChange
## 1 1 7.578860e-01 8.990406e-01 -0.07382034
## 2 100 6.958592e-01 8.695820e-01 0.07649361
## 3 1000 4.155368e-06 6.420489e-05 -0.73757946
## 4 10000 2.043250e-02 9.424864e-02 0.40075777
## 5 100009676 4.524888e-01 7.148682e-01 -0.16186352
## 6 10001 6.708161e-01 8.566831e-01 0.08459812
The 'name' column can be entrez gene ID or gene symbol.
bamreadsgr
can be generated using makegrreadsfrombam
from the MeRIP-Seq data in bam formate (see 9.2 for more details).
bamreadsgr <- system.file("extdata", "bamgrlist_toy.RData", package="Funm6AViewer")
load(bamreadsgr)
siggene <- c("CCNT1", "MYC", "BCL2")
permutime <- 1000
The datapath
is the filepath where the required PPI data saved and the enrich_input_directory
is the filepath passed to string_db, the GO and KEGG function annotation data will be downloaded to this path. All these required data can be downloaded from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip
datapath <- "F:/Funm6A_package/data"
enrich_input_directory <- "F:/Funm6A_package/data"
savepath <- getwd()
re <- funm6aviewer(dminfo, deinfo, grlist, intrested_gene = siggene, permutime = permutime, version = "10",
datapath = datapath, enrich_input_directory = enrich_input_directory, savepath = savepath)
The version
parameter is passed to STRINGdb, it depends on which version is supported by STRINGdb. The results will be saved to savepath
.
If you are using other genomes, you need to install txdb and orgdb annotation for the corresponding genome. Taking mouse mm9 genome as an example, you should firstly install the genome annotation:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(c("TxDb.Mmusculus.UCSC.mm9.knownGene",
"org.Mm.eg.db"))
And assign the annotation to txdb
, orgdb
and orgsymbol
when running funm6aviewer:
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
library(org.Mm.eg.db)
re <- funm6aviewer(dminfo = dminfo,
deinfo = deinfo,
grlist = grlist,
intrested_gene = siggene,
txdb = TxDb.Mmusculus.UCSC.mm9.knownGene,
orgdb = org.Mm.eg.db,
orgsymbol = org.Mm.egSYMBOL,
permutime = permutime,
datapath = datapath,
enrich_input_directory = enrich_input_directory,
savepath = savepath)
Users can use dmsiteplot
to visaulize the DmM sites on their interested genes and their isoforms.
Get input:
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
siggene <- c("MYC")
Make plot:
re <- dmsiteplot(dminfo = dminfo, intrested_gene = siggene)
Users can use coverageplot
to visaulize the reads coverage of DmM sites on their interested genes.
Get input:
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
bamreadsgr <- system.file("extdata", "bamgrlist_toy.RData", package="Funm6AViewer")
load(bamreadsgr)
siggene <- c("MYC", "CCNT1")
Make plot:
re <- coverageplot(dminfo = dminfo, grlist = grlist, intrested_gene = siggene)
Users can zoom the reads coverage near the differential m6A sites by setting the zoom_region
as following:
re <- coverageplot(dminfo = dminfo, grlist = grlist, intrested_gene = "CCNT1", zoom_region = 100)
Then 100 bp up- and down-stream of the sites on gene CCNT1 will be zoomed.
If users only hase a list of DmM genes and their DE information, then they can use fdmdeepm6A
to identify functional DmM genes (FDmMGenes).
DMgene
is a group of DmM genes, can be gene symbol or entrez gene ID.
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
deinfo <- system.file("extdata", "DEinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
deinfo <- read.delim(deinfo, header = TRUE, stringsAsFactors = FALSE)
DMgene <- unique(dminfo$name)
descore <- getdescore(deinfo)
The datapath
is the filepath where the required PPI data saved and it can be downloaded from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip
datapath <- "F:/Funm6A_package/data"
permutime <- 1000
Identify FDmMGenes:
re <- fdmdeepm6A(DMgene = DMgene, descore = descore, datapath = datapath, permutime = permutime)
Plot interested genes' MSB score:
siggene <- c("CCNT1", "MYC", "BCL2")
siggenescoreplot(fdmgene = re, siggene = siggene)
Users can visualize the context specific function of interested FDmMGenes identified by FunDMDeep-m6A using siggenepathplot
.
fdmgene
is a group of identified FDmMGnes. siggene
is interested FDmMGne.
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
fdmgene <- unique(dminfo$name)
siggene <- c("MYC")
The input_directory
is the filepath passed to string_db, the GO and KEGG function annotation data will be downloaded to this path. Users can also donwloaded the annotation data previously from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip and set the input_directory
as where you save the data.
input_directory <- "F:/Funm6A_package/data"
re <- siggenepathplot(fdmgene = fdmgene, intrested_gene = siggene,
version = "11", input_directory = input_directory)
Users can visualize the MSB neighbours of interested FDmMGenes identified by FunDMDeep-m6A using msbnetplot
.
dmgene
is a group of DmM gnes. siggene
is interested FDmMGnes.
dminfo <- system.file("extdata", "DMinfo_toy.xls", package="Funm6AViewer")
deinfo <- system.file("extdata", "DEinfo_toy.xls", package="Funm6AViewer")
dminfo <- read.table(dminfo, header = TRUE, stringsAsFactors = FALSE)
deinfo <- read.delim(deinfo, header = TRUE, stringsAsFactors = FALSE)
dmgene <- unique(dminfo$name)
descore <- getdescore(deinfo)
siggene <- c("CCNT1", "MYC", "BCL2")
The datapath
is the filepath where the required PPI data saved and it can be downloaded from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip
datapath <- "F:/Funm6A_package/data"
Plot for one FDmMGne:
re <- msbnetplot(genesymbol = siggene[1], dmgene = dmgene, descore = descore, datapath = datapath)
plot for several FDmMGnes:
re <- msbnetplot(genesymbol = siggene, dmgene = dmgene, descore = descore, datapath = datapath,
savename = "InterestedGene")
We introduced this pipeline using a pretend example MeRIP-Seq data which contains 2 replicates in bam format for each condition (untreated and treated), named as:
"untreated_input_rep1.bam","untreated_ip_rep1.bam",
"untreated_input_rep2.bam", "untreated_ip_rep2.bam",
"treated_input_rep1.bam", "treated_ip_replicate1.bam",
"treated_input_rep2.bam", "treated_ip_rep2.bam".
These bam format files can be obtained by aligning the raw sequenced MeRIP-Seq data to genome, i.e., human hg19 using splice junction mapping tools for RNA-Seq reads, like TopHat, HISAT or STAR.
Users can firstly call single base differential m6A methylation (DmM) sites using DMDeepm6A R package (https://github.com/NWPU-903PR/DMDeepm6A1.0) as following:
library(DMDeepm6A)
ip_bams <- c("treated_ip_rep1.bam"," treated_ip_rep2.bam",
"untreated_ip_rep1.bam ", "untreated_ip_rep2.bam")
input_bams <- c("treated_input_rep1.bam","treated_input_rep2.bam",
"untreated_input_rep1.bam", "untreated_input_rep2.bam")
sample_condition <- c("treated", "treated", "untreated", "untreated")
Assuming the aligned genome is hg19, users can call DmM sites as following:
output_filepath <- getwd()
re <- dmdeepm6A(ip_bams = ip_bams,
input_bams = input_bams,
sample_conditions = sample_condition,
output_filepath = output_filepath,
experiment_name = "DMDeepm6A_out")
See ?dmdeepm6A
for more details if you are using other genomes. The experiment_name
is the name of the file folder where saved the result of dmdeepm6A
.
Funm6AViewer takes single base DmM sites information, gene DE information as input and a list of GRanges
converted from MeRIP-Seq bam files using makegrreadsfrombam
if users would like to see the reads coverage of interested DmM sites on gene. Single base DmM sites information named dminfo
contains the position annotation and log2 foldchange of DmM sites. It can be extracted from the result of DMDeepm6A
using summarydmdeepm6A
as following:
dminfo <- summarydmdeepm6A(dmpath = " DMDeepm6A_out", sigthresh = 0.05)
dminfo
contains the following information:
head(dminfo)
## chr chromStart chromEnd name score strand log2fd
## 1 chr1 155160832 155160833 4582 0.9137714 - 2.7950754
## 2 chr1 171505224 171505225 23215 0.9431386 + 0.7589479
## 3 chr1 241767682 241767683 23596 0.8125095 - 1.0680801
## 4 chr1 243418399 243418400 9859 0.9652586 - 1.7805035
## 5 chr1 8073372 8073373 54206 0.8477583 - 2.5678589
## 6 chr1 8073689 8073690 54206 0.8089832 - 1.1375522
The ‘name’ column can be entrez gene ID or gene symbol.
Gene DE information named deinfo
contains the differential expresion p-value and fdr for genes. It can be made using makegrreadsfrombam
and getdeinfo
from MeRIP-seq input samples as following:
ip_bams <- c("treated_ip_rep1.bam"," treated_ip_rep2.bam",
"untreated_ip_rep1.bam ", "untreated_ip_rep2.bam")
input_bams <- c("treated_input_rep1.bam","treated_input_rep2.bam",
"untreated_input_rep1.bam", "untreated_input_rep2.bam")
sample_condition <- c("treated", "treated", "untreated", "untreated")
savepath <- getwd()
grlist <- makegrreadsfrombam(IP_bams = ip_bams,
Input_bams = input_bams,
condition = sample_condition,
minimal_alignment_MAPQ = 30,
txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
savepath = savepath)
deinfo <- getdeinfo(grlist = grlist,
txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
savepath = savepath)
The converted grlist
will be saved to savepath
named as " bamgrlist.RData ". Users can directly load it to use next time. If you are using other genomes please install the corresponding txdb annotation similar to TxDb.Hsapiens.UCSC.hg19.knownGene
of the genome and assign it to txdb
.
deinfo
contains the following information:
head(deinfo)
## name pval padj log2FoldChange
## 1 1 7.578860e-01 8.990406e-01 -0.07382034
## 2 100 6.958592e-01 8.695820e-01 0.07649361
## 3 1000 4.155368e-06 6.420489e-05 -0.73757946
## 4 10000 2.043250e-02 9.424864e-02 0.40075777
## 5 100009676 4.524888e-01 7.148682e-01 -0.16186352
## 6 10001 6.708161e-01 8.566831e-01 0.08459812
The ‘name’ column can be entrez gene ID or gene symbol.
Following is an example to achieve Functional m6A analysis using funm6aviewer
with previously generated dminfo
, deinfo
and grlist
:
siggene <- c("CCNT1", "MYC", "BCL2")
permutime <- 100*length(unique(dminfo$name))
The datapath
is the file path where the required PPI data saved and the enrich_input_directory
is the file path passed to string_db
, the GO and KEGG function annotation data will be downloaded and saved to this path. All these required data can be downloaded from http://180.208.58.66/Funm6AViewer/Download/Funm6AViewer_data.zip
datapath <- "~./Funm6AViewer_data"
enrich_input_directory <- "~./Funm6AViewer_data"
savepath <- getwd()
re <- funm6aviewer(dminfo = dminfo,
deinfo = deinfo,
grlist = grlist,
intrested_gene = siggene,
permutime = permutime,
datapath = datapath,
enrich_input_directory = enrich_input_directory,
savepath = savepath)
The results will be saved to savepath
.