Outputs are in Tables/h37_tss_selected_gr.bed and in Tables/clusters_selected_genes.csv
2 - fantom5_links.R
The links are in Tables/FANTOM5_files.txt and the full sample information table is in Tables/FANTOM5_files_info.tsv
3 - FANTOM5_files.sh
4 - init_CAGEfiles.R
Output to Tables/true_tss_activ_tissue.rds, a list with one data frame per cell type/tissue.
With Tables/true_tss_activ_tissue.rds as input, it first restricts the data to the promoters in Tables/clusters_selected_genes.csv.
Second, it summarizes the data from all samples to select the promoters that are active across multiple samples (Tables/active_promoters.tsv) and those that are highly tissue-specific (Tables/tissue_restricted/prom.tsv).
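The filtering described above can be sketched in base R. This is only an illustration on invented data: the column names `promoter` and `tpm`, the thresholds `min_tpm` and `min_tissues`, and the "active in exactly one tissue" rule for tissue restriction are all assumptions, not the criteria used by the actual script.

```r
# Toy stand-in for the list stored in Tables/true_tss_activ_tissue.rds:
# one data frame per tissue. Column names are assumptions for this sketch.
activity <- list(
  liver  = data.frame(promoter = c("p1", "p2", "p3"), tpm = c(10, 0, 5),
                      stringsAsFactors = FALSE),
  brain  = data.frame(promoter = c("p1", "p2", "p3"), tpm = c(12, 0, 0),
                      stringsAsFactors = FALSE),
  muscle = data.frame(promoter = c("p1", "p2", "p3"), tpm = c(9, 8, 0),
                      stringsAsFactors = FALSE)
)

# Hypothetical promoter list, standing in for the promoters of
# Tables/clusters_selected_genes.csv.
selected <- c("p1", "p2", "p3")

# Restrict every tissue to the selected promoters, then build a
# promoter x tissue activity matrix.
activity <- lapply(activity, function(df) df[df$promoter %in% selected, ])
expr <- sapply(activity, function(df) df$tpm[match(selected, df$promoter)])
rownames(expr) <- selected

min_tpm     <- 1  # assumed threshold for calling a promoter active
min_tissues <- 2  # assumed minimum number of tissues

# Number of tissues in which each promoter passes the threshold.
n_active <- rowSums(expr >= min_tpm)

# Promoters active across multiple tissues.
active_promoters <- names(n_active)[n_active >= min_tissues]

# Assumed rule for tissue restriction: active in exactly one tissue.
tissue_restricted <- names(n_active)[n_active == 1]
```

In a real run, `activity` would come from `readRDS("Tables/true_tss_activ_tissue.rds")` and the two result vectors would be written to the corresponding output tables.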
6 - distances.R
Output in Tables/active_promoters_df.csv.
First, it identifies the 95% confidence interval (CI95) of the width of the selected TSS clusters.
It then selects the pairs of promoters from each gene that lie within this genomic distance of each other.
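As a rough illustration of these two steps, the sketch below computes an empirical 95% interval of cluster widths and keeps the promoter pairs of each gene whose midpoints are no further apart than the upper bound. The toy coordinates, the column names, the use of percentiles for the interval, and the midpoint-based distance are assumptions made for the example; the script may define the interval and the distance differently.

```r
# Toy promoter table; coordinates and column names are invented.
promoters <- data.frame(
  gene  = c("A", "A", "A", "B", "B"),
  id    = c("A1", "A2", "A3", "B1", "B2"),
  start = c(100, 130, 900, 500, 520),
  end   = c(120, 160, 940, 510, 560),
  stringsAsFactors = FALSE
)
promoters$width <- promoters$end - promoters$start + 1
promoters$mid   <- (promoters$start + promoters$end) / 2

# Empirical 95% interval of the cluster widths (assumed definition).
width_interval <- quantile(promoters$width, probs = c(0.025, 0.975))

# All promoter pairs within each gene, with their midpoint distance.
pairs_by_gene <- lapply(split(promoters, promoters$gene), function(df) {
  if (nrow(df) < 2) return(NULL)
  idx <- t(combn(nrow(df), 2))  # one row per pair of row indices
  data.frame(
    gene = df$gene[idx[, 1]],
    p1   = df$id[idx[, 1]],
    p2   = df$id[idx[, 2]],
    dist = abs(df$mid[idx[, 1]] - df$mid[idx[, 2]]),
    stringsAsFactors = FALSE
  )
})
pairs <- do.call(rbind, pairs_by_gene)

# Keep the pairs that fall within the upper bound of the interval.
proximal <- pairs[pairs$dist <= width_interval[["97.5%"]], ]
```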
The output is in Tables/proximal_promoters.rds.
Computes the variance of the relative activity of the promoters in each proximal promoter pair (Tables/proximal_promoters.rds) across the different human cell types (Tables/true_tss_activ_tissue.rds).
The control is obtained by permuting the pairs of promoters.
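A minimal sketch of this analysis, on invented data, could look as follows. It assumes that relative activity means the share of the first promoter in the summed activity of the pair, and that the control shuffles the second member of every pair; the function `rel_var`, the number of permutations, and the toy matrix are hypothetical and may not match the script.

```r
set.seed(1)  # reproducible toy example

# Toy promoter x tissue activity matrix (values are invented).
expr <- matrix(
  c(10, 12, 9,   11,
     5,  6, 4.5,  5.5,
     1, 20, 3,   15,
     8,  2, 9,    1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A1", "A2", "B1", "B2"), paste0("tissue", 1:4))
)

# Toy proximal pairs, standing in for Tables/proximal_promoters.rds.
pairs <- data.frame(p1 = c("A1", "B1"), p2 = c("A2", "B2"),
                    stringsAsFactors = FALSE)

# Relative activity of the first promoter of each pair in every tissue,
# followed by its variance across tissues (hypothetical helper).
rel_var <- function(p1, p2, expr) {
  a <- expr[p1, , drop = FALSE]
  b <- expr[p2, , drop = FALSE]
  rel <- a / (a + b)
  apply(rel, 1, var)
}

observed <- rel_var(pairs$p1, pairs$p2, expr)

# Control: pair promoters at random by shuffling the second member,
# then recompute the mean variance for each permutation.
n_perm <- 100  # assumed number of permutations
control <- replicate(n_perm, {
  shuffled <- sample(pairs$p2)
  mean(rel_var(pairs$p1, shuffled, expr))
})
```

Comparing `mean(observed)` with the distribution in `control` then indicates whether true proximal pairs vary more or less than randomly assigned ones.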