RNA-seq analysis recipe
- Try a larger k for the UMAP graph to get a more connected or spread-out lineage, similar to the ATAC UMAP.
INT analysis recipe
- run UMAP on unintegrated but normalized data and ping RB
- rerun new CFA analyses
- (LD) The only thing you could add is a few words to explain that the approach of only considering promoter/TSS regions is appropriate for this analysis, but it might miss important regulatory elements in non-promoter regions. Just to make it clear that the users would probably want to also perform other analyses specifically for their ATAC modality to complement this approach.
- Weaken the PCA claims in both spilterlize and unsupervised analysis: change "exposes" to "shows".
- Add to the correlation plot: "colored by divergent genes EP red; TA blue"
- incorporate JB's comments (physical copy on my desk)
- adapt enrichment analysis claims in recipe according to paper figs
- Clarify that the input is log-transformed, implying usage of limma-trend, irrespective of what happened to the data before. See: Integrative recipe runs voom + integration in spilterlize, even though one cannot use that for log-normalized counts #67
- Create a custom TA (& EP?) database for the scCRISPR-seq recipe using Snakemake, as shown here: download resources helper script enrichment_analysis#24
Crossprediction script
- Extend the script to also save the top negative predictors (neg) and rename the positive predictors to pos. Adapt the RNA and ATAC recipes to state that neg is also provided but not shown, i.e., weakening the current limitation.
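A minimal sketch of the pos/neg split described above, assuming the crossprediction script ends up with one numeric weight (e.g. a model coefficient) per feature. The function name `top_predictors`, the dictionary input, and the feature names are hypothetical and only illustrate the idea; the actual script may store its predictors differently.

```python
from typing import Dict, List, Tuple


def top_predictors(
    weights: Dict[str, float], n: int = 10
) -> Dict[str, List[Tuple[str, float]]]:
    """Return the n strongest positive (pos) and negative (neg) predictors.

    `weights` maps feature name -> weight (assumed layout, not the
    script's confirmed data structure). Zero weights are ignored.
    """
    # Positive predictors, largest weight first.
    pos = sorted(
        ((f, w) for f, w in weights.items() if w > 0),
        key=lambda fw: fw[1],
        reverse=True,
    )
    # Negative predictors, most negative weight first.
    neg = sorted(
        ((f, w) for f, w in weights.items() if w < 0),
        key=lambda fw: fw[1],
    )
    return {"pos": pos[:n], "neg": neg[:n]}


if __name__ == "__main__":
    # Hypothetical feature weights for illustration only.
    example = {"geneA": 0.9, "geneB": -0.7, "geneC": 0.2, "geneD": -0.1, "geneE": 0.0}
    for label, feats in top_predictors(example, n=2).items():
        print(label, feats)
```

Saving both lists under explicit pos and neg labels would let the recipes state that neg is available even when only pos is plotted.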
general
- consider making a separate resource.smk that houses all resource downloads, which get triggered when required by downstream rules (or is that too complicated, because then the recipe SMKs are not self-contained anymore?)
- download of all remaining resources required for recipes using wget and rename from .txt to .gmt to make it even more self-contained and automated
#### download Enrichr databases ####
rule download_Enrichr_databases:
    output:
        "resources/Enrichr_databases/{database}.gmt",
    params:
        url = lambda w: "https://maayanlab.cloud/Enrichr/geneSetLibrary?mode=text&libraryName={}".format(w.database),
    resources:
        mem_mb="1000",
    threads: 1
    conda:
        "../envs/wget.yaml"
    log:
        "logs/wget/download_Enrichr_databases_{database}.log",
    shell:
        "wget -O {output} '{params.url}' > {log} 2>&1"
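The mapping the rule performs, from a `{database}` wildcard to a download URL and a `.gmt` target, can be sketched in plain Python for checking outside Snakemake. The helper names `enrichr_url` and `gmt_path` are hypothetical, and the library name in the example is only illustrative; the endpoint and output directory are copied from the rule above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlencode

# Endpoint taken from the rule above.
ENRICHR_ENDPOINT = "https://maayanlab.cloud/Enrichr/geneSetLibrary"


def enrichr_url(database: str) -> str:
    """Build the text-mode download URL for one Enrichr library."""
    query = urlencode({"mode": "text", "libraryName": database})
    return f"{ENRICHR_ENDPOINT}?{query}"


def gmt_path(database: str, outdir: str = "resources/Enrichr_databases") -> str:
    """Target path matching the rule's output pattern (saved directly as .gmt)."""
    return str(PurePosixPath(outdir) / f"{database}.gmt")


if __name__ == "__main__":
    # Illustrative library name; replace with the databases the recipes need.
    name = "KEGG_2021_Human"
    print(enrichr_url(name))
    print(gmt_path(name))
```

Because the rule writes straight to the `.gmt` path via `wget -O`, no separate rename from `.txt` is needed once downstream rules request `resources/Enrichr_databases/<database>.gmt` as input.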