Recipe improvements

# RNA-seq analysis recipe

- [ ] Try a larger k for the UMAP graph to get a more connected or spread-out lineage, similar to ATAC.
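
To illustrate why a larger k might help here, below is a minimal sketch on toy data, assuming that "k" refers to the neighbor count of the kNN graph underlying UMAP. The function name `knn_components` and the 1D points are hypothetical and not part of the recipe. It only shows that raising k can merge otherwise disconnected groups in the neighbor graph.

```python
def knn_components(points, k):
    """Count connected components of the symmetrized k-nearest-neighbor graph.

    Hypothetical helper for illustration; points are plain 1D coordinates.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # follow parent links to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # rank all other points by distance to point i and keep the k closest
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: abs(points[i] - points[j]),
        )
        for j in others[:k]:
            parent[find(i)] = find(j)  # an edge joins the two components

    return len({find(i) for i in range(n)})


# two well-separated toy groups (assumed data)
toy = [0, 1, 2, 10, 11, 12]
small_k = knn_components(toy, k=2)  # neighbors stay within each group
large_k = knn_components(toy, k=3)  # third neighbor reaches across the gap
```

On real data the effect depends on group sizes and distances, so the value of k would still need to be tuned by eye on the embedding.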

# INT analysis recipe

- [x]  run UMAP on unintegrated but normalized data and ping RB
- [ ]  rerun new CFA analyses
- [ ]  (LD) Add a few words explaining that considering only promoter/TSS regions is appropriate for this analysis, but might miss important regulatory elements in non-promoter regions. This should make clear that users will probably also want to perform other analyses specifically for their ATAC modality to complement this approach.
- [ ]  Weaken the PCA claims in both spilterlize and unsupervised analysis. Change "exposes" to "shows".
- [ ]  Add to the correlation plot: "colored by divergent genes EP red; TA blue"
- [ ]  incorporate JB's comments (physical copy on my desk)
- [ ]  adapt enrichment analysis claims in recipe according to paper figs
- [ ] Clarify that the input is log-transformed, which implies using limma-trend, irrespective of what was done to the data beforehand #67
- [ ] Create a custom TA (& EP?) database for the scCRISPR-seq recipe using Snakemake, as shown here: https://github.com/epigen/enrichment_analysis/issues/24

# Crossprediction script

- [ ] Extend the script to also save the top negative predictors (neg) and rename the positive predictors (pos). Adapt the RNA and ATAC recipes to state that neg is also provided but not shown, i.e., weakening the current limitation.
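
The extension could look roughly like the sketch below. It assumes the script ranks features by a signed weight per feature, which is not confirmed here. The function name `top_predictors`, the dictionary layout, and the gene names are hypothetical.

```python
def top_predictors(coefficients, n=10):
    """Split features into the top-n positive and top-n negative predictors.

    `coefficients` maps feature name -> signed weight (assumed input format).
    """
    # sort once from most positive to most negative weight
    ranked = sorted(coefficients.items(), key=lambda kv: kv[1], reverse=True)
    # strongest positive predictors first
    pos = [(name, w) for name, w in ranked if w > 0][:n]
    # strongest negative predictors first (walk the ranking from the end)
    neg = [(name, w) for name, w in reversed(ranked) if w < 0][:n]
    return {"pos": pos, "neg": neg}


# toy weights (assumed data), including a zero weight that lands in neither list
weights = {"geneA": 0.8, "geneB": -0.5, "geneC": 0.1, "geneD": -0.9, "geneE": 0.0}
result = top_predictors(weights, n=2)
```

Returning both lists under the keys `pos` and `neg` would let the recipes keep plotting only `pos` while still saving `neg` alongside it.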

# general

- [ ] Consider a separate resource.smk that houses all resource downloads, which get triggered when required by downstream rules (possibly **too complicated**, because the recipe SMKs would then no longer be self-contained)
- [ ] download of all remaining resources required for recipes using wget and rename from .txt to .gmt to make it even more self-contained and automated
```python
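#### download Enrichr databases ####
# One job per requested database; the {database} wildcard is the Enrichr library name.
rule download_Enrichr_databases:
    output:
        "resources/Enrichr_databases/{database}.gmt",
    params:
        # build the download URL from the wildcard
        url = lambda w: "https://maayanlab.cloud/Enrichr/geneSetLibrary?mode=text&libraryName={}".format(w.database),
    resources:
        # integer number of megabytes rather than a string
        mem_mb=1000,
    threads: 1
    conda:
        "../envs/wget.yaml"
    log:
        "logs/wget/download_Enrichr_databases_{database}.log",
    shell:
        # write directly to the .gmt target, so no separate rename from .txt is needed
        "wget -O {output} '{params.url}' > {log} 2>&1"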
```
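
For resources that are not fetched through this rule, the same two steps (building the URL and renaming from .txt to .gmt) can be sketched in plain Python. This is a minimal sketch: the helper names `enrichr_url` and `gmt_path` and the library name `Example_Library` are hypothetical, and only the base URL and query parameters are taken from the rule above.

```python
from pathlib import Path
from urllib.parse import urlencode

# base URL and query parameters as used in the Snakemake rule above
ENRICHR_BASE = "https://maayanlab.cloud/Enrichr/geneSetLibrary"


def enrichr_url(database):
    """Build the plain-text download URL for one Enrichr library."""
    query = urlencode({"mode": "text", "libraryName": database})
    return f"{ENRICHR_BASE}?{query}"


def gmt_path(txt_path):
    """Return the same path with the .txt suffix swapped for .gmt."""
    return Path(txt_path).with_suffix(".gmt")


# "Example_Library" is a placeholder, not a real library name
url = enrichr_url("Example_Library")
target = gmt_path("resources/Enrichr_databases/Example_Library.txt")
```

An already downloaded file could then be renamed with `Path(...).rename(target)`. Using `urlencode` also escapes library names that contain characters unsafe in a URL.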


Recipe improvements #65
