NGS Data Processing Automation Engine
Full-stack, production-grade, HPC-enabled epigenomics automation pipeline.
cccTeqy is a fully automated, modular, and high-performance pipeline designed to process ChIP-seq, CUT&RUN, and CUT&Tag datasets at scale, from raw FASTQ to publication-ready QC metrics, peak calls, bigWigs, and consolidated MultiQC reports.
This README reflects the latest full pipeline implementation, built around the main Bash script:
**run.sh**: the central engine of cccTeqy
- Full Workflow Automation: FASTQ → Peaks/QC/bigWigs/MultiQC
- Supports ChIP-seq, CUT&RUN, CUT&Tag, SE & PE
- Unified YAML-based configuration (no code editing)
- Local + HPC support: SLURM & PBS
- Advanced QC suite:
  - FastQC
  - Picard duplication metrics
  - Preseq library complexity
  - deepTools fragment profiling
  - PhantomPeak cross-correlation (NSC, RSC)
  - FRiP scoring
  - MultiQC
- Peak Calling: MACS2 (narrow/broad)
- Container-ready: Docker & Singularity configurations included
- Production documentation stack: README, CHANGELOG, WIKI
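
For readers new to the FRiP metric listed above: it is the fraction of reads in peaks, i.e. mapped reads overlapping called peaks divided by total mapped reads. The sketch below only shows the arithmetic. The function name `frip` is hypothetical, and how `run.sh` obtains the two counts is not described here.

```shell
# frip READS_IN_PEAKS TOTAL_READS
# Print the fraction of reads in peaks with four decimals,
# or "NA" when the total is zero or negative.
frip() {
  awk -v in_peaks="$1" -v total="$2" 'BEGIN {
    if (total <= 0) { print "NA"; exit }
    printf "%.4f\n", in_peaks / total
  }'
}
```

For example, `frip 250 1000` reports a quarter of the reads falling in peaks.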
cccTeqy/
│
├── run.sh                            # Main pipeline script
├── config.yaml                       # Example config
├── config.container.example.yaml     # Container-optimized config
├── samples.tsv                       # Example sample sheet
│
├── Dockerfile                        # Docker build file
├── Singularity                       # Singularity definition file
├── environment.yml                   # Conda environment
├── Makefile                          # Docker/Singularity build automation
│
├── README.md                         # Main documentation (this file)
├── CHANGELOG.md                      # Version history
└── wiki/                             # Exported GitHub Wiki pages
Install system tools if not using containers:
- bwa, samtools, bedtools
- fastqc, macs2
- picard, preseq
- deepTools: bamCoverage, bamPEFragmentSize
- Rscript, phantompeakqualtools (run_spp.R)
- multiqc
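
One way to confirm that the tools listed above are reachable before a run is a small `PATH` check. This is a sketch, not part of `run.sh`: the helper name `missing_tools` is hypothetical, `picard` is assumed to be a wrapper script on `PATH` (as conda installs it), and `run_spp.R` is skipped because it is referenced by path in the config.

```shell
# Print every name from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Executable names taken from the tool list; a bare picard.jar
# (no wrapper script) would be reported as missing.
missing=$(missing_tools bwa samtools bedtools fastqc macs2 picard preseq \
  bamCoverage bamPEFragmentSize Rscript multiqc)

if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
else
  echo "All required tools found."
fi
```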
mamba env create -f environment.yml
mamba activate cccteqy

A minimal working config.yaml:
project_name: MyProject
outdir: outputs
run_mode: local
threads: 16
bwa_index: /path/to/hg38/bwa/index
blacklist_bed: /path/to/hg38/blacklist.bed
rscript: Rscript
phantompeak_rscript: /opt/tools/run_spp.R

SAMPLE_ID FASTQ1 FASTQ2 ASSAY MARK CONTROL_ID LIBTYPE
S1 R1.fq.gz R2.fq.gz CUTTAG H3K27me3 None PE
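
A sample sheet like the one above can be sanity-checked before launching the pipeline. The sketch below is not part of `run.sh`, and its rules are assumptions drawn from the example: seven whitespace- or tab-separated columns with no empty fields (placeholders such as `None` instead), `LIBTYPE` equal to `SE` or `PE`, and unique `SAMPLE_ID` values. The function name `check_samples` is hypothetical.

```shell
# check_samples FILE
# Report rows that look malformed; the exit status is 1 if any were found.
check_samples() {
  awk '
    NR == 1 { next }   # skip the header row
    NF == 0 { next }   # ignore blank lines
    NF != 7 {
      printf "line %d: expected 7 columns, found %d\n", NR, NF
      bad = 1
      next
    }
    $7 != "SE" && $7 != "PE" {
      printf "line %d: LIBTYPE must be SE or PE, got %s\n", NR, $7
      bad = 1
    }
    seen[$1]++ {
      printf "line %d: duplicate SAMPLE_ID %s\n", NR, $1
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}
```

A possible use is `check_samples samples.tsv && ./run.sh -c config.yaml -s samples.tsv`, so the pipeline starts only when the sheet passes.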
./run.sh -c config.yaml -s samples.tsv

./run.sh -c config.yaml -s samples.tsv --run-single S1

The repository includes a BYO-config Dockerfile.
docker build -t cccteqy:latest .

docker run --rm \
-v $PWD:/work -v /data:/data -w /work \
cccteqy:latest ./run.sh -c config.yaml -s samples.tsv

More details: dockerhub-README.md
singularity build cccteqy.sif Singularity

singularity exec -B $PWD:/work -B /data:/data cccteqy.sif \
./run.sh -c config.yaml -s samples.tsv

More details: singularityhub-README.md
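
The Docker and Singularity invocations differ only in the runtime prefix, so a small helper can assemble either one. This is a sketch: `container_cmd` is a hypothetical name, the image names and bind mounts are copied from the commands in this README, and the function prints the command line instead of executing it so it can be inspected first.

```shell
# container_cmd RUNTIME [run.sh arguments...]
# Print the full command line for the chosen runtime (docker or singularity).
container_cmd() {
  runtime=$1
  shift
  case "$runtime" in
    docker)
      echo "docker run --rm -v $PWD:/work -v /data:/data -w /work cccteqy:latest ./run.sh $*"
      ;;
    singularity)
      echo "singularity exec -B $PWD:/work -B /data:/data cccteqy.sif ./run.sh $*"
      ;;
    *)
      echo "unknown runtime: $runtime" >&2
      return 1
      ;;
  esac
}
```

For example, `container_cmd singularity -c config.yaml -s samples.tsv` prints the Singularity form of the run command. Because the output is a plain string, paths containing spaces would need extra quoting before it is executed.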
Full documentation available in the GitHub Wiki, including:
- Installation
- Configuration guide
- Sample sheet guide
- QC module explanations
- Developer documentation
- Troubleshooting
Wiki pages are also included locally under wiki/.
See CHANGELOG.md.
A lightweight configuration specifically for use inside Docker/Singularity:
project_name: DemoContainerRun
outdir: /work/outputs
run_mode: local
threads: 8
bwa_index: /data/ref/hg38/bwa/hg38
blacklist_bed: /data/ref/hg38/blacklist/hg38-blacklist.v2.bed
rscript: Rscript
phantompeak_rscript: /opt/conda/bin/run_spp.R

We welcome:
- New QC modules
- New peak callers
- Workflow optimizations
- Container enhancements
- Documentation improvements
Contribute here:
https://github.com/ebareke/cccTeqy/issues
MIT License: open, reusable, extensible.