ebareke/cccTeqy

🚀 cccTeqy

NGS Data Processing Automation Engine
Full-stack, production-grade, HPC-enabled epigenomics automation pipeline.

cccTeqy is a fully automated, modular, and high-performance pipeline designed to process ChIP-seq, CUT&RUN, and CUT&Tag datasets at scale, from raw FASTQ to publication-ready QC metrics, peak calls, bigWigs, and consolidated MultiQC reports.

This README reflects the latest full pipeline implementation, built around the main Bash script:
📌 run.sh, the central engine of cccTeqy


✨ Features

  • Full Workflow Automation: FASTQ → Peaks/QC/bigWigs/MultiQC
  • Supports ChIP-seq, CUT&RUN, CUT&Tag, SE & PE
  • Unified YAML-based configuration (no code editing)
  • Local + HPC support: SLURM & PBS
  • Advanced QC suite:
    • FastQC
    • Picard duplication metrics
    • Preseq library complexity
    • deepTools fragment profiling
    • PhantomPeak cross-correlation (NSC, RSC)
    • FRiP scoring
    • MultiQC
  • Peak Calling: MACS2 (narrow/broad)
  • Container-ready: Docker & Singularity configurations included
  • Production documentation stack: README, CHANGELOG, WIKI

📁 Repository Structure

cccTeqy/
│
├── run.sh                         # Main pipeline script
├── config.yaml                    # Example config
├── config.container.example.yaml  # Container-optimized config
├── samples.tsv                    # Example sample sheet
│
├── Dockerfile                     # Docker build file
├── Singularity                    # Singularity definition file
├── environment.yml                # Conda environment
├── Makefile                       # Docker/Singularity build automation
│
├── README.md                      # Main documentation (this file)
├── CHANGELOG.md                   # Version history
└── wiki/                          # Exported GitHub Wiki pages

🔧 Requirements (Native Execution)

If you are not using the containers, install the following tools:

  • bwa, samtools, bedtools
  • fastqc, macs2
  • picard, preseq
  • deepTools: bamCoverage, bamPEFragmentSize
  • Rscript, phantompeakqualtools (run_spp.R)
  • multiqc
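Before a native run, it can help to confirm that the tools are actually on PATH. The following is a minimal sketch, not part of cccTeqy itself: `missing_tools` and `CCCTEQY_TOOLS` are hypothetical names, and the executable names are assumed from the list above (Picard, for example, is sometimes installed as a JAR rather than a `picard` wrapper).

```shell
# Report which of the given executables are missing from PATH.
# Prints one missing tool per line and returns 1 if anything is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed executable names, taken from the requirements list above.
# run_spp.R is a script rather than a PATH executable, so it is not listed.
CCCTEQY_TOOLS="bwa samtools bedtools fastqc macs2 picard preseq bamCoverage bamPEFragmentSize Rscript multiqc"

# Usage (word splitting of the list is intentional here):
#   missing_tools $CCCTEQY_TOOLS || echo "install the tools listed above" >&2
```

The function only checks that each name resolves; it does not check versions.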

Conda option

mamba env create -f environment.yml
mamba activate cccteqy

⚙️ Quick Start (Native Mode)

1. Configure the pipeline

A minimal working config.yaml:

project_name: MyProject
outdir: outputs
run_mode: local
threads: 16
bwa_index: /path/to/hg38/bwa/index
blacklist_bed: /path/to/hg38/blacklist.bed
rscript: Rscript
phantompeak_rscript: /opt/tools/run_spp.R
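A quick pre-flight check can catch a missing or empty key before a long run starts. The sketch below assumes a flat config of top-level `key: value` lines like the example above; `missing_config_keys` is a hypothetical helper, and treating these eight keys as required is an assumption based on the "minimal working" example, not on run.sh's own validation.

```shell
# Print every requested key that is absent or empty in a flat YAML config.
# Only top-level "key: value" lines are recognised; nested YAML is not parsed.
missing_config_keys() {
    config=$1
    shift
    rc=0
    for key in "$@"; do
        # A key counts as present only if a non-comment value follows the colon.
        if ! grep -Eq "^${key}:[[:space:]]*[^[:space:]#]" "$config"; then
            printf '%s\n' "$key"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, with the keys from the example above:
#   missing_config_keys config.yaml project_name outdir run_mode threads \
#       bwa_index blacklist_bed rscript phantompeak_rscript
```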

2. Prepare your sample sheet

SAMPLE_ID FASTQ1 FASTQ2 ASSAY MARK CONTROL_ID LIBTYPE
S1 R1.fq.gz R2.fq.gz CUTTAG H3K27me3 None PE
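Sample-sheet mistakes are easier to fix before the pipeline starts. The sketch below checks a sheet with the seven columns shown above. It assumes the file is tab-separated (as the .tsv extension suggests) and that LIBTYPE is either SE or PE; `check_samples` is a hypothetical helper, and its rules are illustrative rather than the checks run.sh performs.

```shell
# Check a tab-separated sample sheet: 7 columns per row, unique SAMPLE_ID,
# and LIBTYPE of SE or PE. Prints one message per problem and returns
# non-zero if any problem was found.
check_samples() {
    awk -F '\t' '
        BEGIN   { bad = 0 }
        NR == 1 { next }    # skip the header row
        NF == 0 { next }    # ignore blank lines
        NF != 7 { printf "line %d: expected 7 columns, found %d\n", NR, NF; bad = 1; next }
        seen[$1]++ { printf "line %d: duplicate SAMPLE_ID %s\n", NR, $1; bad = 1 }
        $7 != "SE" && $7 != "PE" { printf "line %d: LIBTYPE must be SE or PE, found %s\n", NR, $7; bad = 1 }
        END     { exit bad }
    ' "$1"
}

# Usage:
#   check_samples samples.tsv && ./run.sh -c config.yaml -s samples.tsv
```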

3. Run pipeline

./run.sh -c config.yaml -s samples.tsv

4. Single-sample processing

./run.sh -c config.yaml -s samples.tsv --run-single S1
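The --run-single flag also makes it possible to drive the pipeline one sample at a time. The sketch below only prints the commands, one per SAMPLE_ID, so they can be reviewed first. `sample_ids` and `per_sample_commands` are hypothetical helpers, and the sheet is assumed to be tab-separated with SAMPLE_ID in the first column.

```shell
# List SAMPLE_IDs (column 1) from a tab-separated sample sheet, skipping the header.
sample_ids() {
    awk -F '\t' 'NR > 1 && $1 != "" { print $1 }' "$1"
}

# Print one ./run.sh command per sample. Nothing is executed here.
# Paths containing spaces are not quoted in the output.
per_sample_commands() {
    config=$1
    samples=$2
    sample_ids "$samples" | while IFS= read -r id; do
        printf './run.sh -c %s -s %s --run-single %s\n' "$config" "$samples" "$id"
    done
}

# Usage:
#   per_sample_commands config.yaml samples.tsv        # review the commands
#   per_sample_commands config.yaml samples.tsv | sh   # run them in sequence
```

Printing the commands instead of running them keeps the helper free of side effects; each line could equally be passed to a scheduler's submit command.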

🐳 Docker Execution

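The repository includes a bring-your-own-config (BYO-config) Dockerfile: you supply config.yaml and samples.tsv at run time by mounting your working directory.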

Build

docker build -t cccteqy:latest .

Run

docker run --rm \
  -v "$PWD":/work -v /data:/data -w /work \
  cccteqy:latest ./run.sh -c config.yaml -s samples.tsv

More details: dockerhub-README.md


📦 Singularity Execution

Build

singularity build cccteqy.sif Singularity

Run

singularity exec -B "$PWD":/work -B /data:/data cccteqy.sif \
  ./run.sh -c config.yaml -s samples.tsv

More details: singularityhub-README.md


🧪 Workflow Overview

(Figure: cccTeqy animated workflow overview)


📚 Documentation

Full documentation is available in the GitHub Wiki, including:

  • Installation
  • Configuration guide
  • Sample sheet guide
  • QC module explanations
  • Developer documentation
  • Troubleshooting

The wiki pages are also included locally under wiki/.


🔥 Changelog

See CHANGELOG.md.


🧱 Container-Based Configuration Example

A lightweight configuration specifically for use inside Docker/Singularity:

project_name: DemoContainerRun
outdir: /work/outputs
run_mode: local
threads: 8
bwa_index: /data/ref/hg38/bwa/hg38
blacklist_bed: /data/ref/hg38/blacklist/hg38-blacklist.v2.bed
rscript: Rscript
phantompeak_rscript: /opt/conda/bin/run_spp.R
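Inside a container, a path that is not covered by a bind mount simply does not exist, so it is worth checking the config against the mounts you plan to use. The sketch below assumes the flat `key: value` layout shown above and the /work and /data mounts from the Docker and Singularity commands; `unmounted_paths` is a hypothetical helper. It skips phantompeak_rscript on the assumption that this path lives inside the image.

```shell
# Print "key: value" for each data path in the config that is not located
# under one of the given mount points. Returns 1 if any such path is found.
unmounted_paths() {
    config=$1
    shift
    rc=0
    for key in outdir bwa_index blacklist_bed; do
        # Extract the value of a top-level "key: value" line.
        value=$(sed -n "s/^${key}:[[:space:]]*//p" "$config")
        ok=0
        for mount in "$@"; do
            case $value in
                "$mount"/*) ok=1 ;;
            esac
        done
        if [ "$ok" -eq 0 ]; then
            printf '%s: %s\n' "$key" "$value"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   unmounted_paths config.container.example.yaml /work /data
```

A missing key is reported with an empty value, since an unset path cannot be under any mount.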

🤝 Contributing

We welcome:

  • New QC modules
  • New peak callers
  • Workflow optimizations
  • Container enhancements
  • Documentation improvements

Contribute here:
👉 https://github.com/ebareke/cccTeqy/issues


License

MIT License: open, reusable, extensible.