Co-accessibility network from single-cell ATAC-seq data. Python code, based on Cicero package (R).

cantinilab/Circe


CIRCE: Cis-regulatory interactions between chromatin regions

Description

This repository contains a Python package for inferring co-accessibility networks from single-cell ATAC-seq data. It uses skggm for the graphical lasso and scanpy for data processing.

It is based on the pipeline and hypotheses presented in the manuscript "Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data" by Pliner et al. (2018). The original R package, Cicero, is available here.

Installation

The package can be installed using pip:

pip install circe-py

or directly from GitHub:

pip install "git+https://github.com/cantinilab/circe.git"

Minimal example

import anndata as ad
import circe as ci

# Load the data
atac = ad.read_h5ad('atac_data.h5ad')
atac = ci.add_region_infos(atac)

# Compute the co-accessibility network
ci.compute_atac_network(atac)

# Extract the network and find CCANs modules
circe_network = ci.extract_atac_links(atac)
ccans_module = ci.find_ccans(atac)
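
Co-accessibility pipelines of this kind need a chromosome, a start, and an end for every region, since links are only considered between regions on the same chromosome. The sketch below shows one way such coordinates could be parsed from region names. It assumes names of the form `chr1_100_200`; `Region`, `parse_region`, and `midpoint_distance` are hypothetical helpers written for illustration and are not part of the circe API.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int


def parse_region(name: str, sep: str = "_") -> Region:
    """Split a region name such as 'chr1_100_200' into its parts.

    Hypothetical helper for illustration; not part of the circe API.
    """
    # rsplit keeps chromosome names that themselves contain the separator,
    # e.g. 'chrUn_KI270742v1_500_900'.
    chromosome, start, end = name.rsplit(sep, 2)
    return Region(chromosome, int(start), int(end))


def midpoint_distance(a: Region, b: Region) -> Optional[float]:
    """Distance between region midpoints, or None on different chromosomes."""
    if a.chromosome != b.chromosome:
        return None
    return abs((a.start + a.end) / 2 - (b.start + b.end) / 2)


first = parse_region("chr1_100_200")
second = parse_region("chr1_1100_1200")
print(first, second, midpoint_distance(first, second))
```

Measuring distance between midpoints is only one plausible convention; the actual distance definition used by circe or Cicero may differ.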

Visualisation

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, figsize=(20, 6))
genes_df = ci.downloads.download_genes()

ci.draw.plot_connections_genes(
    connections=atac,  # Main parameters
    genes=genes_df,
    chromosome="chr1",
    start=50_000,
    end=300_000,
    gene_spacing=30_000,
    abs_threshold=0.0,
    y_lim_top=-0.01,   # Visual parameters
    track_spacing=0.01,
    track_width=0.01,
    ax=ax
)

Comparison to Cicero R package


Differences in metacell computation may lead to different results, but the scores are near-identical when both packages are applied to the same metacells (see the comparison below). Circe should also run significantly faster than Cicero (e.g., a running time of about 5 s instead of 17 min on dataset 2).

Suggestions are welcome: this package is still a work in progress.

The results below were obtained on the same metacells, computed with the Cicero code.

All tests can be found in the circe benchmark repo

Real dataset 2 - subsample of 10x PBMC (2021)

  • Pearson correlation coefficient: 0.999958
  • Spearman correlation coefficient: 0.999911
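
For readers who want to run a similar check on their own data, here is a minimal, dependency-free sketch of how Pearson and Spearman coefficients between two score vectors can be computed. The two score lists are made-up numbers for illustration, not values from the benchmark, and the function names are this sketch's own. In practice, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` would usually be the more convenient choice.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for five region pairs, for illustration only.
scores_r = [0.10, 0.40, -0.20, 0.75, 0.00]
scores_py = [0.11, 0.39, -0.21, 0.74, 0.01]
print(f"Pearson:  {pearson(scores_r, scores_py):.4f}")
print(f"Spearman: {spearman(scores_r, scores_py):.4f}")
```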

Performance on real dataset 2:

  • Runtime: ~100x faster
  • Memory usage: ~5x less
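
As a rough illustration of how runtime and memory figures like these can be gathered for a Python function, the sketch below wraps a call with `time.perf_counter` and `tracemalloc` from the standard library. It is not the procedure used in the circe benchmark repo; `measure` and `build_squares` are hypothetical names, and the toy workload merely stands in for a real network computation.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Return (result, elapsed_seconds, peak_bytes) for one call of func."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def build_squares(n):
    """Toy workload standing in for a real computation."""
    return [i * i for i in range(n)]


result, seconds, peak_bytes = measure(build_squares, 100_000)
print(f"{len(result)} items in {seconds:.4f} s, peak {peak_bytes / 1e6:.2f} MB")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators and adds overhead of its own, so for a fair comparison against an R package an external profiler at the process level is likely a better fit.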

Coming:

  • Compute metacells.
  • Add similarity statistics on large datasets.
  • Add statistics on runtime and memory usage.
  • Implement multithreading, which should speed up computation even more.
  • Fix the random seed for reproducibility.

Usage

Circe is currently developed to work with AnnData objects. See Example1.ipynb for a simple usage example.

Citation

Trimbour Rémi (2025). Circe: Co-accessibility network from ATAC-seq data in python (based on Cicero package). Package version 0.3.6.