2dShapeSpacePortable

2dShapeSpacePortable is a structured, self-contained pipeline for quantifying the shape of cells and nuclei from 2D fluorescence microscopy images and mapping how proteins distribute within those shapes. Starting from pre-segmented masks, it encodes each cell's contour as Fourier coefficients, reduces the population to a compact set of principal shape modes, and measures per-cell protein intensity in a shape-aware coordinate system — making it possible to compare protein localizations across the cell shape space.

This portable version generalizes the research code developed by Trang Le and Will Leineweber for the Cell Shape paper: https://www.sciencedirect.com/science/article/pii/S2405471226000712?via%3Dihub

Requirements and Installation

  • Python 3.10 or higher (tested on 3.11 and 3.12)
  • The GUI (GUI.py) additionally requires python3-tk:
    • Linux: sudo apt-get install python3-tk
    • macOS: included with the standard Python installer
    • Windows: included with the standard Python installer

Setting up the virtual environment

Navigate to the directory where you want to install the project:

cd /path/to/your/working/directory

Create and activate a virtual environment:

python3 -m venv 2dShapeSpacePortable
source 2dShapeSpacePortable/bin/activate          # Linux / macOS
2dShapeSpacePortable\Scripts\activate.bat         # Windows

Place the project files inside the virtual environment directory, then install all dependencies:

pip install -r requirements.txt

Purpose

The pipeline answers a practical question in cell biology: given a population of cells, what are the characteristic shapes they adopt, and where does a protein of interest tend to sit within those shapes?

Each pipeline step addresses one part of that question:

Step 1: FFT coefficients (fftcoeff_step). The outline of each cell and its nucleus is traced and encoded as Fourier coefficients. This converts an irregular biological shape into a compact, comparable set of numbers; basic cell statistics such as area and protein intensity are recorded at the same time. Cells whose cell-to-nucleus area ratio exceeds dismiss_ratio (likely segmentation errors) are flagged and excluded from downstream steps.
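As a rough illustration of how a closed contour can be reduced to a handful of Fourier coefficients, the sketch below treats each contour point as a complex number and keeps only the lowest frequencies. This is an assumption-laden stand-in, not the pipeline's own code: the function names and the n_harmonics parameter are hypothetical, and the pipeline stores separate x/y coefficient columns rather than one complex spectrum.

```python
import numpy as np


def contour_to_fft(x, y, n_harmonics):
    """Encode a closed contour as a truncated complex Fourier spectrum.

    Hypothetical helper: keeps the DC term plus n_harmonics positive and
    n_harmonics negative frequencies, and zeroes everything else.
    """
    z = np.asarray(x, dtype=float) + 1j * np.asarray(y, dtype=float)
    spectrum = np.fft.fft(z) / len(z)
    kept = np.zeros_like(spectrum)
    kept[: n_harmonics + 1] = spectrum[: n_harmonics + 1]  # DC and positive frequencies
    kept[-n_harmonics:] = spectrum[-n_harmonics:]          # negative frequencies
    return kept


def fft_to_contour(coeffs):
    """Rebuild contour points from the (truncated) spectrum."""
    z = np.fft.ifft(coeffs * len(coeffs))
    return z.real, z.imag


# Synthetic contour: an ellipse with a small high-frequency wobble on x.
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
x = 3.0 * np.cos(t) + 0.1 * np.cos(40 * t)
y = 2.0 * np.sin(t)

coeffs = contour_to_fft(x, y, n_harmonics=4)
x_smooth, y_smooth = fft_to_contour(coeffs)
```

Truncating at four harmonics discards the wobble at frequency 40 and leaves the underlying ellipse, which mirrors the trade-off controlled by n_coeffs: fewer coefficients give a smoother, more compact description.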

Step 2: Shape modes (shapemode_step). Principal Component Analysis (PCA) is applied to the Fourier coefficients of all cells together. This identifies the main axes of variation in shape across the dataset, that is, the most common ways in which cells differ from the average cell. The result is a low-dimensional shape space in which each cell occupies a position and each principal component (PC1, PC2, …) describes an interpretable shape deformation. An average cell shape is also computed and saved.
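The idea behind the shape space can be sketched with plain NumPy: center the per-cell feature matrix, take its singular value decomposition, and read off components, scores, and explained variance. The data here is synthetic and the variable names are illustrative; the pipeline's own PCA implementation and preprocessing may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the coefficient table: 200 cells x 16 features,
# generated from two hidden factors plus a little noise.
latent = rng.normal(size=(200, 2)) * np.array([5.0, 1.0])
mixing = rng.normal(size=(2, 16))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 16))

# PCA via SVD of the centered matrix.
mean_shape = features.mean(axis=0)          # the "average cell" in feature space
centered = features - mean_shape
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

explained = S**2 / np.sum(S**2)             # fraction of variance per component
scores = centered @ Vt.T                    # position of each cell in shape space

# Feature vector of a hypothetical cell sitting +1.5 standard deviations along PC1.
sd_pc1 = scores[:, 0].std()
shape_plus = mean_shape + 1.5 * sd_pc1 * Vt[0]
```

Feeding shape_plus back through an inverse Fourier transform would give the outline drawn at that position along PC1.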

Step 3: Protein parametrization (protparam_step). For each cell, the protein fluorescence signal is sampled in a shape-aware coordinate system anchored to the cell and nuclear contours. Two modes are available:

  • rings samples the protein along concentric isocontours interpolated between the nucleus and cell membrane, producing a compact 2D intensity map.
  • warp morphs the cell image into the average cell shape using thin-plate spline warping, so all cells can be directly compared pixel-by-pixel.

Per-PC-bin averages are then computed, showing the typical protein distribution for cells of each shape.
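To make the rings idea concrete, the sketch below linearly interpolates between matched nucleus and cell contour points and reads the image at each interpolated position. It is a simplified illustration under stated assumptions: sample_rings and n_rings are hypothetical names, the contours are assumed to have the same number of points in matching order, and nearest-pixel lookup stands in for whatever interpolation the pipeline actually uses.

```python
import numpy as np


def sample_rings(image, nuc_xy, cell_xy, n_rings=5):
    """Sample an image along isocontours between two matched contours.

    nuc_xy and cell_xy are (n_points, 2) arrays of (x, y) coordinates.
    Returns an (n_rings, n_points) array: rows run from nucleus to
    membrane, columns run along the contour.
    """
    fractions = np.linspace(0.0, 1.0, n_rings)[:, None, None]
    rings = (1.0 - fractions) * nuc_xy[None] + fractions * cell_xy[None]
    # Nearest-pixel lookup, clipped to the image bounds.
    cols = np.clip(np.rint(rings[..., 0]).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(rings[..., 1]).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols]


# Synthetic test image whose intensity equals the distance from the center.
yy, xx = np.mgrid[0:65, 0:65]
image = np.hypot(xx - 32, yy - 32)

angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
nuc_xy = np.column_stack([32 + 5 * np.cos(angles), 32 + 5 * np.sin(angles)])
cell_xy = np.column_stack([32 + 20 * np.cos(angles), 32 + 20 * np.sin(angles)])

profile = sample_rings(image, nuc_xy, cell_xy, n_rings=5)
```

Because the result always has the same shape regardless of the cell's outline, maps from different cells can be averaged directly.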

Step 4: Comparison (comparison_step). If cells carry a known protein location label (e.g. cytoplasm, nucleus, vesicles), this step groups them by label and computes per-label average intensity maps within each PC bin. A Pearson correlation heatmap is then generated comparing all location labels against each other, revealing which protein patterns co-vary across the shape space.
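The comparison can be pictured as flattening each label's average map into a vector and correlating the vectors pairwise. The following is a minimal sketch with made-up maps; pearson_matrix is a hypothetical helper, and the pipeline may order labels or handle missing labels differently.

```python
import numpy as np


def pearson_matrix(avg_maps):
    """Pairwise Pearson correlation between flattened average maps.

    avg_maps maps a location label to a 2D array; all arrays must share
    one shape. Labels are sorted alphabetically for a stable ordering.
    """
    labels = sorted(avg_maps)
    flat = np.stack([avg_maps[label].ravel() for label in labels])
    return labels, np.corrcoef(flat)


# Made-up 4 x 5 maps: intensity rising towards the membrane, falling
# towards the membrane, and a rescaled copy of the rising pattern.
ramp = np.linspace(0.0, 1.0, 20).reshape(4, 5)
avg_maps = {
    "Cytoplasm": ramp,
    "Nucleus": 1.0 - ramp,
    "Vesicles": 2.0 * ramp + 1.0,
}

labels, r = pearson_matrix(avg_maps)
```

Pearson correlation ignores overall brightness and scale, so the rescaled copy correlates perfectly with the original pattern while the inverted one anticorrelates.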

Running 2dShapeSpacePortable

Activate your virtual environment, navigate to the project directory, and run either:

python GUI.py        # graphical launcher — configure and run from a form
python process.py    # command-line launcher — uses config.yaml and optional CLI flags

Input data

Each cell must be provided as three separate single-channel grayscale images:

  • a nucleus mask (binary or label image)
  • a cell mask (binary or label image)
  • a protein channel image (raw fluorescence intensity)

All images for a given cell should have the same pixel dimensions.


path_list.csv

The pipeline reads the list of cells to process from path_list.csv in the project directory. Each row describes one cell:

#image_id,nuclei_mask,cell_mask,protein,location
cell_001,input/cell_001_nuc.png,input/cell_001_cell.png,input/cell_001_prot.png,Nucleus
cell_002,input/cell_002_nuc.png,input/cell_002_cell.png,input/cell_002_prot.png,Cytoplasm

| Column | Description |
|---|---|
| image_id | Unique identifier for the cell; used as filename stem throughout all outputs |
| nuclei_mask | Path to the nucleus mask image (relative to the project directory) |
| cell_mask | Path to the cell mask image (relative to the project directory) |
| protein | Path to the protein fluorescence channel image |
| location | Subcellular location label for the protein (e.g. Nucleus, Cytoplasm). Used by the comparison step; set to any placeholder if unknown |

Lines beginning with # are treated as comments and ignored.
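For checking a path list before a run, the file can be read with the standard csv module while skipping comment lines. This is a sketch only: read_path_list is a hypothetical helper, and the column names are supplied explicitly because the header row itself starts with # and is therefore skipped as a comment.

```python
import csv
import io

COLUMNS = ["image_id", "nuclei_mask", "cell_mask", "protein", "location"]


def read_path_list(handle):
    """Return one dict per cell, ignoring blank lines and # comments."""
    data_lines = (
        line for line in handle
        if line.strip() and not line.lstrip().startswith("#")
    )
    return list(csv.DictReader(data_lines, fieldnames=COLUMNS))


# In-memory copy of the example file shown above.
example = io.StringIO(
    "#image_id,nuclei_mask,cell_mask,protein,location\n"
    "cell_001,input/cell_001_nuc.png,input/cell_001_cell.png,input/cell_001_prot.png,Nucleus\n"
    "cell_002,input/cell_002_nuc.png,input/cell_002_cell.png,input/cell_002_prot.png,Cytoplasm\n"
)
cells = read_path_list(example)
```

With a real file, pass an open handle instead, for example `open("path_list.csv", newline="")`.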


Configuration

All parameters can be set in config.yaml, overridden by CLI flags, or configured interactively via the GUI. Priority order: CLI flags > config.yaml > built-in defaults.
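The priority order amounts to a layered dictionary merge. The sketch below shows one way to express it; resolve_config is a hypothetical function, and treating None as "flag not given on the command line" is an assumption made for the example rather than a description of process.py.

```python
# A few of the documented defaults, used as the lowest-priority layer.
DEFAULTS = {"output_dir": "results", "plot": True, "seed": 0, "n_coeffs": 128}


def resolve_config(defaults, file_values, cli_values):
    """Merge settings so that CLI flags > config.yaml > built-in defaults."""
    merged = dict(defaults)
    merged.update(file_values)
    # Assumption: a CLI value of None means the flag was not supplied.
    merged.update({k: v for k, v in cli_values.items() if v is not None})
    return merged


config = resolve_config(
    DEFAULTS,
    file_values={"n_coeffs": 64, "plot": False},   # as if read from config.yaml
    cli_values={"n_coeffs": 32, "plot": None},     # as if parsed from CLI flags
)
```

Here n_coeffs ends up as 32 (CLI wins), plot as False (config.yaml wins because no flag was given), and seed and output_dir keep their defaults.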

General

| Parameter | Default | Description |
|---|---|---|
| output_dir | results | Directory where all output files are written; created if it does not exist |
| plot | True | Generate intermediate diagnostic plots alongside the main outputs |
| seed | 0 | Random seed passed to PCA for reproducibility |

FFT coefficients step

| Parameter | Default | Description |
|---|---|---|
| fftcoeff_step | True | Run the FFT coefficient extraction step |
| n_coeffs | 128 | Number of Fourier coefficients used to describe each contour; higher values capture finer shape detail at the cost of increased dimensionality |
| alignment | fft_major_axis_polarized | Contour alignment method before coefficient extraction. fft_major_axis rotates the cell to align its longest axis horizontally; fft_major_axis_polarized additionally flips the cell so the nucleus is always on the same side; fft_centroid aligns based on the centroid position only |
| dismiss_ratio | 8 | Cells with a cell-area-to-nucleus-area ratio above this threshold are excluded from the shape modes step as likely segmentation artefacts |
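The dismiss_ratio rule can be written as a one-line predicate. The sketch below is illustrative: keep_cell is a hypothetical name, the rows mimic the nuc_area and cell_area columns of fft_coeffs.csv, and rejecting cells with a zero nucleus area is an added assumption to avoid dividing by zero.

```python
def keep_cell(cell_area, nuc_area, dismiss_ratio=8):
    """Return True when the cell-to-nucleus area ratio is within the threshold."""
    if nuc_area <= 0:
        # Assumption: a missing or empty nucleus mask counts as a bad segmentation.
        return False
    return cell_area / nuc_area <= dismiss_ratio


rows = [
    {"image": "cell_001", "cell_area": 4000, "nuc_area": 1000},  # ratio 4
    {"image": "cell_002", "cell_area": 9000, "nuc_area": 1000},  # ratio 9
    {"image": "cell_003", "cell_area": 8000, "nuc_area": 1000},  # ratio 8
]
kept = [row["image"] for row in rows if keep_cell(row["cell_area"], row["nuc_area"])]
```

Only ratios strictly above the threshold are dropped, so a cell at exactly 8 is kept under the default.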

Shape modes step

| Parameter | Default | Description |
|---|---|---|
| shapemode_step | True | Run the PCA shape modes step; requires FFT coefficients output |

Protein parametrization step

| Parameter | Default | Description |
|---|---|---|
| protparam_step | True | Run the protein parametrization step; requires FFT coefficients output |
| protparam_mode | rings | Parametrization method. rings samples protein intensity along concentric isocontours between nucleus and cell membrane (fast, shape-independent). warp morphs each cell into the average cell shape using thin-plate spline warping before sampling (slower, requires shape modes output, enables direct pixel-level comparison) |

Comparison step

| Parameter | Default | Description |
|---|---|---|
| comparison_step | False | Run the location comparison step; requires protein parametrization and shape modes output, and meaningful location values in path_list.csv |

Output

All outputs are written under output_dir (default: results/).

FFT coefficients step: results/shapespace/

| File | Description |
|---|---|
| fft_coeffs.csv | One row per cell containing: image, nuc_area, cell_area, prot_int_sum_nuc, prot_int_sum_cell, theta (alignment angle), centroid_y, centroid_x, e_c (cell eccentricity), e_n (nucleus eccentricity), followed by n_coeffs × 4 Fourier coefficient columns (nucleus x/y, cell x/y) |
| {image_id}_fft_reconstruction.png (if plot=True) | Overlay of the original contour and the FFT reconstruction for visual quality control |

Shape modes step: results/shapemode/

| File | Description |
|---|---|
| Avg_cell.npz | Numpy archive with the average cell contour points (ix_n, iy_n, ix_c, iy_c) used by the warp mode and downstream steps |
| Avg_cell.jpg | Plot of the average nucleus and cell membrane contours |
| PCA_scree.jpg | Scree plot showing explained variance per PC with cumulative threshold markers |
| shapevar_PC{n}.png | Strip of 7 shape outlines showing the cell deformation along PCn from −1.5 to +1.5 standard deviations |
| shapevar_PC{n}.gif | Animated version of the shape variation strip |
| shapevar_PC{n}_hist.jpg | Histogram of per-cell PCn scores with bin boundary markers |
| shapevar_PC{n}.npz | Raw nucleus and membrane contour arrays for each variation step along PCn |
| cells_assigned_to_pc_bins.json | Dictionary mapping each PC to a list of 7 bins, each containing the image_id values of cells assigned to that bin |
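Given the structure described for cells_assigned_to_pc_bins.json, a few lines of Python are enough to count cells per bin or look up where a cell landed. The example below builds a small JSON string in memory; the "PC1" key spelling and the find_bin helper are assumptions for illustration, so check the real file for the exact key names.

```python
import json

# Hand-made stand-in for the file contents: one PC with 7 bins.
example_json = json.dumps(
    {"PC1": [["cell_001"], [], ["cell_002", "cell_003"], [], [], [], []]}
)
assignments = json.loads(example_json)

# Number of cells in each bin, per PC.
counts = {pc: [len(bin_ids) for bin_ids in bins] for pc, bins in assignments.items()}


def find_bin(assignments, pc, image_id):
    """Return the zero-based bin index holding image_id, or None if absent."""
    for idx, bin_ids in enumerate(assignments[pc]):
        if image_id in bin_ids:
            return idx
    return None


bin_of_cell_003 = find_bin(assignments, "PC1", "cell_003")
```

For the real output, replace the in-memory string with `json.load` on the file under the shape modes output directory.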

Protein parametrization step: results/protparam/

| File | Description |
|---|---|
| {image_id}_protein.npy (rings mode) | 2D array of sampled protein intensities; rows are isocontours from nucleus to membrane, columns are points along the contour |
| {image_id}_protein_interpolation.png (rings mode, plot=True) | Diagnostic image showing the rotated cell, protein channel, isocontour sampling grid, and sampled intensities |
| {image_id}_warp.png (warp mode) | Protein channel image warped into the average cell shape coordinate system |
| {image_id}_warp_plot.png (warp mode, plot=True) | Five-panel diagnostic showing the original shape, resized images, and each warping stage |
| avg/{PC}_bin{idx}_protein.npy (rings mode) | Average intensity map across all cells in PC bin idx |
| avg/{PC}_bin{idx}_protein.png (rings mode, plot=True) | Heatmap of the averaged intensity map |
| avg/{PC}_bin{idx}_warp.png (warp mode) | Average warped protein image across all cells in PC bin idx |
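The per-bin averages are, conceptually, an element-wise mean over the maps of all cells assigned to a bin. The sketch below works on in-memory arrays rather than the .npy files; average_by_bin is a hypothetical helper, and skipping empty bins or cells without a map is an assumption about edge-case handling, not documented behaviour.

```python
import numpy as np


def average_by_bin(bins, maps):
    """Average same-shaped intensity maps within each bin.

    bins is a list of lists of image_id values (one list per bin);
    maps is a dict from image_id to a 2D array. Returns a dict from bin
    index to the mean map; bins with no available maps are omitted.
    """
    averages = {}
    for idx, image_ids in enumerate(bins):
        available = [maps[i] for i in image_ids if i in maps]
        if available:
            averages[idx] = np.mean(available, axis=0)
    return averages


maps = {"a": np.ones((2, 3)), "b": 3.0 * np.ones((2, 3))}
bins = [["a", "b"], [], ["b"]]
averages = average_by_bin(bins, maps)
```

With real outputs, each entry of maps would come from `np.load` on a per-cell file in the protein parametrization directory.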

Comparison step: results/comparison/

| File | Description |
|---|---|
| avg_by_location/{PC}_bin{idx}_{location}.npy | Average protein intensity array for cells of a given location label within a PC bin |
| heatmaps/{PC}_bin{idx}_pearsonr.csv | Pairwise Pearson correlation matrix between all location labels for a given PC bin |
| heatmaps/{PC}_bin{idx}_pearsonr.png (if plot=True) | Annotated heatmap of the correlation matrix |
