TCRfoundation

Documentation Status PyPI version PyPI Downloads License: MIT Python 3.8+

A multimodal foundation model for single-cell immune profiling

Overview

TCRfoundation integrates gene expression and TCR sequences (α and β chains) from paired single-cell measurements through self-supervised pretraining with masked reconstruction and cross-modal contrastive learning.
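To make the cross-modal contrastive objective more concrete, here is a minimal sketch of a symmetric InfoNCE-style loss over paired gene-expression and TCR embeddings. It is an illustration only: the function names, the temperature value, and the use of plain Python lists are assumptions made for readability, and the loss actually used by TCRfoundation may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax_at(row, target):
    """Log-probability of index `target` under softmax(row), computed stably."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return row[target] - log_sum


def contrastive_loss(gene_emb, tcr_emb, temperature=0.1):
    """Symmetric InfoNCE: cell i's gene embedding should match cell i's TCR embedding.

    `temperature` is a hypothetical default, not a value taken from the package.
    """
    if not gene_emb or len(gene_emb) != len(tcr_emb):
        raise ValueError("need two non-empty batches of equal size")
    g = [l2_normalize(v) for v in gene_emb]
    t = [l2_normalize(v) for v in tcr_emb]
    n = len(g)
    # Cosine similarity between every gene embedding and every TCR embedding.
    sim = [[sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t] for gi in g]
    # Gene -> TCR direction: each row should peak on its diagonal entry.
    loss_g2t = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # TCR -> gene direction: same idea applied to the columns.
    loss_t2g = -sum(
        log_softmax_at([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (loss_g2t + loss_t2g)
```

With this sketch, a batch whose gene and TCR embeddings are correctly paired yields a lower loss than the same batch with the TCR embeddings shuffled, which is the signal the alignment objective trains on.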

Input and Pretraining Architecture

Gene expression profiles are encoded through feed-forward layers with multi-head attention, while TCR sequences are tokenized and processed through transformer blocks. The two modalities are fused, and the joint representation is trained with three objectives: masked gene expression reconstruction, masked TCR sequence reconstruction, and cross-modal alignment.
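The tokenization and masking steps for TCR sequences can be sketched as follows. This is a hedged illustration, not the package's tokenizer: the vocabulary layout, the special-token ids (`PAD`, `MASK`, `UNK`), the maximum length, and the 15% mask rate are all assumptions chosen for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD, MASK, UNK = 0, 1, 2              # hypothetical special-token ids
VOCAB = {aa: i + 3 for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(cdr3, max_len=30):
    """Map an amino-acid string to fixed-length integer ids, padding on the right."""
    ids = [VOCAB.get(aa, UNK) for aa in cdr3.upper()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def mask_tokens(ids, mask_rate=0.15, rng=None):
    """Replace a random subset of residues with MASK.

    Returns (masked_ids, targets). `targets` holds the original id at masked
    positions and -1 everywhere else, so a reconstruction loss can skip
    unmasked and padded positions.
    """
    rng = rng or random.Random()
    masked = list(ids)
    targets = [-1] * len(ids)
    for pos, tok in enumerate(ids):
        if tok != PAD and rng.random() < mask_rate:
            masked[pos] = MASK
            targets[pos] = tok
    return masked, targets


if __name__ == "__main__":
    ids = tokenize("CASSLGQAYEQYF", max_len=20)
    masked, targets = mask_tokens(ids, rng=random.Random(0))
    print(masked)
    print(targets)
```

The α and β chains would each go through the same steps. A masked gene expression objective follows the same pattern, with selected expression values hidden and then predicted.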

[Figure: Input and Pretraining]

Fine-tuning Tasks

The pretrained model supports three downstream applications:

  • T-cell state classification: Predict tissue origin, disease state, and cellular phenotype
  • Binding specificity detection: Identify TCR-antigen interactions and quantify binding avidity
  • Cross-modal prediction: Infer gene expression from TCR sequences

[Figure: Fine-tuning Tasks]

Installation

From PyPI (Recommended)

pip install tcrfoundation

From Source

git clone https://github.com/Liao-Xu/TCRfoundation.git
cd TCRfoundation
pip install -e .

Requirements: Python 3.8+, PyTorch 1.10.0+
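If you want to confirm that an environment meets these requirements before installing, a small standard-library check along these lines may help. The helper names `parse_version` and `check_requirements` are hypothetical and not part of TCRfoundation. The sketch compares versions as integer tuples because comparing strings would wrongly rank "1.9" above "1.10".

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.10.0+cu113' into (1, 10, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def check_requirements(min_python=(3, 8), min_torch=(1, 10, 0)):
    """Return a list of human-readable problems; an empty list means all good."""
    if sys.version_info[:2] < min_python:
        return [f"Python {min_python[0]}.{min_python[1]}+ is required"]

    # importlib.metadata is in the standard library from Python 3.8 onward,
    # so it is imported only after the interpreter version has been checked.
    from importlib import metadata

    try:
        installed = parse_version(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        return ["PyTorch is not installed"]

    if installed < min_torch:
        return [f"PyTorch {installed} is older than the required {min_torch}"]
    return []


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```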

Quick Start

import tcrfoundation as tcrf
import scanpy as sc

# Load your data
adata = sc.read("your_data.h5ad")

# Pretrain the foundation model
model, history = tcrf.pretrain.train(
    adata,
    epochs=500,
    batch_size=2048,
    save_dir='models/'
)

# Fine-tune for classification
results, adata_new = tcrf.finetune.classification.train_classifier(
    adata,
    label_column="cell_type",
    checkpoint_path="models/foundation_model_best.pt",
    num_epochs=50
)
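In the Quick Start above, fine-tuning reads `models/foundation_model_best.pt`, which appears to be the checkpoint written by pretraining into `save_dir`. A small guard like the one below can fail early with a clearer message when that file is missing. The helper `resolve_checkpoint` is hypothetical and not part of the package. The default filename is taken from the Quick Start and may differ in your setup.

```python
from pathlib import Path


def resolve_checkpoint(save_dir, filename="foundation_model_best.pt"):
    """Return the checkpoint path as a string, or raise if the file is missing."""
    path = Path(save_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"No checkpoint at {path}; run pretraining first or check save_dir."
        )
    return str(path)
```

The returned string can then be passed as `checkpoint_path`, for example `checkpoint_path=resolve_checkpoint("models/")`.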

Documentation

Tutorials

Complete Jupyter notebook tutorials are available:

  1. Pretraining - Train the foundation model
  2. Classification - T-cell state classification
  3. Specificity - Antigen specificity prediction
  4. Avidity - Binding avidity regression
  5. Cross-modal - TCR-to-gene prediction

Contact
