scGPT

This is the official codebase for scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI.

Installation

For development, we use the Poetry package manager. To install Poetry, follow the instructions here.

$ git clone this-repo-url
$ cd scGPT
$ poetry install

The flash-attn dependency usually requires a specific GPU and CUDA version. If you encounter any issues, please refer to the flash-attn repository for installation instructions.
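
If the install fails, it can help to first check what the current environment provides. The snippet below is a minimal sketch and not part of the scGPT codebase. It assumes that flash-attn is importable as `flash_attn` and PyTorch as `torch`; the helper names `is_installed` and `environment_report` are hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def environment_report() -> dict:
    """Collect a few facts that matter when installing flash-attn."""
    report = {
        "torch": is_installed("torch"),
        "flash_attn": is_installed("flash_attn"),
        "cuda_available": False,
    }
    # Only query CUDA when PyTorch is present; otherwise leave the default.
    if report["torch"]:
        import torch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False, the problem likely lies with the GPU driver or the PyTorch build rather than with flash-attn itself.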

Pretrained scGPT checkpoints

Please download the pretrained scGPT checkpoints from here.

Fine-tune scGPT for scRNA-seq integration

Please see our example code in examples/finetune_integration.py. By default, the script assumes the scGPT checkpoint folder is stored in the examples/save directory.
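
Because the script depends on that default location, a quick check before launching it can save a failed run. The following is a minimal sketch, not code from the repository: the function name `find_checkpoint_dir` is hypothetical, and it only verifies that the directory exists and is not empty, since the expected file names inside the checkpoint folder are not stated here.

```python
from pathlib import Path


def find_checkpoint_dir(save_root: str = "examples/save") -> Path:
    """Return the checkpoint directory, or raise if it is missing or empty."""
    root = Path(save_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Checkpoint directory not found: {root}")
    if not any(root.iterdir()):
        raise FileNotFoundError(f"Checkpoint directory is empty: {root}")
    return root


if __name__ == "__main__":
    # Run from the repository root so the relative default path resolves.
    print(f"Using checkpoints in {find_checkpoint_dir()}")
```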

To-do list

  • Upload the pretrained model checkpoint
  • Publish to pypi
  • Provide the pretraining code with generative attention masking
  • Finetuning examples for multi-omics integration, cell type annotation, perturbation prediction, cell generation
  • Example code for Gene Regulatory Network analysis
  • Documentation website with readthedocs
  • Bump up to pytorch 2.0
  • New pretraining on larger datasets
  • Reference mapping example

Contributing

We welcome contributions to scGPT. Please submit a pull request if you have any ideas or bug fixes. We also welcome issue reports for any problems you encounter while using scGPT.

Acknowledgements

We sincerely thank the authors of the following open-source projects:

Citing scGPT

@article{cui2023scGPT,
title={scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI},
author={Cui, Haotian and Wang, Chloe and Maan, Hassaan and Wang, Bo},
journal={bioRxiv},
year={2023},
publisher={Cold Spring Harbor Laboratory}
}