diff --git a/README.md b/README.md
index 01d64b2..e93129a 100644
--- a/README.md
+++ b/README.md
@@ -119,18 +119,33 @@ A REPIC Docker image is published on both [DockerHub](https://hub.docker.com/r/c
2. [Install and set up NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) for building and running GPU-accelerated containers
3. Pull REPIC Docker image and convert to Singularity image format (SIF) (requires >8 Gb of memory and ~40 mins for conversion):
```
-apptainer pull docker://cjfcameron/repic:main
+apptainer pull docker://cjfcameron/repic
```
If SIF file creation is taking a long time, increase the ```mksquashfs mem``` parameter in the Apptainer config file (apptainer.conf). See [here](https://apptainer.org/docs/admin/1.0/configfiles.html) for more information.
4. Run container with GPU acceleration (example [iter_pick.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/iter_pick.py) command shown below):
```
-apptainer run --nv --bind
** + +A [Jupyter Notebook](https://jupyter.org/) for installing and running REPIC on [Google Colab](https://colab.google/) is included in the REPIC GitHub repo: [repic_colab.ipynb](https://github.com/ccameron/REPIC/blob/main/repic_colab.ipynb) + +To open the notebook in Google Colab: + +1. Navigate a browser to: https://colab.google/ +2. Select "Open Colab", then "GitHub" +3. Enter the REPIC GitHub web URL: https://github.com/ccameron/REPIC.git +4. Select the "repic_colab.ipynb" Jupyter Notebook + +
**
+:warning: **WARNING**: Scipion plugin currently only contains REPIC one-shot mode
+
REPIC is available as a [Scipion](https://scipion.i2pc.es/) plugin: [https://github.com/scipion-em/scipion-em-repic](https://github.com/scipion-em/scipion-em-repic)
See [here](https://scipion-em.github.io/docs/release-3.0.0/docs/scipion-modes/how-to-install.html#installing-other-plugins) for information about installing plugins for Scipion.
@@ -188,7 +203,7 @@ repic iter_config examples/10057/ 176 224 \n",
+ "\n",
+ "## Welcome to the official REPIC Jupyter Notebook for Google Colab!\n",
+ "\n",
+ "REliable PIcking by Consensus (REPIC) is an ensemble learning approach to cryogenic-electron microscopy (cryo-EM) particle picking. It identifies particles common to multiple picked particle sets (i.e., consensus particles) using graph theory and integer linear programming.\n",
+ "\n",
+ "Development: https://github.com/ccameron/REPIC\n",
+ "\n",
+ "Documentation: https://repic.readthedocs.io/en/latest/\n",
+ "\n",
+ "\n",
+ "Below are the steps required to install and run both the one-shot and iterative modes of REPIC. Note — GPUs are recommended for the iterative mode (Runtime > Change runtime type > GPU)\n",
+ "\n",
+ "\n",
+ "If you encounter any issues with this notebook, please raise an issue in the REPIC GitHub repo: https://github.com/ccameron/REPIC/issues\n",
+ "\n",
+ "Author: Christopher JF Cameron (2023)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "T9fR5eWuEukn"
+ },
+ "source": [
+ "### Step 1 — check availability of ```conda``` command"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "1. Check the availability of the ```conda``` command"
+ ],
+ "metadata": {
+ "id": "wGrJRzHOXV9h"
+ }
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "32bKA92WGzVB"
+ },
+ "outputs": [],
+ "source": [
+ "! conda --version"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "pp88FSKJx57U"
+ },
+ "source": [
+ "If Conda is available, the output will be (may be a different version): ```conda 23.1.0 ```\n",
+ "\n",
+ "\n",
+ "If the output says ```conda: command not found```, install Conda via [condalab](https://github.com/conda-incubator/condacolab) using pip: https://pypi.org/project/condacolab/"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "OGM8LhdwBzha"
+ },
+ "outputs": [],
+ "source": [
+ "! pip install -q condacolab\n",
+ "import condacolab\n",
+ "condacolab.install()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zbn7scSq2J0Z"
+ },
+ "source": [
+ "After installation, Colab will raise a warning that the kernel has crashed (e.g., \"Your session crashed for an unknown reason.\"). This crash is expected and a result of the ```Restarting kernel...``` condalab step.\n",
+ "\n",
+ "2. Update the Conda base environment"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "eQwg-4rI2Lwa"
+ },
+ "outputs": [],
+ "source": [
+ "! conda update -n base -c conda-forge conda"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4YnZioehsqVm"
+ },
+ "source": [
+ "### Step 2 — install particle picking algorithms"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "1. Install Mamba package manager: https://github.com/mamba-org/mamba"
+ ],
+ "metadata": {
+ "id": "hhOskLGxU_LT"
+ }
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PfSR9nVlsv1x"
+ },
+ "outputs": [],
+ "source": [
+ "! conda install -c conda-forge mamba -y"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a8GJ5ozA3Fjj"
+ },
+ "source": [
+ "2. Get CUDA version of GPU using NVIDIA system manager interface (exp v12.0): https://developer.nvidia.com/nvidia-system-management-interface"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rYSjQ1XHtOlN"
+ },
+ "outputs": [],
+ "source": [
+ "! nvidia-smi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iLgIXM3K3pkH"
+ },
+ "source": [
+ "3. Get version of installed CUDA toolkit (exp v11.8)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PLV-cmEX2wIy"
+ },
+ "outputs": [],
+ "source": [
+ "! nvcc --version"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "lWo6sm2ytXwr"
+ },
+ "source": [
+ "_(Optional)_ Update the following install instructions for particle picking algorithms (Steps 4-11) below to reflect the available CUDA versions\n",
+ "\n",
+ "**SPHIRE-crYOLO install**\n",
+ "\n",
+ "4. Install SPHIRE-crYOLO: https://cryolo.readthedocs.io/en/stable/"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Kid4JYd8tZoF"
+ },
+ "outputs": [],
+ "source": [
+ "! mamba create -n cryolo -c conda-forge -c anaconda pyqt=5 python=3.7 cudatoolkit=10.0.130 cudnn=7.6.5 numpy=1.18.5 libtiff wxPython=4.1.1 adwaita-icon-theme -y\n",
+ "! source activate cryolo && pip install 'cryolo[gpu]'"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "5. Download pre-trained crYOLO LOWPASS model"
+ ],
+ "metadata": {
+ "id": "PhPnf9dmIrz4"
+ }
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "vEGY8V6kvL7l"
+ },
+ "outputs": [],
+ "source": [
+ "! wget ftp://ftp.gwdg.de/pub/misc/sphire/crYOLO-GENERAL-MODELS/gmodel_phosnet_202005_N63_c17.h5"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Zz_Go0O4wHLh"
+ },
+ "source": [
+ "**Topaz install**\n",
+ "6. Install Topaz: https://topaz-em.readthedocs.io/en/latest/?badge=latest"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "aKNx4P6ZwLf3"
+ },
+ "outputs": [],
+ "source": [
+ "! mamba create -n topaz -c tbepler -c nvidia -c pytorch pytorch torchvision torchaudio pytorch-cuda=11.8 topaz=0.2.5"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FlcvP5DRyVKc"
+ },
+ "source": [
+ "**REPIC install**\n",
+ "7. Install REPIC"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! mamba install -c bioconda repic -y"
+ ],
+ "metadata": {
+ "id": "5vliH-nvP8U2"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "8. Install Linux basic calculator (bc) package: https://linux.die.net/man/1/bc"
+ ],
+ "metadata": {
+ "id": "Pde-ahNZdATb"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! apt-get install bc"
+ ],
+ "metadata": {
+ "id": "_z8yIMSMb4nr"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "**DeepPicker install**\n",
+ "9. Clone DeepPicker GitHub repository"
+ ],
+ "metadata": {
+ "id": "SYanXcvhQT93"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! git clone https://github.com/nejyeah/DeepPicker-python.git"
+ ],
+ "metadata": {
+ "id": "p5MIguDOQS03"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "10. Apply REPIC patch to DeepPicker Python scripts"
+ ],
+ "metadata": {
+ "id": "hK6RhotWQ4oN"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! cp $(pip show repic | grep -in \"Location\" | cut -f2 -d ' ')/../../../docs/patches/deeppicker/*.py DeepPicker-python/"
+ ],
+ "metadata": {
+ "id": "nfFH2QR2Q_HG"
+ },
+ "execution_count": 17,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "11. Create DeepPicker Conda environment"
+ ],
+ "metadata": {
+ "id": "gdaQWl0eRMEB"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! mamba create -n deep -c anaconda -c conda-forge -c pytorch python=3.7 tensorflow-gpu=2.4.1 cudatoolkit=10.1.243 cudnn=7.6.5 scipy matplotlib scikit-image mrcfile pytorch torchvision"
+ ],
+ "metadata": {
+ "id": "vRkOYkN4ROr3"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "**Cleanup**\n",
+ "\n",
+ "12. Check that Conda environments can be found (cryolo, deep, topaz)"
+ ],
+ "metadata": {
+ "id": "8rc0qlcSUqRH"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! conda env list"
+ ],
+ "metadata": {
+ "id": "MaytIvvlUufU"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "13. Clean up installation files"
+ ],
+ "metadata": {
+ "id": "RKR_GupWRfDa"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! conda clean --all -y\n",
+ "! mamba clean --all -y"
+ ],
+ "metadata": {
+ "id": "KpaBw8y0Sm02"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JawwDI-OFSC_"
+ },
+ "source": [
+ "### Step 3 — run one-shot REPIC on example data _(optional)_"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-_b7a6NvFZi3"
+ },
+ "source": [
+ "1. Clone the REPIC GitHub repo and obtain example data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "SodLgQG8Fqwv"
+ },
+ "outputs": [],
+ "source": [
+ "! git clone https://github.com/ccameron/REPIC"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EZcATLbkF7Fu"
+ },
+ "source": [
+ "2. Calculate particle overlap and enumerate cliques using [get_cliques](https://github.com/ccameron/REPIC/blob/main/repic/commands/get_cliques.py).py (expected run time: <1 min)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "TzZavdHeGRyr"
+ },
+ "outputs": [],
+ "source": [
+ "! repic get_cliques /content/REPIC/examples/10017/ /content/REPIC/examples/10017/clique_files/ 180"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Ub3K1h6rG7tM"
+ },
+ "source": [
+ "3. Find optimal cliques and create consensus particle files using [run_ilp.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/run_ilp.py) (expected run time: <1 min)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "o0aGn5NwHVi1"
+ },
+ "outputs": [],
+ "source": [
+ "! repic run_ilp /content/REPIC/examples/10017/clique_files/ 180"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "X68pqVI6Hlqw"
+ },
+ "source": [
+ "A consensus particle coordinate file (in BOX format) for each micrograph will be produced in ```/content/REPIC/examples/10017/clique_files/```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Step 4 — run iterative REPIC on example data _(optional)_"
+ ],
+ "metadata": {
+ "id": "-c66wgxGVysJ"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "1. Download example data"
+ ],
+ "metadata": {
+ "id": "FHIYnBe-V5kp"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! bash $(pip show repic | grep -in \"Location\" | cut -f2 -d ' ')/repic/iterative_particle_picking/get_examples.sh /content/REPIC/examples/10057/data/ &> aws_download.log"
+ ],
+ "metadata": {
+ "id": "_iv2gzKBW9sO"
+ },
+ "execution_count": 34,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "2. Create a configuration file for iterative ensemble particle picking using [iter_config.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/iter_config.py) (expected run time: <1 min)"
+ ],
+ "metadata": {
+ "id": "STcRVFE-XwwY"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! repic iter_config /content/REPIC/examples/10057/ 176 224 /content/gmodel_phosnet_202005_N63_c17.h5 /content/DeepPicker-python 4 22"
+ ],
+ "metadata": {
+ "id": "YnApxf-LXhiO"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "3. Pick particles by iterative ensemble learning using [iter_pick.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/iter_pick.py), a wrapper of [run.sh](https://github.com/ccameron/REPIC/blob/main/repic/iterative_particle_picking/run.sh) (expected run time: 20-30 min/iteration):"
+ ],
+ "metadata": {
+ "id": "vFZkBWgeX9Tb"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! repic iter_pick /content/iter_config.json 4 100"
+ ],
+ "metadata": {
+ "id": "zvLfFeO5YHTo"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "The final set of consensus particles for the testing set will be found in: ```/content/REPIC/examples/10057/iterative_particle_picking/round_4/train_100/clique_files/test/*.box```"
+ ],
+ "metadata": {
+ "id": "G-eth_OR0wEV"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jiYc217QJgaZ"
+ },
+ "source": [
+ "### Step 5 — run REPIC on your data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "XLX_XSDYJlov"
+ },
+ "source": [
+ "**REPIC overview**\n",
+ "\n",
+ "REPIC can be run in one of two modes:\n",
+ "\n",
+ "1. One shot — find consensus particles from the output of multiple particle picking algroithms\n",
+ "2. Iterative — _ab-initio_ train an ensemble of particle picking algorithms using either one-shot REPIC output or manually picked particles\n",
+ "\n",
+ "One-shot REPIC expects particle coordinate files (in BOX format) to be stored in separate subfolders for each particle picking algorithm:\n",
+ "\n",
+ "```console\n",
+ "folder/\n",
+ " | particle_picking_algorithm_1/\n",
+ " | mrc001.box\n",
+ " | mrc002.box\n",
+ " ⋮\n",
+ " | mrc999.box\n",
+ " | particle_picking_algorithm_2/\n",
+ " | mrc001.box\n",
+ " | mrc002.box\n",
+ " ⋮\n",
+ " | mrc999.box\n",
+ " ⋮\n",
+ " | particle_picking_algorithm_n/\n",
+ " | mrc001.box\n",
+ " | mrc002.box\n",
+ " ⋮\n",
+ " | mrc999.box\n",
+ "```\n",
+ "\n",
+ "Iterative REPIC expects micrograph (MRC) and particle coordinate (if using manually picked particles) files to be found in a singular folder:\n",
+ "```console\n",
+ "folder/\n",
+ " | mrc001.mrc\n",
+ " | mrc001.box\n",
+ " | mrc002.mrc\n",
+ " | mrc002.box\n",
+ " ⋮\n",
+ " | mrc999.mrc\n",
+ " | mrc999.box\n",
+ "```\n",
+ "\n",
+ "REPIC will create training and validation subsets from files in ```folder/``` before performing _ab-initio_ model training.\n",
+ "\n",
+ "**Uploading data to Google Colab - zipped folder**\n",
+ "\n",
+ "1. Upload a zipped folder of picked particles:\n",
+ " \n",
+ " Click folder icon in left panel of Colab (\"Files\") > Upload to session storage\n",
+ "\n",
+ "2. Unzip folder:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "thh5FNOIL3_h"
+ },
+ "outputs": [],
+ "source": [
+ "! unzip