by Andreas Nold and Pietro Verzelli
NOTE: This repo is meant to reproduce the experiments in our paper [1]. For a more recent version of the FINDER algorithm, please refer to this link.
Set up a Python environment and install the dependencies, e.g. using the following commands:

```
python3 -m venv finder_env
source finder_env/bin/activate
pip install -r requirements.txt
```

This was tested using Python version 3.8.12.
Using FINDER is really simple.
To run the FINDER algorithm on your own localization data, add

```
from finder import Finder

FD = Finder()
labels = FD.fit(XC)
result_ = FD.selected_parameters
```

to your code, analogous to DBSCAN in the sklearn.cluster package. FINDER will choose global clustering parameters according to the overall noise levels / the robustness detected in the dataset.
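For instance, a minimal end-to-end sketch could look like the following (the file name `localizations.txt` and its plain two-column x/y layout are placeholders for illustration; adapt them to your own export format):

```
import numpy as np
from finder import Finder

# Load localizations as an N x 2 array of x/y coordinates
# (file name and layout are assumptions, not part of the package).
XC = np.loadtxt("localizations.txt")

FD = Finder()
labels = FD.fit(XC)                # one cluster label per localization
params = FD.selected_parameters    # the global parameters FINDER settled on

print("selected parameters:", params)
print("number of labelled points:", len(labels))
```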
We created two notebooks to guide you through it:
- `synthetic_data.ipynb` guides you through the creation of synthetic datasets (based on true recordings!) in which you can control the level of noise and arrange the clusters in various forms. The dataset is then clustered using FINDER. A stand-alone toy version of this idea is sketched below.
- `real_data.ipynb` applies FINDER to a true recording, for which the ground truth is not known.
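As a rough illustration (not the notebook's actual procedure, which starts from true recordings), one can build a toy dataset of tight Gaussian clusters plus uniform background noise and hand it to FINDER:

```
import numpy as np
from sklearn.datasets import make_blobs
from finder import Finder

# A few tight clusters plus uniform background "noise" localizations.
clusters, _ = make_blobs(n_samples=600, centers=8, cluster_std=0.05,
                         center_box=(0.0, 10.0), random_state=0)
noise = np.random.default_rng(0).uniform(0.0, 10.0, size=(400, 2))
XC = np.vstack([clusters, noise])

FD = Finder()
labels = FD.fit(XC)
print("distinct labels found:", len(np.unique(labels)))
```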
The FINDER app allows you to apply FINDER to your data directly.
Make sure you have installed all the required packages (as explained above, see Installing required packages).
To use the app, simply navigate with your shell to the directory `./app` inside the `./Finder` folder.
Then type in the terminal:

```
python3 app.py
```

The app should launch automatically; if it does not, simply click on the link displayed in the terminal and open it.
Using the app, you can browse your computer for the data you want to cluster and select the parameter you want to use.
FINDER will be applied to your data and the labels will be returned. For the app to work properly, your data must be saved in a text file.
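The exact file layout the app expects is not spelled out here, but assuming a plain whitespace-separated table of x/y coordinates, you could export your localizations like this:

```
import numpy as np

# Stand-in for your real localization coordinates (e.g. in nm).
XC = np.random.rand(1000, 2) * 1000.0

# Write a plain text file that can then be selected in the app.
np.savetxt("my_localizations.txt", XC)
```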
- To reproduce the figures of the manuscript "Unbiased choice of global clustering parameters for single-molecule localization microscopy", run the respective Python files in the `ProduceFigures/` subfolder. This will analyse the precomputed clustering results in the `../Data_Figures/` folder. NOTE: The working directory is assumed to be `Code/`. To run the files from the command line, change `sys.path.append("Modules/")` to `sys.path.append("../Modules/")`.
- To re-compute clustering results:
  - Copy the parameter file ending in `_Parameters.json` from the respective subfolder of `../Data_Figures/` (e.g. `../Data_Figures/Results_Fig3/Results_3mers_Parameters.json`) into the `../Data_Figures/Input` folder. Make sure only one file is in the folder.
  - Then go to the `Modules` folder and run `ComputeSeries.py` or `Modules/RunAll.py`. `Modules/RunAll.py` processes all input files in the Input folder. A folder named after the date of the computation will be created in the `../Data_Figures/` folder (e.g. `Results_2022_01_19_15_31_31_0`).
  - To test this, leave the file `Fig3_a_3mers.json` as the only file in the input folder, then run `python3 ComputeSeries.py`. This re-computes the clustering results shown in Fig. 3a. (A scripted version of these steps is sketched after this list.)
  - Runtime: the example `Fig3_a_3mers.json` runs in <2 minutes on a local machine (2.3 GHz Quad-Core Intel Core i5, 8 GB 2133 MHz LPDDR3). Running the `*.json` files from `Results_Fig4` takes >1 hr per file. The localization source data for Figure 5 is not included in the repository, but processed files can be found in the respective folders.
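A minimal scripted sketch of the copy-and-run steps above (paths taken from those steps; run it from the `Modules` folder and adjust paths to your checkout):

```
import shutil
import subprocess

# Copy the parameter file into the input folder (it should be the only file there).
shutil.copy("../Data_Figures/Results_Fig3/Results_3mers_Parameters.json",
            "../Data_Figures/Input/")

# Run the series computation; results are written to a dated folder under ../Data_Figures/.
subprocess.run(["python3", "ComputeSeries.py"], check=True)
```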
NOTE: For the purpose of reproducing the results in the manuscript, the files include the CAML code and pre-trained models published on GitLab; see also Williamson et al., "Machine learning for cluster analysis of localization microscopy data", Nat. Commun. (2020).
- We first segment the image obtained from the localizations into two regions: a low-density region (outcell) and a high-density region (incell). We then select part of the image and perform cluster analysis with DBSCAN or DBSCANLoop over a full range of clustering parameters. For a given dataset, localizations are read from the file `XC.hdf5` located in the Input folder (e.g. `TTX_24hr_2/Input/XC.hdf5`). Results are saved in the files `Output/X_incell.txt` and `Output/XC_outcell.hdf5`. (A small sketch for inspecting such an input file follows after this list.)
  - To do this, run `Code/dash-split-cell/app.py`, select parameters and save the split.
  - Alternatively, run `Analysis/analysis_1_DefineInOutCell.py` with the respective subfolder of `Data_AnalysisOrganized`, and with adapted parameters if necessary. Parameters are loaded from the file `Input/parameters_splitInOutCell.json`.
- To define a ROI and run the clustering analysis, run `Analysis/analysis_2_Clustering.py`.
- To analyze the clustering results, run either `Dash/dash-show-clustering` or `Analysis/analysis_3_plotting.ipynb`.
- An alternative, exploratory analysis is given in `Analysis/FigY1_Exploration_SingleDataset.ipynb`, where a square window is analyzed. This is not optimized to work with the folders in `Data_AnalysisOrganized`, and runs with a precomputed example in `Data_Other/MikeData/Analysis_dataWindow_1`.
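As mentioned above, here is a minimal sketch for inspecting a localization file such as `TTX_24hr_2/Input/XC.hdf5`. The internal layout of the HDF5 file is not documented in this README, so the snippet only lists the datasets it contains and loads the first one; adapt the key to your file:

```
import h5py
import numpy as np

with h5py.File("TTX_24hr_2/Input/XC.hdf5", "r") as f:
    print("available datasets:", list(f.keys()))
    # Assumption: the localization coordinates sit in the first dataset.
    XC = np.asarray(f[list(f.keys())[0]])

print("array shape:", XC.shape)
```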
If you use this code, please consider citing our paper:
```
@article{verzelli2022unbiased,
  title={Unbiased choice of global clustering parameters for single-molecule localization microscopy},
  author={Verzelli, Pietro and Nold, Andreas and Sun, Chao and Heilemann, Mike and Schuman, Erin M and Tchumatchenko, Tatjana},
  journal={Scientific Reports},
  volume={12},
  number={1},
  pages={22561},
  year={2022},
  publisher={Nature Publishing Group UK London}
}
```