File Descriptions

Gabcomolet edited this page Aug 2, 2024 · 4 revisions

This page describes the files included in the ScaleFEx Git repository.

Main Directory

Local-Specific Files

  1. scalefex_main.py: Contains the main class for running the ScaleFEx pipeline with user-specified parameters (see Other Files: parameters.yaml)
  2. scalefex_utils.py: Contains helper functions for running the main pipeline, including:
    • a class that parallelizes feature extraction tasks via multiprocessing,
    • a parameter-validation function to ensure the user's parameters are formatted correctly,
    • a helper function to import local modules
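As a rough illustration of the first two helper roles listed above, here is a minimal sketch of a parameter check and a multiprocessing-based fan-out. It is not the code in scalefex_utils.py: the parameter keys (`n_workers`, `channels`) and the function names are hypothetical, and the "feature" is a stand-in mean.

```python
import multiprocessing


def validate_parameters(params):
    """Return a list of problems found in a parameters dict (empty if valid)."""
    problems = []
    # Hypothetical keys, chosen for illustration only.
    n_workers = params.get("n_workers")
    if (not isinstance(n_workers, int) or isinstance(n_workers, bool)
            or n_workers < 1):
        problems.append("n_workers must be a positive integer")
    channels = params.get("channels")
    if not isinstance(channels, list) or not channels:
        problems.append("channels must be a non-empty list")
    elif not all(isinstance(name, str) for name in channels):
        problems.append("every channel name must be a string")
    return problems


def toy_feature(crop):
    """Stand-in for per-cell feature extraction: mean of a flat pixel list."""
    return sum(crop) / len(crop)


def extract_in_parallel(crops, n_workers):
    """Apply toy_feature to every crop using a pool of worker processes."""
    with multiprocessing.Pool(processes=n_workers) as pool:
        return pool.map(toy_feature, crops)
```

When trying this out, call `extract_in_parallel` from under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.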

Testing via pytest:

  1. tests/parameters_test.yaml: parameters file used for unit testing on the provided sample dataset
  2. tests/test_scalefex_main.py: tests for main pipeline class in scalefex_main.py
  3. tests/test_scalefex_utils.py: tests for helper functions in scalefex_utils.py

AWS-Specific Files

  1. AWS_scalefex_main.py: Contains the main class for running the master ScaleFEx pipeline on AWS
  2. AWS_scalefex_extraction.py: Contains the main class for running the ScaleFEx pipeline on each worker machine on AWS
  3. AWS_requirements.txt: Lists the Python libraries to be installed on each virtual machine on AWS

Templates/:

  1. CloudFormationS3ExecutionPolicy.json: Permissions JSON for the AWS ScaleFEx user
  2. Infrastructure.yaml: CloudFormation template that, when deployed, builds the security infrastructure used by the machines
  3. ScaleFEx_init.yaml: CloudFormation template to deploy in order to run ScaleFEx

Other Files

  1. parameters.yaml: Parameters file for running ScaleFEx (to be modified by the user)
  2. pytest.ini: Testing initialization file
  3. setup.py: script for installing main ScaleFEx package via pip
  4. __init__.py: required file for package installation
  5. .gitignore: file detailing local files to ignore in cloned repository
  6. sample_data/: testing data for ensuring the pipeline runs as expected, sample outputs, and intermediate outputs saved for testing

Modules (separated by function)

Each of these module directories contains:

  1. README.md: description of functionality of code within directory
  2. setup.py: script for installing module via pip
  3. __init__.py: required file for local namespace package installation
  4. tests/: unit tests and testing data for each function implemented in the main .py files

This directory contains code for segmenting nuclei from the stained nuclei channel (e.g., DAPI, Hoechst)

  1. nuclei_location_extraction.py: contains functions for computing a binary segmentation mask of nuclei and extracting their locations within each image
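To make "binary segmentation mask" and "locations" concrete, the sketch below thresholds a nuclei-channel image, groups touching foreground pixels into objects, and reports one centroid per object. This is an assumption-laden toy, not the method used in nuclei_location_extraction.py: the fixed threshold, the 4-connectivity rule, and the function name `nuclei_centroids` are all chosen for illustration.

```python
from collections import deque

import numpy as np


def nuclei_centroids(image, threshold):
    """Return (mask, centroids) for pixels brighter than a fixed threshold.

    Centroids are (row, col) tuples, one per 4-connected foreground object,
    listed in row-major order of each object's first pixel.
    """
    mask = np.asarray(image) > threshold
    visited = np.zeros(mask.shape, dtype=bool)
    rows, cols = mask.shape
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or visited[r, c]:
                continue
            # Breadth-first flood fill collects one connected object.
            queue = deque([(r, c)])
            visited[r, c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            ys, xs = zip(*pixels)
            centroids.append((sum(ys) / len(ys), sum(xs) / len(xs)))
    return mask, centroids


# Tiny synthetic image with two bright "nuclei".
demo = np.zeros((6, 6))
demo[1:3, 1:3] = 10
demo[4, 4] = 10
demo_mask, demo_centroids = nuclei_centroids(demo, threshold=5)
```

Real nuclei images usually need a data-driven threshold and some handling of touching nuclei, which this toy leaves out.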

This directory contains code for extracting ScaleFEx features from a single-cell, multi-channel image crop

  1. compute_ScaleFEx.py: contains class for extracting all ScaleFEx features from a single-cell, multi-channel image crop
  2. compute_measurements_functions.py: contains individual functions for extracting various types of features and aggregating them into a usable output for the class in compute_ScaleFEx.py

This directory contains code for querying data from a local directory or an AWS S3 bucket, depending on the user's specification

  1. query_functions_local.py: functions for querying data stored locally, as well as helper functions for loading and preprocessing images
  2. query_functions_AWS.py: functions for querying data stored on an S3 bucket, as well as helper functions for loading and preprocessing images

This directory contains code for computing site-level image quality metrics to help users assess potential issues with their dataset (e.g., blur, exposure, etc.)

  1. compute_global_values.py: contains code for:
    • computing image statistics for quality assessment of images (e.g., SNR, blur, intensity, neurite length)
    • computing skeletons of an image with neurites (to record "neural_len" QC metric)
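For readers unfamiliar with this kind of quality metric, here is a small sketch of two common ones: variance of the Laplacian as a sharpness (inverse blur) score, and mean over standard deviation as a crude SNR. These definitions and function names are assumptions for illustration; compute_global_values.py may define its metrics differently.

```python
import numpy as np


def laplacian_variance(image):
    """Sharpness score: variance of a 4-neighbour Laplacian on interior pixels.

    Low values suggest a blurry or featureless image.
    """
    img = np.asarray(image, dtype=float)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())


def simple_snr(image):
    """Crude signal-to-noise ratio: mean intensity over standard deviation."""
    img = np.asarray(image, dtype=float)
    std = img.std()
    return float(img.mean() / std) if std > 0 else float("inf")


# A flat image has no edges; a checkerboard has as many as possible.
flat = np.full((8, 8), 3.0)
checker = np.indices((8, 8)).sum(axis=0) % 2
```

On these toy inputs the flat image scores zero sharpness, while the checkerboard scores well above it.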

More Details

More detailed descriptions of these modules can be found in their individual README.md files.