Add BottomUp model pipeline #52

Merged
merged 25 commits into from
Jun 26, 2024
Changes from 16 commits
25 changes: 15 additions & 10 deletions docs/config.md
@@ -15,7 +15,7 @@ The config file has four main sections:

- `data_config`:
- `provider`: (str) Provider class to read the input sleap files. Only "LabelsReader" supported for the training pipeline.
- `pipeline`: (str) Pipeline for training data. One of "TopdownConfmaps", "SingleInstanceConfmaps" or "CentroidConfmapsPipeline".
- `pipeline`: (str) Pipeline for training data. One of "TopdownConfmaps", "SingleInstanceConfmaps", "CentroidConfmapsPipeline" or "BottomUp".
- `train`:
- `labels_path`: (str) Path to `.slp` files
- `is_rgb`: (bool) True if the image has 3 channels (RGB image). If input has only one
@@ -32,6 +32,7 @@ The config file has four main sections:
- `anchor_ind`: (int) Index of the anchor node to use as the anchor point. If None, the midpoint of the bounding box of all visible instance points will be used as the anchor. The bounding box midpoint will also be used if the anchor part is specified but not visible in the instance. Setting a reliable anchor point can significantly improve topdown model accuracy as they benefit from a consistent geometry of the body parts relative to the center of the image.
- `crop_hw`: (List[int]) Crop height and width of each instance (h, w) for centered-instance model.
- `conf_map_gen`: (Dict[float]) Dictionary in the format {"sigma": 1.5, "output_stride": 2}. *sigma* defines the spread of the Gaussian distribution of the confidence maps as a scalar float. Smaller values are more precise but may be difficult to learn as they have a lower density within the image space. Larger values are easier to learn but are less precise with respect to the peak coordinate. This spread is in units of pixels of the model input image, i.e., the image resolution after any input scaling is applied. *output_stride* defines the stride of the output confidence maps relative to the input image. This is the reciprocal of the resolution, e.g., an output stride of 2 results in confidence maps that are 0.5x the size of the input. Increasing this value can considerably speed up model performance and decrease memory requirements, at the cost of decreased spatial resolution.
- `pafs_gen`: (Dict[float]) **Note**: Only for the BottomUp model. The structure is the same as `preprocessing.conf_map_gen`.
- `augmentation_config`:
- `random crop`: (Dict[float]) {"random_crop_p": None, "random_crop_hw": None}, where *random_crop_p* is the probability of applying random crop and *random_crop_hw* is the desired output size (out_h, out_w) of the crop. Must be Tuple[int, int], then out_h = size[0], out_w = size[1].
- `use_augmentations`: (bool) True if the data augmentation should be applied to the data, else False.
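
To illustrate how the `sigma` and `output_stride` fields of `conf_map_gen` described above interact, here is a minimal pure-Python sketch. The function name `make_confmap` and the exact sampling grid are assumptions made for illustration and are not taken from this PR; the sketch only assumes an unnormalized Gaussian evaluated on input-image coordinates sampled every `output_stride` pixels.

```python
import math


def make_confmap(point_xy, height, width, sigma, output_stride):
    """Return a 2D Gaussian confidence map (list of rows) for one keypoint.

    Hypothetical helper: coordinates are in input-image pixels, and the map
    is sampled every `output_stride` pixels, so it is 1/output_stride the
    size of the input along each axis.
    """
    px, py = point_xy
    xs = range(0, width, output_stride)
    ys = range(0, height, output_stride)
    return [
        [math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma**2)) for x in xs]
        for y in ys
    ]


confmap = make_confmap((8.0, 4.0), height=16, width=16, sigma=1.5, output_stride=2)
print(len(confmap), len(confmap[0]))  # 8 8
print(confmap[2][4])  # 1.0 (grid cell that coincides with the keypoint)
```

A smaller `sigma` makes the values around the peak fall off faster, which is the "more precise but harder to learn" trade-off the docs mention.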
@@ -119,14 +120,15 @@ The config file has four main sections:
convolutions for upsampling. Interpolation is faster but transposed
convolutions may be able to learn richer or more complex upsampling to
recover details from higher scales. Default: True.
- `head_configs`
- `head_type`: (str) Name of the head. Supported values are 'SingleInstanceConfmapsHead', 'CentroidConfmapsHead', 'CenteredInstanceConfmapsHead', 'MultiInstanceConfmapsHead', 'PartAffinityFieldsHead', 'ClassMapsHead', 'ClassVectorsHead', 'OffsetRefinementHead'
- `head_config`:
- `part_names`: (List[str]) Text name of the body parts (nodes) that the head will be configured to produce. The number of parts determines the number of channels in the output. If not specified, all body parts in the skeleton will be used.
- `anchor_part`: (int) Index of the anchor node to use as the anchor point. If None, the midpoint of the bounding box of all visible instance points will be used as the anchor. The bounding box midpoint will also be used if the anchor part is specified but not visible in the instance. Setting a reliable anchor point can significantly improve topdown model accuracy as they benefit from a consistent geometry of the body parts relative to the center of the image.
- `sigma`: (float) Spread of the Gaussian distribution of the confidence maps as a scalar float. Smaller values are more precise but may be difficult to learn as they have a lower density within the image space. Larger values are easier to learn but are less precise with respect to the peak coordinate. This spread is in units of pixels of the model input image, i.e., the image resolution after any input scaling is applied.
- `output_stride`: (float) The stride of the output confidence maps relative to the input image. This is the reciprocal of the resolution, e.g., an output stride of 2 results in confidence maps that are 0.5x the size of the input. Increasing this value can considerably speed up model performance and decrease memory requirements, at the cost of decreased spatial resolution.
- `loss_weight`: (float) Scalar float used to weigh the loss term for this head during training. Increase this to encourage the optimization to focus on improving this specific output in multi-head models.
- `head_configs`: (List[dict]) List of heads in the model. For example, the BottomUp model has both 'MultiInstanceConfmapsHead' and 'PartAffinityFieldsHead' heads.
- `head_type`: (str) Name of the head. Supported values are 'SingleInstanceConfmapsHead', 'CentroidConfmapsHead', 'CenteredInstanceConfmapsHead', 'MultiInstanceConfmapsHead', 'PartAffinityFieldsHead', 'ClassMapsHead', 'ClassVectorsHead', 'OffsetRefinementHead'
- `head_config`:
- `part_names`: (List[str]) Text name of the body parts (nodes) that the head will be configured to produce. The number of parts determines the number of channels in the output. If not specified, all body parts in the skeleton will be used. This config does not apply to 'PartAffinityFieldsHead'.
- `edges`: (List[str]) **Note**: Only for 'PartAffinityFieldsHead'. List of indices `(src, dest)` that form an edge.
- `anchor_part`: (int) **Note**: Only for 'CenteredInstanceConfmapsHead'. Index of the anchor node to use as the anchor point. If None, the midpoint of the bounding box of all visible instance points will be used as the anchor. The bounding box midpoint will also be used if the anchor part is specified but not visible in the instance. Setting a reliable anchor point can significantly improve topdown model accuracy as they benefit from a consistent geometry of the body parts relative to the center of the image.
- `sigma`: (float) Spread of the Gaussian distribution of the confidence maps as a scalar float. Smaller values are more precise but may be difficult to learn as they have a lower density within the image space. Larger values are easier to learn but are less precise with respect to the peak coordinate. This spread is in units of pixels of the model input image, i.e., the image resolution after any input scaling is applied.
- `output_stride`: (float) The stride of the output confidence maps relative to the input image. This is the reciprocal of the resolution, e.g., an output stride of 2 results in confidence maps that are 0.5x the size of the input. Increasing this value can considerably speed up model performance and decrease memory requirements, at the cost of decreased spatial resolution.
- `loss_weight`: (float) Scalar float used to weigh the loss term for this head during training. Increase this to encourage the optimization to focus on improving this specific output in multi-head models.
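
As a rough illustration of what a 'PartAffinityFieldsHead' is trained to produce for one entry in `edges`, the sketch below builds a part affinity field for a single `(src, dest)` edge. This is an assumption-laden sketch, not the code in this PR: the helper name `make_paf` is hypothetical, and it assumes the field is the edge's unit vector weighted by a Gaussian of the distance to the edge segment, with `sigma` and `output_stride` playing the same roles as for confidence maps.

```python
import math


def make_paf(src_xy, dst_xy, height, width, sigma, output_stride):
    """Return (paf_x, paf_y) grids for a single edge from src to dst.

    Hypothetical helper; src and dst must be distinct points.
    """
    (sx, sy), (dx, dy) = src_xy, dst_xy
    ex, ey = dx - sx, dy - sy
    length_sq = ex * ex + ey * ey
    length = math.sqrt(length_sq)
    ux, uy = ex / length, ey / length  # unit vector along the edge

    paf_x, paf_y = [], []
    for y in range(0, height, output_stride):
        row_x, row_y = [], []
        for x in range(0, width, output_stride):
            # Project the grid point onto the segment, clamped to its ends.
            t = ((x - sx) * ex + (y - sy) * ey) / length_sq
            t = min(1.0, max(0.0, t))
            dist_sq = (x - (sx + t * ex)) ** 2 + (y - (sy + t * ey)) ** 2
            # Gaussian falloff with distance from the edge segment.
            weight = math.exp(-dist_sq / (2 * sigma**2))
            row_x.append(weight * ux)
            row_y.append(weight * uy)
        paf_x.append(row_x)
        paf_y.append(row_y)
    return paf_x, paf_y


paf_x, paf_y = make_paf((4, 8), (12, 8), height=16, width=16, sigma=2.0, output_stride=4)
print(paf_x[2][2], paf_y[2][2])  # 1.0 0.0 (grid point lying on the edge)
```

Each edge contributes two output channels (x and y components), which is why this head is configured with `edges` rather than `part_names`.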

- `trainer_config`:
- `train_data_loader`:
@@ -202,7 +204,10 @@ The config file has four main sections:
- `anchor_ind`: (int) Index of the anchor node to use as the anchor point. If None, the midpoint of the bounding box of all visible instance points will be used as the anchor. The bounding box midpoint will also be used if the anchor part is specified but not visible in the instance. Setting a reliable anchor point can significantly improve topdown model accuracy as they benefit from a consistent geometry of the body parts relative to the center of the image.
- `crop_hw`: (List[int]) Crop height and width of each instance (h, w) for centered-instance model.
- `output_stride`: (int) Stride of the output confidence maps relative to the input image. This is the reciprocal of the resolution, e.g., an output stride of 2 results in confidence maps that are 0.5x the size of the input. Increasing this value can considerably speed up model performance and decrease memory requirements, at the cost of decreased spatial resolution.
- `pafs_output_stride`: (int) Stride of the output part affinity fields relative to the input image.
- `peak_threshold`: `float` between 0 and 1. Minimum confidence threshold. Peaks with values below this will be ignored.
- `integral_refinement`: If `None`, returns the grid-aligned peaks with no refinement. If `"integral"`, peaks will be refined with integral regression.
- `integral_patch_size`: Size of patches to crop around each rough peak as an integer scalar.
- `return_confmaps`: If `True`, predicted confidence maps will be returned along with the predicted peak values and points.
- `return_confmaps`: If `True`, predicted confidence maps will be returned along with the predicted peak values and points.
- `return_pafs`: If `True`, predicted part affinity fields will be returned along with the predicted peak values and points.
- `return_paf_graph`: If `True`, the part affinity field graph will be returned together with the predicted instances.
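
The inference options `peak_threshold`, `integral_refinement` and `integral_patch_size` listed above can be sketched as follows. This is a simplified pure-Python illustration under stated assumptions, not the implementation in this repository: `find_peaks` and `integral_refine` are hypothetical names, peaks are taken as strict local maxima in a 3x3 neighborhood, and integral refinement is approximated by the confidence-weighted centroid of the patch around the grid-aligned peak.

```python
def find_peaks(confmap, peak_threshold):
    """Return (row, col, value) for strict local maxima at or above the threshold."""
    peaks = []
    h, w = len(confmap), len(confmap[0])
    for r in range(h):
        for c in range(w):
            v = confmap[r][c]
            if v < peak_threshold:
                continue  # peaks below the threshold are ignored
            neighbors = [
                confmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbors):
                peaks.append((r, c, v))
    return peaks


def integral_refine(confmap, r, c, patch_size):
    """Refine a grid-aligned peak using the weighted centroid of its patch.

    Assumes the patch has a positive total, which holds for any peak that
    passed a positive threshold.
    """
    half = patch_size // 2
    h, w = len(confmap), len(confmap[0])
    total = sum_r = sum_c = 0.0
    for rr in range(max(0, r - half), min(h, r + half + 1)):
        for cc in range(max(0, c - half), min(w, c + half + 1)):
            v = confmap[rr][cc]
            total += v
            sum_r += v * rr
            sum_c += v * cc
    return sum_r / total, sum_c / total


cm = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.1, 0.0],
    [0.0, 0.2, 0.8, 0.4, 0.0],
    [0.0, 0.1, 0.2, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
peaks = find_peaks(cm, peak_threshold=0.3)
print(peaks)  # [(2, 2, 0.8)]
refined = integral_refine(cm, 2, 2, patch_size=3)
print(refined)  # row stays near 2.0, column shifts slightly toward the 0.4 cell
```

Refined coordinates are in confidence-map units; multiplying by `output_stride` would map them back to input-image pixels.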
270 changes: 270 additions & 0 deletions docs/config_bottomup.yaml
@@ -0,0 +1,270 @@
data_config:
  provider: LabelsReader
  pipeline: BottomUp
  train:
    labels_path: minimal_instance.pkg.slp
    max_width: null
    max_height: null
    scale: 1.0
    is_rgb: false
    preprocessing:
      anchor_ind: 0
      crop_hw:
      - 160
      - 160
      conf_map_gen:
        sigma: 1.5
        output_stride: 2
      pafs_gen:
        sigma: 50
        output_stride: 4
    augmentation_config:
      random_crop:
        random_crop_p: 0
        random_crop_hw:
        - 160
        - 160
      use_augmentations: true
      augmentations:
        intensity:
          uniform_noise:
          - 0.0
          - 0.04
          uniform_noise_p: 0
          gaussian_noise_mean: 0.02
          gaussian_noise_std: 0.004
          gaussian_noise_p: 0
          contrast:
          - 0.5
          - 2.0
          contrast_p: 0
          brightness: 0.0
          brightness_p: 0
        geometric:
          rotation: 180.0
          scale: 0
          translate:
          - 0
          - 0
          affine_p: 0.5
          erase_scale:
          - 0.0001
          - 0.01
          erase_ratio:
          - 1
          - 1
          erase_p: 0
          mixup_lambda: null
          mixup_p: 0
  val:
    labels_path: minimal_instance.pkg.slp
    max_width: null
    max_height: null
    is_rgb: false
    scale: 1.0
    preprocessing:
      anchor_ind: 0
      crop_hw:
      - 160
      - 160
      conf_map_gen:
        sigma: 1.5
        output_stride: 2
      pafs_gen:
        sigma: 50
        output_stride: 4
    augmentation_config:
      random_crop:
        random_crop_p: 0
        random_crop_hw:
        - 160
        - 160
      use_augmentations: false
      augmentations:
        intensity:
          uniform_noise:
          - 0.0
          - 0.04
          uniform_noise_p: 0
          gaussian_noise_mean: 0.02
          gaussian_noise_std: 0.004
          gaussian_noise_p: 0
          contrast:
          - 0.5
          - 2.0
          contrast_p: 0
          brightness: 0.0
          brightness_p: 0
        geometric:
          rotation: 180.0
          scale: 0
          translate:
          - 0
          - 0
          affine_p: 0.5
          erase_scale:
          - 0.0001
          - 0.01
          erase_ratio:
          - 1
          - 1
          erase_p: 0
          mixup_lambda: null
          mixup_p: 0
model_config:
  init_weights: xavier
  pre_trained_weights: null
  backbone_config:
    backbone_type: unet
    backbone_config:
      in_channels: 1
      kernel_size: 3
      filters: 16
      filters_rate: 2
      max_stride: 16
      convs_per_block: 2
      stacks: 1
      stem_stride: null
      middle_block: true
      up_interpolate: true
      output_strides:
      - [2, 4]
      block_contraction: false

  # pre_trained_weights: ConvNeXt_Tiny_Weights
  # backbone_config:
  #   backbone_type: convnext
  #   backbone_config:
  #     in_channels: 1
  #     model_type: tiny
  #     arch:
  #     kernel_size: 3
  #     filters_rate: 2
  #     convs_per_block: 2
  #     up_interpolate: True
  #     output_strides: [2, 4]
  #     stem_patch_kernel: 4
  #     stem_patch_stride: 2

  # pre_trained_weights: Swin_T_Weights
  # backbone_config:
  #   backbone_type: swint
  #   backbone_config:
  #     in_channels: 1
  #     model_type: tiny
  #     arch:
  #     patch_size: [4,4]
  #     window_size: [7,7]
  #     kernel_size: 3
  #     filters_rate: 2
  #     convs_per_block: 2
  #     up_interpolate: True
  #     output_strides: [2, 4]
  #     stem_patch_stride: 2

  head_configs:
  - head_type: MultiInstanceConfmapsHead
    head_config:
      part_names:
      - '0'
      - '1'
      sigma: 1.5
      output_stride: 2
      loss_weight: 1.0
  - head_type: PartAffinityFieldsHead
    head_config:
      edges:
      - - '0'
        - '1'
      sigma: 50
      output_stride: 4
      loss_weight: 1.0
trainer_config:
  train_data_loader:
    batch_size: 4
    shuffle: true
    num_workers: 2
    pin_memory: true
    drop_last: false
  val_data_loader:
    batch_size: 4
    shuffle: false
    num_workers: 2
    pin_memory: true
    drop_last: false
  model_ckpt:
    save_top_k: 1
    save_last: true
    monitor: val_loss
    mode: min
    auto_insert_metric_name: false
  early_stopping:
    stop_training_on_plateau: true
    min_delta: 1.0e-08
    patience: 20
  device: cpu
  trainer_devices: 1
  trainer_accelerator: cpu
  enable_progress_bar: false
  steps_per_epoch: null
  max_epochs: 50
  seed: 1000
  use_wandb: false
  save_ckpt: true
  save_ckpt_path: min_inst_bottomup1
  wandb:
    entity: team-ucsd
    project: test_centroid_centered
    name: fly_unet_centered
    wandb_mode: ''
    api_key: ''
    log_params:
    - trainer_config.optimizer_name
    - trainer_config.optimizer.amsgrad
    - trainer_config.optimizer.lr
    - model_config.backbone_config.backbone_type
    - model_config.init_weights
  optimizer_name: Adam
  optimizer:
    lr: 0.0001
    amsgrad: false
  lr_scheduler:
    threshold: 1.0e-07
    cooldown: 3
    patience: 5
    factor: 0.5
    min_lr: 1.0e-08
inference_config:
  device: cpu
  data:
    path: ./tests/assets/minimal_instance.pkg.slp
    max_instances: 6
    max_width: null
    max_height: null
    is_rgb: false
    scale: 1.0
    provider: LabelsReader
    data_loader:
      batch_size: 4
      shuffle: false
      num_workers: 2
      pin_memory: true
      drop_last: false
    video_loader:
      batch_size: 4
      queue_maxsize: 8
      start_idx: 0
      end_idx: 100
    preprocessing:
      anchor_ind: 0
      crop_hw:
      - 160
      - 160
      output_stride: 2
      pafs_output_stride: 4
  peak_threshold: 0.3
  integral_refinement: integral
  integral_patch_size: 5
  return_confmaps: false
  return_pafs: false
  return_paf_graph: false
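
In the example config above, each head repeats the `sigma` and `output_stride` values of the matching generator section (`conf_map_gen` or `pafs_gen`). Assuming these are meant to agree, a small consistency check could look like the sketch below. The helper `check_bottomup_config`, the `GEN_KEY_FOR_HEAD` mapping, the nesting of the dictionary, and the flat `output_strides` list are all assumptions for illustration and are not part of this PR.

```python
# Only the keys relevant to the check are reproduced here.
config = {
    "data_config": {
        "train": {
            "preprocessing": {
                "conf_map_gen": {"sigma": 1.5, "output_stride": 2},
                "pafs_gen": {"sigma": 50, "output_stride": 4},
            }
        }
    },
    "model_config": {
        "backbone_config": {"backbone_config": {"output_strides": [2, 4]}},
        "head_configs": [
            {
                "head_type": "MultiInstanceConfmapsHead",
                "head_config": {"sigma": 1.5, "output_stride": 2},
            },
            {
                "head_type": "PartAffinityFieldsHead",
                "head_config": {"sigma": 50, "output_stride": 4},
            },
        ],
    },
}

# Hypothetical mapping from head type to its generator section.
GEN_KEY_FOR_HEAD = {
    "MultiInstanceConfmapsHead": "conf_map_gen",
    "PartAffinityFieldsHead": "pafs_gen",
}


def check_bottomup_config(cfg):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    prep = cfg["data_config"]["train"]["preprocessing"]
    model = cfg["model_config"]
    backbone_strides = model["backbone_config"]["backbone_config"]["output_strides"]
    for head in model["head_configs"]:
        head_type, head_cfg = head["head_type"], head["head_config"]
        gen = prep[GEN_KEY_FOR_HEAD[head_type]]
        for key in ("sigma", "output_stride"):
            if head_cfg[key] != gen[key]:
                problems.append(f"{head_type}: {key} differs from data_config")
        if head_cfg["output_stride"] not in backbone_strides:
            problems.append(f"{head_type}: stride not produced by backbone")
    return problems


problems = check_bottomup_config(config)
print(problems)  # []
```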
8 changes: 4 additions & 4 deletions docs/config_centroid.yaml
@@ -5,7 +5,7 @@ data_config:
labels_path: "minimal_instance.pkg.slp"
max_width:
max_height:
scale: 1.0
scale: 0.5
is_rgb: False
preprocessing:
anchor_ind: 0
@@ -58,7 +58,7 @@ data_config:
max_width:
max_height:
is_rgb: False
scale: 1.0
scale: 0.5
preprocessing:
anchor_ind: 0
crop_hw:
@@ -226,7 +226,7 @@ inference_config:
max_width:
max_height:
is_rgb: False
scale: 1.0
scale: 0.5
provider: LabelsReader
data_loader:
batch_size: 4
@@ -245,7 +245,7 @@ inference_config:
- 160
- 160
output_stride: 2
peak_threshold: 0.0
peak_threshold: 0.5
integral_refinement: integral
integral_patch_size: 5
return_confmaps: false
9 changes: 9 additions & 0 deletions sleap_nn/architectures/model.py
@@ -132,6 +132,8 @@ def __init__(
            head = get_head(head_config.head_type, head_config.head_config)
            self.heads.append(head)

        min_output_stride = min(backbone_config.backbone_config.output_strides)
        strides = self.backbone.dec.current_strides
        self.head_layers = nn.ModuleList([])
        for head in self.heads:
            in_channels = int(
@@ -141,6 +143,13 @@ def __init__(
                    ** len(self.backbone.dec.decoder_stack)
                )
            )
            if head.output_stride != min_output_stride:
                factor = strides.index(min_output_stride) - strides.index(
                    head.output_stride
                )
                in_channels = in_channels * (
                    self.backbone_config.backbone_config.filters_rate**factor
                )
            self.head_layers.append(head.make_head(x_in=int(in_channels)))

    @classmethod
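
The channel adjustment added in this hunk can be checked with a small standalone calculation. The sketch below mirrors the arithmetic only; `head_in_channels` is a hypothetical helper, and the ordering of `strides` (coarsest to finest, as a decoder would produce them) and the base channel count of 32 are assumptions, since `self.backbone.dec.current_strides` is not shown in the diff.

```python
def head_in_channels(base_channels, strides, min_output_stride, head_stride, filters_rate):
    """Scale the finest-stride channel count for a head at a coarser stride."""
    in_channels = base_channels
    if head_stride != min_output_stride:
        # Number of decoder steps between the head's stride and the finest one.
        factor = strides.index(min_output_stride) - strides.index(head_stride)
        in_channels = in_channels * filters_rate**factor
    return int(in_channels)


strides = [16, 8, 4, 2]  # assumed order: coarsest first, finest last
print(head_in_channels(32, strides, 2, 2, 2))  # 32 (confidence-map head at stride 2)
print(head_in_channels(32, strides, 2, 4, 2))  # 64 (PAF head at stride 4)
```

Under these assumptions, a head attached one decoder step earlier sees `filters_rate` times as many input channels, which matches the multiplication by `filters_rate**factor` in the diff.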