Introduce Keypoint Models: YoloXPose & RTMO #120
Merged
- Renamed the internal activation function utility from `_get_activation_fn` to `get_activation_fn` for clarity and consistency.
- Updated all references to the renamed function across the codebase to ensure proper functionality.
- Enhanced the `get_activation_fn` function to accept a default activation parameter, improving flexibility in activation function selection.

- Introduced a new DarkNet backbone with configurable sizes and activation functions.
- Implemented Bottleneck and C2f modules for efficient feature extraction.
- Updated test cases to include DarkNet configurations, ensuring comprehensive testing across all backbone types.
- Enhanced the backbone build function to support the new DarkNet model.

- Updated numpy to version 2.2.6
- Updated torch to version 2.7.1
- Updated onnx-related dependencies to versions 1.18.0 and 0.3.1
- Added DarkNet and DarkNetConfig to BackboneManager and ConfigBackboneManager respectively
- Introduced tensorrt-cu12 and tensorrt-cu12-libs dependencies for enhanced GPU support

These changes impact the model management system by expanding the available backbone options, thus improving the overall functionality and user experience.
This was linked to issues Jul 11, 2025
- Replaced private attributes (`_out_features`, `_out_feature_strides`, `_out_feature_channels`) with public counterparts (`out_features`, `out_feature_strides`, `out_feature_channels`) across various backbone implementations.
- Updated the `output_shape` method to utilize the new public attributes for consistency.
- Introduced a new SPPF and C2f layer in the block module for enhanced feature processing.
- Added a test case to validate the `output_shape` property across backbone types, ensuring correct shape specifications.

- Added YOLOXPoseConfig class for model configuration, including parameters for backbone, keypoints, and normalization.
- Introduced KeypointCriterion class to handle various loss functions specific to keypoint detection, including Binary Cross Entropy, IoU, and OKS losses.
- Developed YOLOXPose class for the model architecture, integrating backbone, pixel decoder, and head for keypoint prediction.
- Created supporting data classes for keypoint targets and model outputs to streamline data handling.
- Implemented utility functions for bounding box operations and non-maximum suppression to enhance model performance.
- Established a comprehensive structure for the YOLOXPose model, improving modularity and maintainability.
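
For readers unfamiliar with the OKS term mentioned in the KeypointCriterion commit, here is a minimal sketch of Object Keypoint Similarity following the COCO keypoint evaluation definition. The function names `oks` and `oks_loss`, the argument layout, and the `1 - OKS` loss form are assumptions for illustration; they are not taken from this PR's code.

```python
import math

def oks(pred, gt, visible, sigmas, area, eps=1e-9):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt : lists of (x, y) keypoint coordinates, same length and order
    visible  : list of bools, True where the ground-truth keypoint is labeled
    sigmas   : per-keypoint falloff constants (COCO publishes one per joint)
    area     : ground-truth object area in pixels, used as the scale term
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), vis, sigma in zip(pred, gt, visible, sigmas):
        if not vis:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2      # squared pixel distance
        k2 = (2.0 * sigma) ** 2                   # COCO uses k = 2 * sigma
        total += math.exp(-d2 / (2.0 * (area + eps) * k2))
        count += 1
    return total / count if count else 0.0

def oks_loss(pred, gt, visible, sigmas, area):
    # Assumed loss form: perfect overlap gives 0, no overlap approaches 1.
    return 1.0 - oks(pred, gt, visible, sigmas, area)
```

A prediction that matches the ground truth exactly yields an OKS of 1.0, and the score decays smoothly as keypoints drift, with larger objects tolerating larger pixel errors.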
- Added KEYPOINT detection type to the Task enum in ports.py.
- Introduced keypoints attribute in FocoosDet class for keypoint detection outputs.
- Updated Instances class to include areas for better instance representation.
- Created RTMOConfig and RTMO classes for the new RTMO model architecture, integrating keypoint detection features.
- Implemented utility classes and methods for handling keypoint targets and outputs, improving data management.
- Enhanced YOLOXPose model with additional configurations for keypoint processing, ensuring compatibility with the new RTMO model.

Improve user experience by integrating keypoint detection capabilities into the dataset management system. This allows users to work with keypoint tasks seamlessly, enhancing the overall functionality of the framework.

Key changes include:
- Added `DatasetSplitType` to the `__init__.py` for better dataset organization.
- Introduced `YOLOXPOSE` as a new model type in the `ModelFamily` enum.
- Enhanced `DatasetMetadata` to support keypoint tasks by validating `thing_classes`.
- Implemented `KeypointDatasetMapper` for handling keypoint data transformations.
- Updated the dataset catalog to include a new dataset for COCO keypoints.
- Modified dataset loading logic to accommodate keypoint tasks.

These changes impact the dataset loading and processing pipeline, ensuring that keypoint annotations are handled correctly and efficiently.
…oseProcessor for improved clarity
… for clarity on feature processing flow
Improve the efficiency of image tensor conversion by introducing a new method `get_torch_batch` that handles various input formats and allows for optional resizing. This change enhances memory management and ensures consistent tensor shapes for processing.

Key changes:
- Replace `get_tensors` with `get_torch_batch` across multiple processor classes.
- Implement optional resizing of images to a target size for better memory efficiency.

Impact: These changes affect all processor classes that handle image inputs, ensuring they can now efficiently process images of varying sizes while maintaining performance.

Technical details: The new `get_torch_batch` method standardizes input handling and includes resizing capabilities. It uses `torch.nn.functional.interpolate` for resizing, which optimizes memory usage during batch processing.
- Added support for the CSP backbone in the RTMO model, replacing the previous C2fDarkNet configuration.
- Updated model configuration to include new parameters for transformer and CSP layers.
- Introduced a new RTMOProcessor for improved image processing and tensor handling.
- Added comprehensive tests for the RTMO model to ensure functionality across various configurations.

These changes enhance the model's performance and flexibility, allowing for better feature extraction and processing capabilities.

…rations
- Removed C2fDarkNet and related configurations from the BackboneManager and ConfigBackboneManager.
- Updated keypoint training augmentations to enable cropping and scaling.
- Added a new weights URI for the RTMO model configuration.
- Deleted obsolete RTMO model files and refactored the model structure for improved clarity and maintainability.

These changes streamline the model architecture and enhance the configuration management for better performance and usability.

…plementation
- Updated CSP references in BackboneManager and ConfigBackboneManager to use "csp_darknet" for consistency.
- Introduced a new CSPDarknet class with comprehensive architecture and configuration settings.
- Adjusted imports in the RTMO model files to reflect the new CSPDarknet structure.
- Updated test configurations to include the new CSPDarknet model.

These changes enhance clarity and maintainability of the backbone architecture, ensuring a more coherent integration of the CSPDarknet model.

- Enhanced the model registry with new RTMO configurations for large, medium, and small models, including updated metrics and weights URIs.
…ations
- Adjusted the NMS threshold from 0.7 to 0.65 in the large model configuration.
- Updated the score threshold from 0.1 to 0.01 in both medium and small model configurations.
- Enhanced the decoder.py to ensure consistent tensor concatenation using `dim=1` instead of `axis=1`.
- Added a new method in the decoder for export mode, improving deployment capabilities.
- Refined the RTMOHead and DCC classes with clearer documentation and improved parameter handling.

These changes aim to make the model exportable in ONNX format.
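
To illustrate what the two thresholds in this commit control, the following is a minimal pure-Python sketch of score filtering followed by greedy NMS. It is not the decoder's implementation: the names `iou` and `filter_and_nms` are hypothetical, and the convention that a box is suppressed when its IoU with a kept box is strictly greater than the NMS threshold is an assumption (it matches the behavior documented for `torchvision.ops.nms`).

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def filter_and_nms(boxes, scores, score_thr=0.01, nms_thr=0.65):
    """Return indices of kept boxes, highest score first."""
    # Step 1: drop candidates below the score threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_thr]
    # Step 2: greedy NMS, visiting candidates from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        # Keep the box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thr for j in keep):
            keep.append(i)
    return keep
```

Lowering the score threshold admits more low-confidence candidates into NMS, while lowering the NMS threshold suppresses overlapping boxes more aggressively.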
- Enhanced the RTMO model documentation to reflect the new HybridEncoder design.
- Updated model configurations to replace the default backbone with CSPConfig and added new parameters for transformer layers.
- Expanded the available RTMO models, including updated metrics and FPS for small, medium, and large configurations.
- Improved example usage in documentation for clarity on model inference and configuration.

These changes aim to enhance the model's performance and usability in multi-person pose estimation tasks.

…urations
- Added detailed latency metrics for RTMO models, including FPS, execution engines, and performance statistics for different configurations.
- Introduced a new `keypoints_threshold` parameter in the inference method to filter keypoints based on confidence scores.
- Updated the `.gitignore` file to exclude additional debug and IDE files.
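
As a rough illustration of confidence-based keypoint filtering, the sketch below drops keypoints whose score falls under a threshold. The helper name `filter_keypoints`, the `(x, y, score)` tuple layout, the default of 0.5, and the choice to replace filtered keypoints with `None` are all assumptions; the PR does not show how the inference method represents filtered keypoints.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence), assumed layout

def filter_keypoints(
    keypoints: List[Keypoint],
    keypoints_threshold: float = 0.5,  # assumed default, not from the PR
) -> List[Optional[Tuple[float, float]]]:
    """Keep (x, y) for confident keypoints and None for the rest.

    The output keeps the same length and order as the input so that
    joint indices (for example, for skeleton drawing) stay aligned.
    """
    return [
        (x, y) if score >= keypoints_threshold else None
        for x, y, score in keypoints
    ]
```

Keeping the list length fixed means a downstream visualizer can skip `None` entries without losing track of which joint each position refers to.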
- Removed YOLOXPOSE from the ModelFamily enum.
- Added task and model family validation in the new_model method, with warnings for unsupported models.
- Updated the FocoosTrainer to handle cases where model creation fails, ensuring the sync to hub is disabled if the model is not created.
- Expanded the test suite to include additional RTMO model configurations.

These changes improve the robustness of model handling in the Focoos platform and enhance user feedback during model creation.
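
The validation flow described in this commit could look roughly like the sketch below. Everything here is hypothetical and only mirrors the commit text: the enum members, the `SUPPORTED_TASKS` table, the `new_model` signature and return type, and the `resolve_sync_to_hub` helper are illustrative stand-ins, not the Focoos API.

```python
import warnings
from enum import Enum
from typing import Optional

class Task(str, Enum):          # hypothetical subset of tasks
    DETECTION = "detection"
    KEYPOINT = "keypoint"

class ModelFamily(str, Enum):   # hypothetical subset of families
    RTMO = "rtmo"

# Assumed support table: which tasks each family can be created for.
SUPPORTED_TASKS = {ModelFamily.RTMO: {Task.KEYPOINT}}

def new_model(name: str, task: Task, family: ModelFamily) -> Optional[dict]:
    """Return a model record, or None (with a warning) if unsupported."""
    if task not in SUPPORTED_TASKS.get(family, set()):
        warnings.warn(
            f"Model family '{family.value}' does not support task "
            f"'{task.value}'; no model was created."
        )
        return None
    return {"name": name, "task": task, "family": family}

def resolve_sync_to_hub(requested: bool, remote_model: Optional[dict]) -> bool:
    """Syncing only makes sense when a model was actually created."""
    return requested and remote_model is not None
```

Returning `None` instead of raising lets the trainer continue locally and simply turn off syncing when model creation was refused.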
- Deleted the SPPF, C2f, and Bottleneck classes from block.py to streamline the architecture.
- Removed the ConvNormLayerDarknet class from conv.py, simplifying the convolutional layer structure.

These changes aim to enhance code clarity and maintainability by eliminating unused components.
Key Changes
✨ Introduce keypoints models
example:
📷 Unified Inference API
Standardize infer Method Signatures
- `infer(image, threshold=0.5, annotate=False)` and use a unified image loader for infer methods (remote images are also supported)

example torch and exported model:
example with remote inference:
Enhanced FocoosDetections Structure
Improved serialization and memory usage
⌨️ CLI
- `focoos gradio` to launch a Gradio interface for image and video inference using Focoos pretrained models.

🕹️ Trainer
- `amp=True` (Automatic Mixed Precision) is enabled
- `COSINE` scheduler quadratic warmup
- `KeypointEvaluator`
- `Visualizer` (preview hook) to save RGB images instead of BGR

📖 ModelRegistry
🏞️ Processor
- `image_size` into init instead of preprocess methods

📖 Docs
- `from focoos import x` for all exported classes and functions instead of absolute path
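
Regarding the Trainer item on the `COSINE` scheduler with quadratic warmup, the sketch below shows one common way to combine the two: the learning rate rises with the square of the warmup progress, then follows a half-cosine decay. The function name `lr_at` and all of its parameters are hypothetical, and the exact formula used by the trainer may differ.

```python
import math

def lr_at(step, total_steps, warmup_steps, base_lr,
          min_lr=0.0, warmup_start_lr=0.0):
    """Learning rate at a given step (0-based), assuming:

    - quadratic warmup from warmup_start_lr to base_lr over warmup_steps
    - cosine decay from base_lr to min_lr over the remaining steps
    """
    if warmup_steps > 0 and step < warmup_steps:
        frac = step / warmup_steps
        # Quadratic ramp: slow at first, faster near the end of warmup.
        return warmup_start_lr + (base_lr - warmup_start_lr) * frac ** 2
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half-cosine: 1.0 at progress 0, 0.0 at progress 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine
```

Compared with linear warmup, the quadratic ramp keeps the learning rate smaller during the earliest steps, which is often when freshly initialized heads are least stable.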