Train an RTMPose model on your own dataset with a custom keypoint schema. This workspace includes a complete example for 18 keypoints (ALF pose data), from raw YOLO labels through training and inference.
Annotate your data: Use Keypoints Studio — a desktop GUI for building YOLO pose datasets (custom keypoint schemas, bbox + keypoint editing, YOLO-pose .txt export).
| Item | Value |
|---|---|
| Model | RTMPose-M (CSPNeXt backbone) |
| Keypoints | 18 (custom schema) |
| Input size | 256×192 (H×W, default) or 448×448 |
| Epochs | 200 (stage-2 pipeline from epoch 180) |
| Data format | COCO JSON (top-down, single person per image) |
Pipeline summary
Raw images + YOLO labels → COCO JSON + folder layout → Train RTMPose → Inference
mmpose/
├── DataPreprocessing/
│   ├── custom18_keypoints.py        # Keypoint names, skeleton, count
│   └── prepare_dataset.py           # YOLO → COCO conversion script
├── configs/
│   ├── _base_/datasets/custom18.py  # MMPose dataset metainfo
│   └── body_2d_keypoint/rtmpose/custom/
│       ├── rtmpose-m_8xb64-200e_custom18-256x192.py
│       └── rtmpose-m_8xb64-200e_custom18-448x448.py
├── data/custom_pose/                # Prepared dataset (created by script)
├── tools/train.py                   # Training entry point
├── demo/                            # Inference demos
└── work_dirs/                       # Checkpoints and logs
- Windows 10/11 (64-bit)
- NVIDIA GPU + driver (CUDA recommended)
- Python 3.10 via Conda
conda create -n rtmpose python=3.10 -y
conda activate rtmpose
python -m pip install -U pip
# PyTorch (CUDA 11.8)
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
# Core dependencies (pinned for stability)
pip install "numpy==1.26.4" "setuptools<81" "opencv-python==4.11.0.86"
pip install -U openmim
pip install mmengine==0.10.7 mmcv==2.1.0 mmdet==3.3.0
pip install xtcocotools==1.14.3 --no-build-isolation --no-cache-dir
pip install albumentations==1.4.24 charset-normalizer
# Install this repo
cd path\to\mmpose
pip install -e .

Note: Keep `numpy<2`. Upgrading NumPy to 2.x breaks `xtcocotools` and other compiled extensions.
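
A quick way to confirm the pin held after installation is to check the installed NumPy version from Python. This is a minimal sketch; the helper names `numpy_pin_ok` and `installed_numpy_ok` are illustrative and not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version


def numpy_pin_ok(version_string):
    """Return True when a version string such as '1.26.4' satisfies numpy<2."""
    major = int(version_string.split(".")[0])
    return major < 2


def installed_numpy_ok():
    """Check the installed NumPy; return None when NumPy is not installed."""
    try:
        return numpy_pin_ok(version("numpy"))
    except PackageNotFoundError:
        return None
```

Calling `installed_numpy_ok()` inside the `rtmpose` environment should return `True` if the pinned install above succeeded.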
Edit DataPreprocessing/custom18_keypoints.py to match your annotation schema.
| Field | Purpose |
|---|---|
| `NUM_KEYPOINTS` | Total keypoint count |
| `KEYPOINT_NAMES` | Ordered list of keypoint names (index = annotation order) |
| `SKELETON_LINKS` | Pairs of keypoint indices for visualization and metainfo |
Example (18 keypoints):
right_hand, left_hand, right_elbow, left_elbow,
right_shoulder, left_shoulder, Trachea, nose, forehead,
right_hip, left_hip, sacrum,
right_knee, left_knee, right_ankle, left_ankle,
right_footindex, left_footindex
The same definition must be reflected in configs/_base_/datasets/custom18.py (MMPose metainfo: names, colors, skeleton, swap pairs for horizontal flip).
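
As a rough illustration, a schema module with the three fields above might look like the sketch below. The keypoint order is taken from the example list; the skeleton links and the `swap_pairs` helper are assumptions made for illustration and may not match the actual `custom18_keypoints.py`.

```python
# Sketch of a keypoint schema module (layout assumed, not copied from the repo).
KEYPOINT_NAMES = [
    "right_hand", "left_hand", "right_elbow", "left_elbow",
    "right_shoulder", "left_shoulder", "Trachea", "nose", "forehead",
    "right_hip", "left_hip", "sacrum",
    "right_knee", "left_knee", "right_ankle", "left_ankle",
    "right_footindex", "left_footindex",
]
NUM_KEYPOINTS = len(KEYPOINT_NAMES)

# Hypothetical skeleton: each pair indexes into KEYPOINT_NAMES.
SKELETON_LINKS = [
    (0, 2), (2, 4),                 # right hand -> elbow -> shoulder
    (1, 3), (3, 5),                 # left hand -> elbow -> shoulder
    (4, 6), (5, 6),                 # shoulders -> Trachea
    (6, 7), (7, 8),                 # Trachea -> nose -> forehead
    (6, 11),                        # Trachea -> sacrum
    (9, 11), (10, 11),              # hips -> sacrum
    (9, 12), (12, 14), (14, 16),    # right leg
    (10, 13), (13, 15), (15, 17),   # left leg
]


def swap_pairs(names):
    """Pair each 'right_x' with its 'left_x' partner for horizontal flips."""
    index = {name: i for i, name in enumerate(names)}
    pairs = []
    for name in names:
        if name.startswith("right_"):
            partner = "left_" + name[len("right_"):]
            if partner in index:
                pairs.append((index[name], index[partner]))
    return pairs
```

Deriving the swap pairs from the names keeps the flip metadata in sync when keypoints are reordered, provided the `left_`/`right_` naming convention is followed.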
Label images and export YOLO-pose .txt files using Keypoints Studio:
- Custom keypoint schemas (add / remove / reorder points)
- Bounding box + keypoint editing with preview
- Export YOLO-pose labels alongside images
Place the exported folders under your source directory (default: alf_data/).
Place images and labels under a source folder (default: alf_data/):
alf_data/
└── <sequence_name>/
    ├── image001.jpg
    ├── image001.txt
    └── ...
Each .txt line is one person:
class x_center y_center width height (kpt_x kpt_y visibility) × N
- Coordinates are normalized (0–1) relative to image size.
- `visibility`: 0 = not labeled, 1 = occluded, 2 = visible.
- Detection-only lines (5 values, no keypoints) are skipped.
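
The label format above can be read with a few lines of Python. The following is a minimal sketch of a parser for one label line; `parse_yolo_pose_line` is a hypothetical helper and is not necessarily how `prepare_dataset.py` does it.

```python
def parse_yolo_pose_line(line, num_keypoints):
    """Parse one YOLO-pose label line into (class_id, bbox, keypoints).

    Returns None for detection-only lines (5 values), which carry no keypoints.
    """
    values = line.split()
    if len(values) == 5:
        return None
    expected = 5 + 3 * num_keypoints
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    class_id = int(values[0])
    # Normalized x_center, y_center, width, height.
    bbox = tuple(float(v) for v in values[1:5])
    flat = values[5:]
    # Each keypoint is (x, y, visibility); visibility is 0, 1, or 2.
    keypoints = [
        (float(flat[i]), float(flat[i + 1]), int(float(flat[i + 2])))
        for i in range(0, len(flat), 3)
    ]
    return class_id, bbox, keypoints
```

Raising on an unexpected value count surfaces schema mismatches early, for example labels exported with a different number of keypoints.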
python DataPreprocessing/prepare_dataset.py --src alf_data --out data/custom_pose --val-ratio 0.1

Output:
data/custom_pose/
├── images/train/
├── images/val/
├── labels/train/
├── labels/val/
├── annotations/custom_train.json
├── annotations/custom_val.json
└── dataset_stats.json
Use --dry-run to scan and print statistics without writing files.
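
The core of a YOLO to COCO conversion is turning normalized, center-based boxes into pixel-space `[x_min, y_min, width, height]` boxes and flattening the keypoints. The sketch below shows that step for one instance. It is an illustration under assumptions (a single category with `category_id=1`, unlabeled keypoints zeroed out) and not the code in `prepare_dataset.py`.

```python
def yolo_to_coco_annotation(bbox, keypoints, img_w, img_h, image_id, ann_id):
    """Convert one normalized YOLO-pose instance to a COCO keypoint annotation."""
    xc, yc, w, h = bbox
    box_w, box_h = w * img_w, h * img_h
    x_min = xc * img_w - box_w / 2
    y_min = yc * img_h - box_h / 2

    flat = []
    for x, y, v in keypoints:
        if v > 0:
            flat.extend([x * img_w, y * img_h, v])
        else:
            # Assumed convention: unlabeled keypoints are stored as 0, 0, 0.
            flat.extend([0.0, 0.0, 0])

    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,  # assumption: a single "person" category
        "bbox": [x_min, y_min, box_w, box_h],
        "area": box_w * box_h,
        "iscrowd": 0,
        "keypoints": flat,
        "num_keypoints": sum(1 for _, _, v in keypoints if v > 0),
    }
```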
MMPose reads keypoint metadata from configs/_base_/datasets/custom18.py. This file defines:
- `keypoint_info`: name, id, color, left/right swap partner
- `skeleton_info`: bone connections for visualization
- `joint_weights` and `sigmas`: used in loss and COCO evaluation

When you change the number or order of keypoints, update both:
- `DataPreprocessing/custom18_keypoints.py`
- `configs/_base_/datasets/custom18.py`

And set `num_keypoints` / `out_channels` in the training config.
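
To keep the two files in sync, the metainfo dictionary can be generated from the schema rather than written by hand. The sketch below builds a `dataset_info`-style dict with the fields listed above. The function name `build_dataset_info`, the colors, and the uniform `sigmas` value are placeholders chosen for illustration; real sigmas should reflect your annotation noise.

```python
def build_dataset_info(names, links, dataset_name="custom18"):
    """Build an MMPose-style metainfo dict from keypoint names and index links."""

    def swap_name(name):
        # Map 'left_x' <-> 'right_x'; keypoints without a partner get ''.
        for prefix, other in (("left_", "right_"), ("right_", "left_")):
            if name.startswith(prefix):
                partner = other + name[len(prefix):]
                return partner if partner in names else ""
        return ""

    keypoint_info = {
        i: dict(name=name, id=i, color=[51, 153, 255], type="", swap=swap_name(name))
        for i, name in enumerate(names)
    }
    skeleton_info = {
        i: dict(link=(names[a], names[b]), id=i, color=[0, 255, 0])
        for i, (a, b) in enumerate(links)
    }
    return dict(
        dataset_name=dataset_name,
        keypoint_info=keypoint_info,
        skeleton_info=skeleton_info,
        joint_weights=[1.0] * len(names),
        sigmas=[0.05] * len(names),  # placeholder value, tune per keypoint
    )


# Small usage example with a three-keypoint schema.
dataset_info = build_dataset_info(["left_hand", "right_hand", "nose"], [(0, 2), (1, 2)])
```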
Two configs are provided:
| Config | Input size (W×H) | Use case |
|---|---|---|
| `rtmpose-m_8xb64-200e_custom18-256x192.py` | 192 × 256 | Default, faster, less VRAM |
| `rtmpose-m_8xb64-200e_custom18-448x448.py` | 448 × 448 | Higher resolution, more VRAM |
Key settings in the config:
| Setting | Default | Description |
|---|---|---|
| `max_epochs` | 200 | Total training epochs |
| `stage2_num_epochs` | 20 | Last N epochs use a lighter augmentation pipeline |
| `val_interval` | 10 | Validate every N epochs |
| `data_root` | `data/custom_pose/` | Prepared dataset path |
| `num_keypoints` | 18 | Must match your schema |
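
With the defaults above, the arithmetic behind the schedule works out as in the sketch below. The variable names `switch_epoch` and `validation_epochs` are illustrative, and validating exactly at multiples of `val_interval` is an assumption about the training loop.

```python
max_epochs = 200
stage2_num_epochs = 20
val_interval = 10

# Epoch at which the lighter stage-2 augmentation pipeline takes over.
switch_epoch = max_epochs - stage2_num_epochs

# Epochs at which validation runs (assumed: every val_interval epochs).
validation_epochs = list(range(val_interval, max_epochs + 1, val_interval))
```

If you shorten training, keep `stage2_num_epochs` smaller than `max_epochs` so that the first stage still runs.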
The backbone loads a COCO-pretrained CSPNeXt-M checkpoint. Only the pose head is trained for your 18 keypoints.
python -c "import os,urllib.request; u='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/cspnext-m_udp-aic-coco_210e-256x192-f2f7d6f6_20230130.pth'; os.makedirs('checkpoints',exist_ok=True); urllib.request.urlretrieve(u, 'checkpoints/'+os.path.basename(u))"

Use the environment Python directly (more reliable than mim on Windows):
cd path\to\mmpose
python tools/train.py configs/body_2d_keypoint/rtmpose/custom/rtmpose-m_8xb64-200e_custom18-256x192.py ^
--work-dir work_dirs/custom18_rtmpose-m ^
--cfg-options ^
model.backbone.init_cfg.checkpoint="checkpoints/cspnext-m_udp-aic-coco_210e-256x192-f2f7d6f6_20230130.pth" ^
train_dataloader.batch_size=8 ^
val_dataloader.batch_size=8 ^
train_dataloader.num_workers=0 ^
val_dataloader.num_workers=0 ^
train_dataloader.persistent_workers=False ^
val_dataloader.persistent_workers=False

Batch size guide (Quadro P1000, 4 GB VRAM):
| Input size | Suggested batch size |
|---|---|
| 256×192 | 8 (try 4 if OOM) |
| 448×448 | 2 (try 4 if stable) |
Checkpoints are saved to work_dirs/custom18_rtmpose-m/. The best model (by validation AP) is kept automatically.
Logs appear in the terminal and in work_dirs/<run>/<timestamp>.log. Example:
Epoch(train) [1][50/2371] loss: 0.090868 acc_pose: 0.037325
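
To track progress outside the terminal, the training lines can be parsed from the log file. This is a minimal sketch that assumes the line layout shown in the example above; `parse_train_line` is a hypothetical helper, and real log lines may contain extra fields between the step counter and `loss`.

```python
import re

# Matches lines such as:
# Epoch(train) [1][50/2371] loss: 0.090868 acc_pose: 0.037325
LOG_PATTERN = re.compile(
    r"Epoch\(train\)\s+\[(?P<epoch>\d+)\]\[(?P<step>\d+)/(?P<total>\d+)\]"
    r".*?loss:\s*(?P<loss>[\d.eE+-]+)"
    r".*?acc_pose:\s*(?P<acc>[\d.eE+-]+)"
)


def parse_train_line(line):
    """Extract epoch, step, loss, and acc_pose from one training log line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "step": int(match["step"]),
        "total": int(match["total"]),
        "loss": float(match["loss"]),
        "acc_pose": float(match["acc"]),
    }
```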
To stop training, press Ctrl+C in the terminal, or:
tasklist | findstr /i python
taskkill /PID <pid> /F

After training, use the saved checkpoint with your custom config:
python demo/topdown_demo_with_mmdet.py ^
projects/rtmpose/rtmdet/person/rtmdet_nano_320-8xb32_coco-person.py ^
https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_nano_8xb32-100e_coco-obj365-person-05d8511e.pth ^
configs/body_2d_keypoint/rtmpose/custom/rtmpose-m_8xb64-200e_custom18-256x192.py ^
work_dirs/custom18_rtmpose-m/best_coco_AP_epoch_*.pth ^
--input your_video.mp4 ^
--output-root vis_results ^
--device cuda:0

Replace `best_coco_AP_epoch_*.pth` with the actual checkpoint filename from `work_dirs/`.
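
Because the Windows command prompt does not expand `*` in arguments, it can help to look up the filename from Python. The sketch below assumes the `best_coco_AP_epoch_<N>.pth` naming shown above; `find_best_checkpoint` is an illustrative helper, not part of MMPose.

```python
import re
from pathlib import Path


def find_best_checkpoint(work_dir):
    """Return the best_coco_AP_epoch_*.pth file with the highest epoch, or None."""
    candidates = []
    for path in Path(work_dir).glob("best_coco_AP_epoch_*.pth"):
        match = re.search(r"epoch_(\d+)\.pth$", path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Pick the entry with the largest epoch number.
    return max(candidates, key=lambda item: item[0])[1]
```

For example, `print(find_best_checkpoint("work_dirs/custom18_rtmpose-m"))` prints the path to paste into the demo command, or `None` if no best checkpoint has been saved yet.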
To train on a new keypoint schema (not 18 ALF keypoints):
- Rename or copy `custom18_keypoints.py` and set your keypoint names and skeleton.
- Create a new metainfo file under `configs/_base_/datasets/`.
- Copy a training config and update:
  - `num_keypoints`
  - `model.head.out_channels`
  - `metainfo=dict(from_file='configs/_base_/datasets/your_dataset.py')`
  - `data_root` and annotation paths
- Write or adapt a conversion script if your labels are not YOLO pose format.
- Train with the new config.
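
Before training on a new schema, a short consistency check can catch mismatches between the keypoint count, the names, and the skeleton. The following sketch uses a hypothetical `validate_schema` helper; it only checks the schema itself, not the metainfo file or the training config.

```python
def validate_schema(num_keypoints, names, links):
    """Return a list of human-readable problems; an empty list means consistent."""
    errors = []
    if len(names) != num_keypoints:
        errors.append(f"expected {num_keypoints} names, found {len(names)}")
    if len(set(names)) != len(names):
        errors.append("keypoint names are not unique")
    for a, b in links:
        # Every skeleton link must reference valid keypoint indices.
        if not (0 <= a < len(names) and 0 <= b < len(names)):
            errors.append(f"skeleton link ({a}, {b}) is out of range")
    return errors
```

Running this on the values from your schema module and fixing any reported problem first avoids shape errors that would otherwise only appear once training starts.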
| Error | Fix |
|---|---|
| `albumentations is not installed` | `pip install albumentations==1.4.24` |
| `numpy.dtype size changed` | `pip install numpy==1.26.4 --force-reinstall --no-deps`, then reinstall `xtcocotools` |
| `persistent_workers needs num_workers > 0` | Add `train_dataloader.persistent_workers=False` (and val/test) |
| `can't open file tools/train.py` | `pip install -e .` from repo root |
| CUDA out of memory | Lower train_dataloader.batch_size (e.g. 4 or 2) |
| Training very slow on Windows | Use num_workers=0; first epoch is always slowest |
- Keypoints Studio — keypoint annotation tool (YOLO pose export)
- RTMPose project docs
- MMPose custom dataset tutorial
- RTMPose paper / model zoo