🔥 A very simple but efficient framework for ReID tasks/models. 🔥
We propose:
- A Training-Free Feature Centralization framework (Pose2ID) that can be directly applied to different ReID tasks and models, even an ImageNet pre-trained model without any ReID training;
- An Identity-Guided Pedestrian Generation (IPG) paradigm, which leverages identity features to generate high-quality images of the same identity in different poses to achieve feature centralization;
- Neighbor Feature Centralization (NFC), which exploits each sample's neighborhood to discover hidden positive samples in the gallery/query set and achieve feature centralization.
- [2025.03.06] 🔥 Pretrained weights are available on HuggingFace!
- [2025.03.04] 🔥 Paper is available on arXiv!
- [2025.03.03] 🔥 Official code has been released!
- [2025.02.27] 🔥🔥🔥 Pose2ID is accepted to CVPR 2025!
Our project consists of two parts: Identity-Guided Pedestrian Generation (IPG) and Neighbor Feature Centralization (NFC).
IPG uses generated pedestrian images to centralize features. It can be implemented with a few lines of code:
'''
normal reid feature extraction to get feats
'''
feats_ipg = torch.zeros_like(feats)
# fuse features of the generated positive samples, one batch per target pose
for i in range(num_poses):
    feats_ipg += reid_model(imgs_pose[i])  # any ReID model; imgs_pose[i]: generated images with the i-th pose
eta = 1  # controls the impact of the generated images (considering their quality)
# centralize features and normalize back to the original distribution
feats = torch.nn.functional.normalize(feats + eta * feats_ipg, dim=1, p=2)  # L2 normalization
'''
compute distance matrix or post-processing like re-ranking
'''
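In the snippet above, imgs_pose is not defined. Below is a minimal sketch of one way to prepare it, assuming the generated images are stored in per-pose subdirectories (output/pose_{i}/) and that image_names is a hypothetical list of file names aligned with the rows of feats; the layout and preprocessing are assumptions, not the official pipeline.
# sketch only: batch the generated images so that imgs_pose[i] contains, for every
# original sample, the generated image with the i-th target pose
# (directory layout and preprocessing are assumptions, not the official script)
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((256, 128)),  # common ReID input size (assumption)
    transforms.ToTensor(),
])
imgs_pose = [
    torch.stack([preprocess(Image.open(f"output/pose_{i}/{name}")) for name in image_names])
    for i in range(num_poses)
]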
NFC explores each sample's potential positive samples in its neighborhood. It can also be implemented in a few lines:
from NFC import NFC
feats = NFC(feats, k1=2, k2=2)
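For reference, here is a rough sketch of the idea behind NFC (our reading, not the official implementation): each feature is reinforced by the features of its mutual nearest neighbors, with k1 and k2 bounding the two neighbor searches; the similarity measure and the mutual-neighbor rule are assumptions noted in the comments.
# sketch of the neighbor-centralization idea; the real NFC module may differ
import torch

def nfc_sketch(feats, k1=2, k2=2):
    # feats: (N, D) L2-normalized features
    sim = feats @ feats.t()            # cosine similarity between all samples
    sim.fill_diagonal_(-1)             # exclude self-matches
    nn1 = sim.topk(k1, dim=1).indices  # k1 nearest neighbors of each sample
    nn2 = sim.topk(k2, dim=1).indices  # k2 nearest neighbors, used for the mutual check
    fused = feats.clone()
    for i in range(feats.size(0)):
        for j in nn1[i]:
            if i in nn2[j]:            # mutual neighbors are treated as likely positives
                fused[i] += feats[j]
    # renormalize to keep the original feature distribution
    return torch.nn.functional.normalize(fused, dim=1, p=2)
Usage would mirror the one-liner above, e.g. feats = nfc_sketch(feats, k1=2, k2=2).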
We propose a quantitative metric (ID²) for Identity Density to replace visualization tools like t-SNE, which are stochastic and only focus on a few samples.
It can be used in one line:
from ID2 import ID2
density = ID2(feats, pids) # each ID's density
density.mean(0) # global density
where feats are the features extracted by the ReID model and pids are the corresponding person IDs.
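The exact ID² formula is defined in ID2.py; purely as a hypothetical illustration of what an identity-density style measure can look like (not the paper's ID² definition), one could score each ID by how tightly its features cluster around their centroid:
# hypothetical density measure for illustration only, not the paper's ID^2 definition
import torch

def density_sketch(feats, pids):
    # feats: (N, D) L2-normalized features, pids: (N,) person ID labels
    densities = []
    for pid in torch.unique(pids):
        f = feats[pids == pid]  # all features of one identity
        center = torch.nn.functional.normalize(f.mean(0, keepdim=True), dim=1)
        densities.append((f @ center.t()).mean())  # mean cosine similarity to the centroid
    return torch.stack(densities)  # one density value per ID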
All the experiments are conducted with the official codes and pretrained models. We appreciate their official repositories and great works:
TransReID loads a ViT model pre-trained on ImageNet for training on the ReID task. This experiment is conducted on that pre-trained model, which is NOT trained on the ReID task.
git clone https://github.com/yuanc3/Pose2ID
cd Pose2ID/IPG
Create conda environment (Recommended):
conda create -n IPG python=3.9
conda activate IPG
Install packages with pip
pip install -r requirements.txt
- Download official models from:
- Download our IPG pretrained weights from HuggingFace or Google Drive, and put them in the pretrained directory.
git lfs install
git clone https://huggingface.co/yuanc3/Pose2ID pretrained
The pretrained weights are organized as follows:
./pretrained/
├── denoising_unet.pth
├── reference_unet.pth
├── IFR.pth
├── pose_guider.pth
└── transformer_20.pth
Run the inference.py script. It will generate images with the poses in standard_poses for each reference image in ref. The output images will be saved in output.
python inference.py --ckpt_dir pretrained --pose_dir standard_poses --ref_dir ref --out_dir output
- --ckpt_dir: directory of pretrained weights
- --pose_dir: directory of target poses (we provide the 8 poses used in our experiments)
- --ref_dir: directory of reference images (we provide 10 reference images)
- --out_dir: directory of output images
Here, we provide our generated images for the gallery and query sets of the Market1501 test set, using our 8 representative poses.
We use DWpose to extract poses with 18 keypoints. Please follow their official instructions.
| Status | Milestone | ETA |
|---|---|---|
| 🚀 | Training codes | TBD |
| 🚀 | IPG model trained on more data | TBD |
| 🚀 | IPG model with modality transfer ability (RGB2IR) | TBD |
| 🚀 | Video-IPG model | TBD |