Learning Crowd Scale and Distribution for Weakly Supervised Crowd Counting and Localization (TCSVT)

Introduction

This is the official PyTorch implementation of the paper Learning Crowd Scale and Distribution for Weakly Supervised Crowd Counting and Localization (extended from our ICASSP paper Weakly-Supervised Scene-Specific Crowd Counting Using Real-Synthetic Hybrid Data). The paper proposes a weakly supervised crowd counting and localization method based on scene-specific synthetic data for surveillance scenarios, which can accurately predict the number and locations of people without any manually labeled point-wise or count-wise annotations.

[Figure: method pipeline]

Getting started

Preparation

  • Clone this repo into a local directory.

  • Install dependencies. We use Python 3.7 and PyTorch 1.10.0 (http://pytorch.org):

    conda create -n LCSD python=3.7
    conda activate LCSD
    conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
    cd ${LCSD}
    pip install -r requirements.txt
    pip install git+https://github.com/nikitadurasov/masksembles
    
  • Download the resource files used by the code from this link, including datasets, pre-trained models, the pedestrian gallery, and negative samples. Unzip resources.zip to resources. The resources folder is organized as follows (a quick sanity check follows the tree):

    $resources/
    ├── CityUHK-X  # dataset name
    │   ├── scene_001  # scene name
    │   │   ├── CityUHK-X_scene_001_20_40  # specific dataset for this scene
    │   │   │   ├── train_data
    │   │   │   │   ├── images
    │   │   │   │   │   └── xx.jpg
    │   │   │   │   ├── ground_truth_txt
    │   │   │   │   │   └── xx.txt
    │   │   │   ├── test_data
    │   │   │   ├── train_data.txt
    │   │   │   └── test_data.txt
    │   │   └── scene.jpg  #  scene image without people
    │   ├── scene_002
    │   ├── ...
    │   └── scene_k
    ├── Mall
    │   ├── scene_001  # only one scene for Mall
    │   │   ├── mall_800_1200
    │   │   │   ├── train_data
    │   │   │   │   ├── images
    │   │   │   │   │   └── xx.jpg
    │   │   │   │   ├── ground_truth_txt
    │   │   │   │   │   └── xx.txt
    │   │   │   ├── test_data
    │   │   │   ├── train_data.txt
    │   │   │   └── test_data.txt
    │   │   └── scene.jpg
    ├── UCSD
    │   ├── scene_001
    │   │   ├── ucsd_800_1200
    │   │   │   ├── train_data
    │   │   │   │   ├── images
    │   │   │   │   │   └── xx.jpg
    │   │   │   │   ├── ground_truth_txt
    │   │   │   │   │   └── xx.txt
    │   │   │   ├── test_data
    │   │   │   ├── train_data.txt
    │   │   │   └── test_data.txt
    │   │   └── scene.jpg
    ├── pedestrians  #  pedestrian gallery
    │   ├── GCC #  default
    │   │   └── xx.png
    │   ├── SHHB
    │   └── LSTN
    ├── indoor_negetive_samples 
    │   └── xx.jpg
    ├── outdoor_negetive_samples
    │   └── xx.jpg
    ├── darknet53.conv.74  #  pre-trained model for detection
    └── net_G_last.pth.txt  #  pre-trained model for image harmonization
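
After unzipping, you can optionally sanity-check the installation and the folder layout. This snippet is not part of the repo; the paths are taken from the tree above, and you should replace resources with wherever you unzipped the archive:

    # check that PyTorch is installed and CUDA is visible
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
    # spot-check a few expected paths from the layout above
    ls resources/Mall/scene_001/mall_800_1200/train_data/images | head -n 3
    ls resources/pedestrians/GCC | head -n 3
    ls resources/darknet53.conv.74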
    

Training

Check some parameters in train.py before training (an example invocation follows this list):

  • Use dataset = Mall to set the dataset.
  • Use scene = scene_001 to set the scene of the dataset. Mall and UCSD each have only one scene, so set scene to scene_001 for them.
  • Use resource-path = {$resources} to set the path of the resources folder downloaded above.
  • Use scene-dataset = mall_800_1200 to set the specific dataset of the scene.
  • Use device = 0 to set the GPU id for training.
  • Run python train.py.
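
If these options are exposed as command-line flags (the hyphenated names like resource-path suggest an argparse interface; check train.py for the exact flag names and defaults), a typical invocation might look like the following, with the resources path as a placeholder:

    python train.py --dataset Mall --scene scene_001 \
        --resource-path /path/to/resources \
        --scene-dataset mall_800_1200 --device 0

Otherwise, edit the corresponding defaults in train.py directly.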

Test

Check some parameters in test.py before testing (an example invocation follows this list):

  • Use dataset = Mall to set the dataset.
  • Use scene = scene_001 to set the scene of the dataset.
  • Use resource-path = {$resources} to set the path of the resources folder.
  • Use scene-dataset = mall_800_1200 to set the specific dataset of the scene.
  • Use model_path = xxx.pth to set the path of the counter model trained in the training stage.
  • Use test-name = xxx to set the test name, which will be used to name the folder for saving the test results.
  • Use device = 0 to set the GPU used for testing.
  • Run python test.py.
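
As with training, and assuming the options are argparse flags (check test.py for the exact names; the checkpoint path and test name below are placeholders), a test run might look like:

    python test.py --dataset Mall --scene scene_001 \
        --resource-path /path/to/resources \
        --scene-dataset mall_800_1200 \
        --model_path /path/to/counter.pth \
        --test-name mall_eval --device 0

The results are then saved in a folder named after test-name.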

Citation

If you find this project useful for your research, please cite:

@ARTICLE{LCSD,
  author={Fan, Yaowu and Wan, Jia and Ma, Andy J.},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  title={Learning Crowd Scale and Distribution for Weakly Supervised Crowd Counting and Localization},
  year={2025},
  volume={35},
  number={1},
  pages={713-727}
}

@INPROCEEDINGS{ICASSP_2023_FAN,
  author={Fan, Yaowu and Wan, Jia and Yuan, Yuan and Wang, Qi},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={Weakly-Supervised Scene-Specific Crowd Counting Using Real-Synthetic Hybrid Data},
  year={2023},
  pages={1-5}
}

Acknowledgement

The released PyTorch training script borrows some code from masksembles, yolov3, and RainNet. If you find this repo helpful for your research, please consider citing them as well.
