Vapro is a lightweight tool for detecting and diagnosing performance variance that does not require the source code of applications. It is practical for production-run parallel applications.
- MPI (the same implementation the applications use)
- jsoncpp
- papi
- libunwind
- modify CPU_FREQ in papi_wrap.cpp to the real value for your machine
- compile as
mkdir build && cd build
cmake ..
make
Using Vapro takes two steps: profiling and analysis.
We can enable Vapro by setting LD_PRELOAD to preload the Vapro library before applications start.
export LD_PRELOAD=<path_to_libpapicnt.so>
Then, we can run the applications directly. Alternatively, we can make LD_PRELOAD effective only for a single application by setting it on the command line:
LD_PRELOAD=<path_to_libpapicnt.so> mpirun ./application_command
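The per-command form above can be wrapped in a small shell function that also checks that the library file is readable before launching. This is only a sketch: run_with_vapro is a hypothetical helper name, not something Vapro provides, and the library path is passed in as the first argument.

```shell
# run_with_vapro LIB COMMAND [ARGS...]
# Hypothetical helper (not part of Vapro): preload LIB for one command only.
run_with_vapro() {
    lib=$1
    shift
    # Fail early instead of silently running without profiling.
    if [ ! -r "$lib" ]; then
        echo "Vapro library not found: $lib" >&2
        return 1
    fi
    # The assignment prefix sets LD_PRELOAD only in the environment of the
    # launched command; the calling shell's environment is left unchanged.
    LD_PRELOAD="$lib" "$@"
}
```

It would be used as `run_with_vapro <path_to_libpapicnt.so> mpirun ./application_command`. Whether the launcher forwards LD_PRELOAD to ranks started on other nodes depends on the MPI implementation, so check its documentation for multi-node runs.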
Vapro saves results in the current working directory. There are four classes of files for different information.
- log0_*: all calculation events
- log1_*: all communication events
- log2_*: relative performance data of calculation
- log3_*: relative performance data of communication
The asterisk in each filename stands for the corresponding MPI rank.
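As a quick way to see what a run produced, the naming scheme above can be decoded with a few lines of shell. This is a minimal sketch, assuming the rank is everything after the underscore in the filename (no extension); describe_vapro_log is a hypothetical helper, not part of Vapro.

```shell
# describe_vapro_log FILE
# Hypothetical helper (not part of Vapro): report the rank and the kind of
# data for a file named log<class>_<rank>.
describe_vapro_log() {
    name=$(basename "$1")
    case "$name" in
        log0_*) kind="calculation events" ;;
        log1_*) kind="communication events" ;;
        log2_*) kind="relative performance data of calculation" ;;
        log3_*) kind="relative performance data of communication" ;;
        *) echo "not a Vapro log: $name" >&2; return 1 ;;
    esac
    # Assumption: the text after "log<class>_" is the MPI rank.
    rank=${name#log?_}
    echo "rank $rank: $kind"
}

# Summarize every Vapro log in the current working directory.
for f in log[0-3]_*; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    describe_vapro_log "$f"
done
```

Run it from the directory where the application was launched, since that is where Vapro saves its results.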
- Supported CPU backends
- ✔ PAPI
- ✔ Linux perf
- ⬜ pmu-tools for Intel CPU
- Supported GPU backends
- ⬜ CUDA
- ⬜ Integrated visualization
The development of Vapro is based on pull requests on GitHub. Before requesting a merge, a PR should satisfy the following requirements:
- Receive at least one approval from reviewers.
- Have a concise title, since it becomes the commit message on the main branch after squashing and merging.
Please cite our papers in your publications if they help your research:
@article{zhai2022detecting,
title={Detecting Performance Variance for Parallel Applications Without Source Code},
author={Zhai, Jidong and Zheng, Liyan and Zhang, Feng and Tang, Xiongchao and Wang, Haojie and Yu, Teng and Jin, Yuyang and Song, Shuaiwen Leon and Chen, Wenguang},
journal={IEEE Transactions on Parallel and Distributed Systems},
volume={33},
number={12},
pages={4239--4255},
year={2022},
publisher={IEEE}
}
@inproceedings{zheng2022vapro,
title={Vapro: Performance variance detection and diagnosis for production-run parallel applications},
author={Zheng, Liyan and Zhai, Jidong and Tang, Xiongchao and Wang, Haojie and Yu, Teng and Jin, Yuyang and Song, Shuaiwen Leon and Chen, Wenguang},
booktitle={Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
pages={150--162},
year={2022}
}