This repository contains the code, datasets, and experimental pipelines used in our study on evaluating lightweight open-source code LLMs for automated Python unit test generation. The project systematically benchmarks multiple prompting strategies and analyzes the resulting test quality across a wide range of metrics.
- src/: Source code organized into the following submodules:
  - analysis/: Code for analyzing test smells.
  - dataset/: Dataset preparation and processing.
  - evaluation/: Evaluation scripts for model performance.
  - example/: Example scripts demonstrating usage.
  - inference/: Inference-related utilities.
  - models/: Machine learning models for test smell detection.
- report/: Final report and slides.
- figure/: Visualizations and figures generated during analysis.
- results/: Stores results from experiments and analyses.
The documentation for this project is available in the report/ folder. It includes a detailed explanation of the methodology, experimental setup, and results. Refer to the report.pdf file for the complete project report.
The project requires Python. All necessary dependencies are listed in requirements.txt and can be installed with `pip install -r requirements.txt`.
- Download the dataset:

  ```
  python src/dataset/download.py
  ```

  This will download the required dataset into the appropriate folder.
- Process the data:

  ```
  python src/dataset/create_usable_function.py
  ```

  This script processes the raw dataset and prepares it for use in experiments.
- Generate unit tests using a code LLM:

  ```
  python src/inference/inference.py --model_name <MODEL_NAME> --input_type <INPUT_TYPE> --prompt_type <PROMPT_TYPE> [other options]
  ```

  Replace `<MODEL_NAME>`, `<INPUT_TYPE>`, and `<PROMPT_TYPE>` with your desired settings. For example:

  ```
  python src/inference/inference.py --model_name Qwen/Qwen2.5-Coder-7B --input_type code --prompt_type minimal
  ```

  See the script for all available arguments and options.
- Evaluate the generated unit tests: use the following scripts to analyze and evaluate your results, replacing $FILE_PATH with the path to your generated test file or directory.
  - Extract functions from generated tests:

    ```
    python src/evaluation/extract_functions.py --file_path "$FILE_PATH"
    ```

  - Compute statistics on the extracted functions:

    ```
    python src/evaluation/statistic.py --file_path "$FILE_PATH"
    ```

  - Analyze test smells in the generated tests:

    ```
    python src/evaluation/test_smell_analysis.py --file_path "$FILE_PATH"
    ```
You can use the following example to evaluate a specific result folder:

```
#!/bin/bash
FILE_PATH="results/test/qwen_3b/code/instruction_code/temp_0_2_tokens_512/assist_False/ver_1"
python src/evaluation/extract_functions.py --file_path "$FILE_PATH"
python src/evaluation/statistic.py --file_path "$FILE_PATH"
python src/evaluation/test_smell_analysis.py --file_path "$FILE_PATH"
```

You can cite my project as follows:
```bibtex
@misc{llm_pytestgen_2025,
  title={Lightweight Open-Source Models for Python Unit Test Generation},
  author={Huu Binh Ta},
  year={2025},
  howpublished={CS6501 Final Project, University of Virginia},
  url={https://github.com/Tahuubinh/llm_for_python_unittest_generator}
}
```
This project is licensed under the MIT License. See the LICENSE file for details.
