A machine learning-enhanced automatic theorem prover for the Coq formal proof system. This system combines large language models with retrieval-augmented generation to automatically generate formal proofs.
- Python 3.11+ (recommended)
Basic Usage (proof generation with API models):
pip install -r requirements.txtAdvanced Usage (fine-tuning and local model deployment):
pip install -r requirements_ft.txtRecommended Usage: Set up environment first, then run commands:
# Initialize environment (run once per shell session)
source init_env.sh
# Then run commands normally
python [command]Alternative: Run commands with explicit PYTHONPATH:
PYTHONPATH=. python [command]Follow these steps in order to set up the complete system:
The system requires a modified Coq compiler that outputs structural information during compilation.
-
Install Prerequisites:
# OCaml and OPAM (OCaml Package Manager) sudo apt-get install opam build-essential # For Docker environments: opam init --disable-sandboxing -y # For regular systems: # opam init -y eval $(opam env) opam switch create coq-8.11 4.10.0 eval $(opam env) # Install dependencies opam install dune num ocamlfind camlp5
-
Build Modified Coq:
# Automated installation (recommended) ./install_coq.sh $HOME/coq-structured
This script will:
- Clone Coq 8.11 source code
- Apply our modifications
- Build and install the modified Coq
- Automatically update
config.jsonwith correct binary paths
Manual Installation: If you prefer manual control, see
src/README.mdfor detailed step-by-step instructions.See
src/README.mdfor detailed installation instructions.
Configure LLM API credentials (required for both data extraction and proof generation):
# Reasoning model - for main proof generation and critical reasoning
export API_KEY_REASONING="your-reasoning-api-key"
export BASE_URL_REASONING="your-reasoning-base-url"
export MODEL_REASONING="your-reasoning-model-name"
# Explanation model - for state explanations and auxiliary tasks
export API_KEY_EXPLANATION="your-explanation-api-key"
export BASE_URL_EXPLANATION="your-explanation-base-url"
export MODEL_EXPLANATION="your-explanation-model-name"These environment variables are used throughout the system for:
- Data augmentation during extraction (
data_extraction/) - Proof generation and tactic suggestions (
coq_prover/) - State explanations and proof summaries
Process Coq source code to extract definitions, proof states, and training data.
-
Configure Data Sources:
- Update data/package_mapping.json with Coq packages to process (defaults provided for common libraries)
- Update data/sorted_order_all.json with processing order (defaults provided for common libraries)
Note: Default configurations are provided for most common Coq libraries. If you need to process libraries not included in the defaults, you'll need to add them manually. Automatic dependency resolution via
coqdepwill be added in future versions.Set Data Directory: In
config.json, setdata_dirto a directory containing your Coq library folders:{ "paths": { "data_dir": "/path/to/coq-libraries", "output_data": "./data/" } }data_dir: Directory containing Coq library folders (e.g.,stdlib/,mathcomp/,iris/) to processoutput_data: Directory for generated data files (defaults to./data/if not specified)emb_model_path: Path to the embedding model (embed data for semantic retrieval)
-
Run Data Extraction:
# Extract structured data from Coq packages python data_extraction/main.py --mode data_generation # Process new theorems (if needed) python data_extraction/main.py --mode new_theorem
This generates:
def_table.jsonl: Extracted definitions and their contextsps_table.jsonl: Proof states and transitionsdef_table_emb.jsonl: Semantic embeddings for retrieval- Package dependency information
See
data_extraction/README.mdfor detailed pipeline information.
Configure System Paths:
# Validate and explain configuration options
python config_helper.py validate
python config_helper.py explainUpdate config.json with:
- Paths: Coq binaries, data files, model locations
- Flags: Feature toggles and processing modes
- Params: Search parameters and worker limits
The config helper will guide you through required vs optional settings.
With the modified Coq, extracted data, and configuration complete, you can now generate proofs.
-
Basic Proof Generation:
# Generate proofs for a specific theorem python coq_prover/main.py --mode theorem --file path/to/file.v --theorem theorem_name # Process an entire package python coq_prover/main.py --mode package --package package_name # Continue from existing tactics python coq_prover/main.py --mode generate --file path/to/file.v --theorem theorem_name --tactics "tactic1" "tactic2"
-
Distributed Evaluation:
# Run on test dataset with multiple workers python coq_prover/main.py --mode all --shard 0See
coq_prover/README.mdfor advanced usage and configuration options.
The system consists of five main components:
-
Modified Coq Compiler (
src/): Custom OCaml modifications to Coq 8.11 that extract structural information during compilation and proof checking. -
Python Coq Wrapper (
coqc.py): A Python-based wrapper around the modified Coq compiler that handles:- Dependency Resolution: Automatically resolves and imports required Coq libraries
- Error Handling: Intelligent error filtering and timeout management
- Batch Processing: Efficient parallel compilation of multiple files
- State Management: Handles proof state extraction and validation
- Integration: Seamless integration with the ML pipeline
This wrapper makes the modified Coq compiler much easier to use from Python code and handles the complex dependency management that would otherwise require manual intervention.
-
Data Extraction Pipeline (
data_extraction/): Processes Coq source code to create structured datasets for machine learning, including definitions, proof states, and semantic embeddings. -
Proof Generation Engine (
coq_prover/): ML-enhanced proof generator using LLMs with retrieval-augmented generation, beam search, and hierarchical error correction. -
Configuration Management: Centralized configuration system with validation and explanation tools.
- Retrieval-Augmented Generation: Semantic search for relevant definitions and proof patterns
- Hierarchical Error Correction: Multi-level tactic refinement with different error handling strategies
- Beam Search: Parallel exploration of multiple proof branches
- Public Notes System: Cross-step knowledge accumulation for better context-aware reasoning
- Distributed Processing: Scalable evaluation across multiple workers
- Fine-tuning Support: Tools for training custom models on extracted proof data
The system supports multiple LLM backends:
- API-based: Huoshan, Aliyun, DeepSeek commercial APIs
- Local: VLLM-based local model serving
- Fine-tuned: Custom models trained on extracted Coq data
Use the configuration helper to manage your setup:
# Validate current configuration
python config_helper.py validate
# View all configuration options with explanations
python config_helper.py explainstructural_coq_prover/
├── src/ # Modified Coq compiler source files
├── coqc.py # Python wrapper for Coq compiler (dependency resolution)
├── data_extraction/ # Data processing pipeline
├── coq_prover/ # ML proof generation system
├── data/ # Configuration and test data
├── config.json # Main configuration file
├── config_helper.py # Configuration management tool
For detailed instructions on each component, see the README files in the respective subdirectories.
@Yanzhen Lu, @Hanbin Yang, @Xiaodie Wang,@Ge Zhang, @Biao Li, @Chenxu Fu, @Chao Li
For detailed implementation and methodology, see our paper: https://arxiv.org/pdf/2507.02541