No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair
This project utilizes CodeBERT for embedding retrieval and CodeLlama-7B for code generation. Please ensure you have access to both models.
You can download the models from the following links:
.
├── Cream/
│ ├── generate.py # Script for code generation
│ ├── repair.py # Script for code repair
│ ├── retrieval_EM.py # Embedding-based (EM) code retrieval
│ ├── retrieve_BM25.py # Code retrieval using the BM25 algorithm
│ └── validate.py # Script for validating generated code
├── Evaluation_1/
│ ├── generation+repair.csv # Generated code (generation + repair)
│ ├── generation+repair_valid.csv # Validation results for generation + repair
│ ├── only_generation.csv # Generated code (only generation)
│ ├── only_generation_valid.csv # Validation results for only generation
│ ├── retrieval(EM)+generation+repair.csv # EM-based retrieval, generation, and repair results
│ ├── retrieval(EM)+generation+repair_valid.csv # Validation for EM-based retrieval, generation, and repair
│ ├── retrieval(EM)+generation.csv # EM-based retrieval and code generation results
│ ├── retrieval(EM)+generation_valid.csv # Validation for EM-based retrieval and code generation
│ ├── retrieval(IR)+generation+repair.csv # IR-based retrieval, generation, and repair results
│ ├── retrieval(IR)+generation+repair_valid.csv # Validation for IR-based retrieval, generation, and repair
│ ├── retrieval(IR)+generation.csv # IR-based retrieval and code generation results
│ └── retrieval(IR)+generation_valid.csv # Validation for IR-based retrieval and code generation
├── Evaluation_2/
│ ├── Case_1.md # Evaluation case 1 documentation
│ ├── Case_2.md # Evaluation case 2 documentation
│ └── Case_3.md # Evaluation case 3 documentation
├── MBPP_data/
│ ├── mbpp.jsonl # MBPP dataset (JSONL format)
│ └── sanitized-mbpp.json # Sanitized MBPP dataset (JSON format)
└── README.md # Project README
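For orientation, a JSONL file such as `MBPP_data/mbpp.jsonl` holds one JSON object per line. The snippet below is a minimal sketch of reading such a file; the helper names `read_jsonl` and `load_mbpp` are hypothetical and are not part of the scripts in `Cream/`. The field names mentioned in the comments (`task_id`, `text`, `code`, `test_list`) follow the public MBPP release and are assumed, not confirmed, to match the copy shipped here.

```python
import json


def read_jsonl(lines):
    """Parse one JSON object per non-empty line from any iterable of strings."""
    return [json.loads(line) for line in lines if line.strip()]


def load_mbpp(path="MBPP_data/mbpp.jsonl"):
    """Load all records from a JSONL file (hypothetical helper).

    Each record is assumed to carry fields such as `task_id`, `text`,
    `code`, and `test_list`, as in the public MBPP release.
    """
    with open(path, encoding="utf-8") as handle:
        return read_jsonl(handle)
```

Calling `load_mbpp()` from the repository root would then return a list of dictionaries, one per task.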
1. Code Generation: To generate code with the CodeLlama-7B model, run:
   python Cream/generate.py
2. Code Repair: To repair existing code, run:
   python Cream/repair.py
3. Code Retrieval (EM): To retrieve relevant code using CodeBERT embeddings, run:
   python Cream/retrieval_EM.py
4. Code Retrieval (BM25): To retrieve relevant code using the BM25 algorithm, run:
   python Cream/retrieve_BM25.py
5. Validation: To validate the generated or repaired code, run:
   python Cream/validate.py
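To make the BM25 retrieval and validation steps above more concrete, here is a minimal, self-contained sketch. It is not the code in `retrieve_BM25.py` or `validate.py`: the function names, the toy corpus, the whitespace tokenizer, and the parameter defaults (`k1=1.5`, `b=0.75`) are all assumptions chosen for illustration. Validation is assumed to mean running the candidate code against MBPP-style `assert` statements.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tokenization may differ)."""
    return text.lower().split()


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Return one BM25 score per tokenized document for the tokenized query."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))
    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


def passes_tests(code, tests):
    """Run `code`, then each assert statement in `tests`; True if nothing raises.

    Caution: exec runs arbitrary code. A real validator should add
    sandboxing and a timeout, which this sketch omits.
    """
    namespace = {}
    try:
        exec(code, namespace)
        for test in tests:
            exec(test, namespace)
    except Exception:
        return False
    return True


# Hypothetical toy corpus of (description, code) pairs.
corpus = [
    ("return the sum of two numbers", "def add(a, b):\n    return a + b"),
    ("find the maximum value in a list", "def find_max(xs):\n    return max(xs)"),
    ("reverse a string", "def reverse(s):\n    return s[::-1]"),
]

query = "find the maximum element of a list"
scores = bm25_scores(tokenize(query), [tokenize(desc) for desc, _ in corpus])
best = max(range(len(corpus)), key=scores.__getitem__)
retrieved_code = corpus[best][1]
is_valid = passes_tests(retrieved_code, ["assert find_max([3, 7, 2]) == 7"])
```

In this toy run the second corpus entry ranks highest and its code passes the single test. In the full pipeline, the retrieved snippet would presumably be passed to the generation step as context rather than validated directly.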
We welcome contributions to improve the codebase! If you have suggestions or bug fixes, please feel free to submit a pull request.