iSEngLab/Cream

No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair

Environment Setup

This project utilizes CodeBERT for embedding retrieval and CodeLlama-7B for code generation. Please ensure you have access to both models.
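Embedding-based retrieval typically ranks candidate snippets by the similarity between a query vector and precomputed snippet vectors. The sketch below illustrates that idea only: the toy vectors stand in for CodeBERT embeddings, and the names `cosine`, `top_k`, and the corpus entries are hypothetical and not taken from `retrieval_EM.py`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=1):
    """Return the k snippet names whose vectors are most similar to the query."""
    ranked = sorted(corpus, key=lambda name: cosine(query_vec, corpus[name]), reverse=True)
    return ranked[:k]

# Hand-written toy vectors (assumption); real model embeddings have many more dimensions.
corpus = {
    "sort_list": [0.9, 0.1, 0.0],
    "read_file": [0.0, 0.2, 0.9],
    "sum_digits": [0.1, 0.9, 0.1],
}
query = [0.8, 0.2, 0.1]
best = top_k(query, corpus, k=2)
print(best)
```

In a real run the vectors would come from the embedding model rather than being written by hand; the ranking step stays the same.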

You can download the models from the following links:

Folder Structure

.
├── Cream/
│   ├── generate.py                # Script for code generation
│   ├── repair.py                  # Script for code repair
│   ├── retrieval_EM.py            # Code retrieval using embeddings (EM)
│   ├── retrieve_BM25.py           # Code retrieval using the BM25 algorithm
│   └── validate.py                # Script for validating generated code
├── Evaluation_1/
│   ├── generation+repair.csv      # Generated code (generation + repair)
│   ├── generation+repair_valid.csv # Validation results for generation + repair
│   ├── only_generation.csv        # Generated code (only generation)
│   ├── only_generation_valid.csv  # Validation results for only generation
│   ├── retrieval(EM)+generation+repair.csv # EM-based retrieval, generation, and repair results
│   ├── retrieval(EM)+generation+repair_valid.csv # Validation for EM-based retrieval, generation, and repair
│   ├── retrieval(EM)+generation.csv # EM-based retrieval and code generation results
│   ├── retrieval(EM)+generation_valid.csv # Validation for EM-based retrieval and code generation
│   ├── retrieval(IR)+generation+repair.csv  # IR-based retrieval, generation, and repair results
│   ├── retrieval(IR)+generation+repair_valid.csv # Validation for IR-based retrieval, generation, and repair
│   ├── retrieval(IR)+generation.csv # IR-based retrieval and code generation results
│   └── retrieval(IR)+generation_valid.csv # Validation for IR-based retrieval and code generation
├── Evaluation_2/
│   ├── Case_1.md                  # Evaluation case 1 documentation
│   ├── Case_2.md                  # Evaluation case 2 documentation
│   └── Case_3.md                  # Evaluation case 3 documentation
├── MBPP_data/
│   ├── mbpp.jsonl                 # MBPP dataset (JSONL format)
│   └── sanitized-mbpp.json        # Sanitized MBPP dataset (JSON format)
└── README.md                      # Project README
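The dataset under `MBPP_data/` is stored one JSON object per line in `mbpp.jsonl`. A minimal sketch of reading such a file is shown below. It runs on a temporary sample file so that it works without the real data, and the field names in the sample (`task_id`, `text`, `code`, `test_list`) follow the public MBPP release and are an assumption here, as is the helper name `load_jsonl`.

```python
import json
import os
import tempfile

def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

# One hypothetical record, written to a temporary file for the demo.
sample = [
    {
        "task_id": 1,
        "text": "Write a function to add two numbers.",
        "code": "def add(a, b):\n    return a + b",
        "test_list": ["assert add(1, 2) == 3"],
    },
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    for rec in sample:
        tmp.write(json.dumps(rec) + "\n")
    tmp_path = tmp.name

tasks = load_jsonl(tmp_path)
os.remove(tmp_path)  # clean up the temporary file
print(len(tasks), tasks[0]["task_id"])
```

With the repository checked out, the corresponding call would be `load_jsonl("MBPP_data/mbpp.jsonl")`.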

Running the Code

  1. Code Generation: To generate code using the CodeLlama-7B model, run:
python Cream/generate.py
  2. Code Repair: To repair existing code, run:
python Cream/repair.py
  3. Code Retrieval (EM): To retrieve relevant code using embeddings, run:
python Cream/retrieval_EM.py
  4. Code Retrieval (BM25): To retrieve relevant code using the BM25 algorithm, run:
python Cream/retrieve_BM25.py
  5. Validation: To validate the generated or repaired code, run:
python Cream/validate.py
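
For orientation, validating a candidate against MBPP-style tests usually means executing the candidate and then running each `assert` statement against it. The sketch below shows that idea under stated assumptions: `passes_tests` is a hypothetical helper, the sample programs are made up, and this is not necessarily how `validate.py` is implemented.

```python
def passes_tests(code, tests):
    """Return True if every assert-style test passes against `code`."""
    namespace = {}
    try:
        exec(code, namespace)       # define the candidate function(s)
        for test in tests:
            exec(test, namespace)   # each test is a plain `assert ...` statement
    except Exception:
        # Syntax errors, runtime errors, and failed assertions all count as failure.
        return False
    return True

# Hypothetical candidates and tests for illustration.
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = ["assert add(1, 2) == 3", "assert add(0, 0) == 0"]

results = {"good": passes_tests(good, tests), "bad": passes_tests(bad, tests)}
print(results)
```

Because `exec` runs arbitrary code in the current process, a practical harness would likely run each candidate in a separate process with a timeout.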

Contributing

We welcome contributions to improve the codebase! If you have suggestions or bug fixes, please feel free to submit a pull request.
