No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair

Environment Setup

This project utilizes CodeBERT for embedding retrieval and CodeLlama-7B for code generation. Please ensure you have access to both models.
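As a rough illustration of what embedding-based retrieval involves, the sketch below ranks candidate snippets by cosine similarity to a query vector. It assumes the embeddings have already been computed (for example by CodeBERT) and are plain lists of floats. The function names `cosine_similarity` and `rank_by_embedding` are hypothetical and are not taken from `retrieval_EM.py`.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_embedding(
    query_vec: Sequence[float],
    corpus_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (corpus index, similarity) pairs for the top_k closest vectors."""
    scored = [
        (idx, cosine_similarity(query_vec, vec))
        for idx, vec in enumerate(corpus_vecs)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In practice the vectors would come from the embedding model rather than being written by hand, and a vectorized library would likely replace the pure-Python loops.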

You can download the models from the following links:

Folder Structure

.
├── Cream/
│   ├── generate.py                # Script for code generation
│   ├── repair.py                  # Script for code repair
│   ├── retrieval_EM.py            # Code retrieval using embeddings (EM)
│   ├── retrieve_BM25.py           # Code retrieval using the BM25 algorithm
│   └── validate.py                # Script for validating generated code
├── Evaluation_1/
│   ├── generation+repair.csv      # Generated code (generation + repair)
│   ├── generation+repair_valid.csv # Validation results for generation + repair
│   ├── only_generation.csv        # Generated code (only generation)
│   ├── only_generation_valid.csv  # Validation results for only generation
│   ├── retrieval(EM)+generation+repair.csv # EM-based retrieval, generation, and repair results
│   ├── retrieval(EM)+generation+repair_valid.csv # Validation for EM-based retrieval, generation, and repair
│   ├── retrieval(EM)+generation.csv # EM-based retrieval and code generation results
│   ├── retrieval(EM)+generation_valid.csv # Validation for EM-based retrieval and code generation
│   ├── retrieval(IR)+generation+repair.csv  # IR-based retrieval, generation, and repair results
│   ├── retrieval(IR)+generation+repair_valid.csv # Validation for IR-based retrieval, generation, and repair
│   ├── retrieval(IR)+generation.csv # IR-based retrieval and code generation results
│   └── retrieval(IR)+generation_valid.csv # Validation for IR-based retrieval and code generation
├── Evaluation_2/
│   ├── Case_1.md                  # Evaluation case 1 documentation
│   ├── Case_2.md                  # Evaluation case 2 documentation
│   └── Case_3.md                  # Evaluation case 3 documentation
├── MBPP_data/
│   ├── mbpp.jsonl                 # MBPP dataset (JSONL format)
│   └── sanitized-mbpp.json        # Sanitized MBPP dataset (JSON format)
└── README.md                      # Project README
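The dataset file `MBPP_data/mbpp.jsonl` listed above stores one JSON object per line. A minimal sketch for reading it is shown below. The helper names `parse_jsonl` and `load_mbpp` are hypothetical, and the field names mentioned in the comments (`text`, `code`, `test_list`) are assumed from the public MBPP release rather than confirmed against this repository's copy.

```python
import json
from typing import Dict, Iterable, List


def parse_jsonl(lines: Iterable[str]) -> List[Dict]:
    """Parse an iterable of JSONL lines into a list of dicts, skipping blanks."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


def load_mbpp(path: str = "MBPP_data/mbpp.jsonl") -> List[Dict]:
    """Load all tasks from the JSONL file at `path`.

    Each record is assumed to carry fields such as `text` (task description),
    `code` (reference solution), and `test_list` (assert statements).
    """
    with open(path, encoding="utf-8") as handle:
        return parse_jsonl(handle)
```

For example, `tasks = load_mbpp()` followed by `tasks[0].get("text")` would show the first task description, if the file uses that field name.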

Running the Code

1. Code Generation: To generate code using the CodeLlama-7B model, run:

python Cream/generate.py

2. Code Repair: To repair existing code, run:

python Cream/repair.py

3. Code Retrieval (EM): To retrieve relevant code using embeddings, run:

python Cream/retrieval_EM.py

4. Code Retrieval (BM25): To retrieve relevant code using the BM25 algorithm, run:

python Cream/retrieve_BM25.py

5. Validation: To validate the generated or repaired code, run:

python Cream/validate.py
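To give a sense of what validation against MBPP-style tests can look like, here is a minimal sketch that executes a candidate solution and then runs a list of `assert` statements against it. It is an assumption about the general approach, not a description of `validate.py`. The function `run_tests` is hypothetical, and the sketch has no sandboxing or timeout, so it should only be run on trusted code.

```python
from typing import Dict, List


def run_tests(code: str, test_list: List[str]) -> Dict[str, object]:
    """Execute `code`, then each assert in `test_list`, in one shared namespace.

    Returns {"passed": bool, "error": str or None}. Stops at the first failure.
    """
    namespace: Dict[str, object] = {}
    try:
        # Define the candidate's functions in the shared namespace.
        exec(code, namespace)
    except Exception as exc:
        return {"passed": False, "error": f"{type(exc).__name__}: {exc}"}
    for test in test_list:
        try:
            exec(test, namespace)
        except Exception as exc:
            return {
                "passed": False,
                "error": f"{type(exc).__name__}: {exc} in {test!r}",
            }
    return {"passed": True, "error": None}
```

A candidate that raises on definition, fails an assertion, or throws at call time is reported as not passed, with the exception type recorded in `error`.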

Contributing

We welcome contributions to improve the codebase! If you have suggestions or bug fixes, please feel free to submit a pull request.
