This project generates code samples from provided function signatures, descriptions, and input code using a language model. It supports both Python and Java prompts and leverages the Hugging Face `transformers` library for generation.
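As a rough end-to-end illustration of that workflow, generation with `transformers` looks like the sketch below. The model name and prompt are placeholders, not the project's actual configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the project's actual model choice lives in main.py.
model_name = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A toy prompt: a function signature plus a docstring.
prompt = 'def add(a: int, b: int) -> int:\n    """Return the sum of a and b."""\n'
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```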
Repository layout:

```
.
├── .gitignore
├── main.py
├── module.py
├── prompt_gen/
│   ├── CoderEval-Input4Models/
│   │   ├── CEJavaHumanLabel.jsonl
│   │   ├── CEJavaRaw.jsonl
│   │   ├── CEPythonHumanLabel.jsonl
│   │   └── CEPythonRaw.jsonl
│   ├── load_python_prompts.py
│   └── prompts/
│       ├── java_generated_prompts.jsonl
│       └── python_generated_prompts.jsonl
├── README.md
├── requirements.txt
├── results/
└── test.py
```
- `main.py`: The main script to load prompts, set up the model, and generate code samples.
- `module.py`: Utility functions for loading prompts and extracting Python code.
- `prompt_gen/`: Input JSONL files and scripts for generating prompts.
  - `CoderEval-Input4Models/`: Raw and human-labeled JSONL files for Python and Java.
  - `load_python_prompts.py`: Script to generate prompts from the raw JSONL files.
  - `prompts/`: Pre-generated prompts for Python and Java.
- `requirements.txt`: Lists the dependencies required for the project.
- `results/`: Directory for the generated results.
- `test.py`: Script for testing purposes.
To set up and run the project:

- Clone the repository:

  ```bash
  git clone <repository-url>
  cd CoderEval-Prompt-Inference
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Ensure the necessary model files are cached or downloaded (see the pre-caching sketch after this list).

- Generate prompts from the raw JSONL files:

  ```bash
  python prompt_gen/load_python_prompts.py
  ```

- Run the main script to generate code samples:

  ```bash
  python main.py
  ```

- The generated code samples are saved in the `generated_outputs` directory.
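For the model-caching step above, a minimal pre-downloading sketch with `transformers` (the model name is a placeholder; `./models` is the cache directory mentioned at the end of this README):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model name; substitute whichever model main.py loads.
MODEL_NAME = "Salesforce/codegen-350M-mono"

# Downloading once with cache_dir="./models" pre-populates the local
# cache that later runs read from instead of re-downloading.
AutoTokenizer.from_pretrained(MODEL_NAME, cache_dir="./models")
AutoModelForCausalLM.from_pretrained(MODEL_NAME, cache_dir="./models")
```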
Key functions:

- `load_prompts() -> tuple`: Loads and returns the Python and Java prompts from the pre-generated JSONL files.
- `extract_python_code(code_string: str) -> str`: Extracts Python code from the given content, removing all comments and docstrings (one possible implementation is sketched after this list).
- `generate_prompts(file_path, language)`: Generates prompts from a JSONL file and stores them in a dictionary (see the second sketch below).
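The repository's implementation of `extract_python_code` is not reproduced here; one plausible approach is an `ast` round-trip, since parsing discards comments and docstrings can be dropped from each definition body before `ast.unparse` (Python 3.9+) re-emits the code:

```python
import ast

def strip_comments_and_docstrings(source: str) -> str:
    """Illustrative only: ast.parse drops comments; we then delete the
    leading string-constant statement (the docstring) from every
    module, class, and function body before unparsing."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.ClassDef,
                             ast.FunctionDef, ast.AsyncFunctionDef)):
            body = node.body
            if (body and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                body.pop(0)                  # drop the docstring
                if not body:
                    body.append(ast.Pass())  # keep the body valid
    return ast.unparse(tree)
```

Note that round-tripping through `ast` also normalizes formatting, which may or may not matter for downstream evaluation.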
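Likewise, a hypothetical sketch of what `generate_prompts` might do; the record fields (`_id`, `signature`, `docstring`) are assumptions, not the actual CoderEval JSONL schema:

```python
import json

def generate_prompts(file_path: str, language: str) -> dict:
    """Hypothetical reimplementation: one prompt string per record ID."""
    prompts = {}
    with open(file_path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            # Field names below are guesses about the JSONL schema.
            prompts[record["_id"]] = (
                f"Write a {language} function {record['signature']} "
                f"that does the following:\n{record['docstring']}"
            )
    return prompts
```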
The models and tokenizers are loaded and cached in the `./models` directory.
The project uses the `logging` module to log information and errors during execution; logs are printed to the console.
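A minimal sketch of the kind of console logging setup this implies (the project's exact format and level may differ):

```python
import logging

# Log to the console with timestamps and severity levels.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

logging.info("Loaded %d prompts", 42)
logging.error("Model generation failed: %s", "example error")
```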