Phenomena, Mechanism, and Mitigation
Annotation: The data annotated in our study is stored in the annotation folder.
Datasets: To begin, download the evaluation datasets from datasets and extract them into the /dataset
folder. In this experiment, we use the CoderEval dataset.
Repository: To begin, download the practical repositories from CoderEval and extract them into the /repos and /CoderEval/repos folders.
Models: In the mitigation experiment, we employ CodeGen, Pangu-α, ChatGPT, DeepSeekCoder, CodeLlama, and StarCoder2. We obtain the open-source models through HuggingFace and run the experiments locally; for the closed-source model ChatGPT, we run the experiments through the OpenAI API.
Experimental Results: The experimental results are stored in the /testing-CoderEval/model_name folder. The results in the file prediction_r0.jsonl are based on the Raw method, and the results in the file prediction_r1.jsonl are based on the RAG-based method.
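To take a quick look at a prediction file, here is a minimal sketch, assuming the files follow the standard JSON Lines format; the exact field names depend on the repository's output, and the codegen path below is only an example:

import json

# Example path: substitute the backbone you evaluated for "codegen".
path = "testing-CoderEval/codegen/prediction_r0.jsonl"

with open(path) as f:
    records = [json.loads(line) for line in f]

print(len(records), "predictions")
print("fields:", sorted(records[0].keys()))  # field names depend on the output format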
⚠ If you want to reproduce the results from scratch, please follow these steps:
Set-Up: Before starting the following process, set up your environment by installing the dependencies listed in the requirements.txt file. To install them, activate your Python virtual environment and run:
pip install -r requirements.txt
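If you have not created a virtual environment yet, one common way to do so (a sketch assuming Python 3 with the built-in venv module) is:

python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
pip install -r requirements.txt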
In the mitigation experiment section of our paper, we use two methods: the Raw method and the RAG-based method. If you want to verify and try this experiment, you can refer to the following inference example on CodeGen-mono-350M:
python eval_original.py \
--model=codegen \
--max_len=1024 \
--batch=4
For other backbone LLMs, you can refer to the structure of MODEL_FACTORY to customise the model you need to use. The structure is as follows:
MODEL_FACTORY = {
    # model key: (HuggingFace checkpoint, init function, max length)
    "codegen": ("Salesforce/codegen-350M-mono", init_codegen, 2048)
}
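To add another backbone, you could extend the dictionary with an entry of the same shape. The entry below is purely illustrative: the key, checkpoint name, max length, and the helper init_starcoder2 (which would follow the pattern of init_codegen) are assumptions, not part of the repository:

MODEL_FACTORY = {
    "codegen": ("Salesforce/codegen-350M-mono", init_codegen, 2048),
    # Hypothetical entry: (HuggingFace checkpoint, init function, max length)
    "starcoder2": ("bigcode/starcoder2-3b", init_starcoder2, 4096),
}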
For the settings of the function init_codegen, you can refer to the file model.py:
from transformers import AddedToken, AutoModelForCausalLM, AutoTokenizer


def init_codegen(
    model_name="Salesforce/codegen-350M-mono",
    checkpoint=None,
    additional_tokens=None,
    device="cuda"
):
    # Left-padding so that batched prompts align correctly for causal generation.
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, padding_side="left")
    tokenizer.pad_token_id = tokenizer.eos_token_id
    additional_tokens = [] if additional_tokens is None else additional_tokens
    if len(additional_tokens) > 0:
        tokenizer.add_tokens([AddedToken(t, rstrip=False, lstrip=False) for t in additional_tokens])
    if checkpoint is None:
        # Fresh pretrained weights; resize the embeddings to cover any added tokens.
        model = AutoModelForCausalLM.from_pretrained(model_name)
        model.resize_token_embeddings(len(tokenizer))
    else:
        # A fine-tuned checkpoint is assumed to already have matching embeddings.
        model = AutoModelForCausalLM.from_pretrained(checkpoint)
    model.to(device)
    return model, tokenizer
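As a quick sanity check of init_codegen, here is a minimal usage sketch; the prompt and generation settings below are illustrative, not part of the repository:

model, tokenizer = init_codegen(device="cuda")
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# pad_token_id was set to eos_token_id in init_codegen, so pass it explicitly.
outputs = model.generate(**inputs, max_new_tokens=64, pad_token_id=tokenizer.pad_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))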