# CodeGen2

Official research release for the CodeGen2 models (1B, 3B, 7B, 16B) for program synthesis, as presented at ICLR 2023:

*Title*: CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

*Authors*: Erik Nijkamp*, Hiroaki Hayashi*, Caiming Xiong, Silvio Savarese, and Yingbo Zhou (* indicates equal contribution)

## Hugging Face Integration

Model checkpoints are published on the Hugging Face Hub.

The model cards outline how to use the models for causal and infill sampling.

## Sampling

Program synthesis in the form of auto-regressive sampling can be performed as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load one of the released checkpoints (1B, 3B, 7B, 16B) from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-7B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-7B", trust_remote_code=True, torch_dtype=torch.float16, revision="main")

# Sample a completion for a natural-language prompt and truncate at common stop patterns.
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]))
```
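The model cards also cover infill sampling. Below is a minimal sketch of the idea, assuming the sentinel-token prompt format described in the model cards (`<mask_1>` marks the span to fill, and the model generates that span after `<sep>`, terminating with `<eom>`); the `format_infill` helper, prefix, and suffix are illustrative, so consult the model cards for the exact usage.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-7B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-7B", trust_remote_code=True, torch_dtype=torch.float16, revision="main")

def format_infill(prefix, suffix):
    # Mark the span to be filled with <mask_1>, then ask the model to
    # generate that span after <sep> (sentinel format per the model cards).
    return prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

# Hypothetical prefix/suffix pair around the missing span.
prefix = "def print_hello():\n    "
suffix = "\n    print(message)\n"

inputs = tokenizer(format_infill(prefix, suffix), return_tensors="pt")
sample = model.generate(**inputs, max_length=128)

# The generated infill follows the final <mask_1> and ends at <eom>.
print(tokenizer.decode(sample[0], skip_special_tokens=False))
```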

## Citation

```bibtex
@article{Nijkamp2023codegen2,
  title={CodeGen2: Lessons for Training LLMs on Programming and Natural Languages},
  author={Nijkamp, Erik and Hayashi, Hiroaki and Xiong, Caiming and Savarese, Silvio and Zhou, Yingbo},
  journal={arXiv preprint},
  year={2023}
}
```