Official research release for the CodeGen2.5 models for Program Synthesis.
Title: CodeGen2.5: Small, but mighty
Authors: Erik Nijkamp*, Hiroaki Hayashi*, Yingbo Zhou, Caiming Xiong (* equal contribution)
Model checkpoints are published on the Hugging Face Hub.
- CodeGen2.5-7B-multi (Apache-2.0)
- CodeGen2.5-7B-mono (Apache-2.0)
- CodeGen2.5-7B-instruct (Research purposes only)
The model cards outline how to use the models for causal and infill sampling; please refer to each model card for details.
The models are pre-trained on StarCoderData, a programming-language dataset developed by the BigCode project.
Requirements:
- transformers>=4.29.2
- tiktoken==0.4.0
Program synthesis in the form of auto-regressive sampling can be performed as follows:
from transformers import AutoTokenizer, AutoModelForCausalLM

# CodeGen2.5 ships a custom tiktoken-based tokenizer, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen25-7b-mono")

# Complete the function signature with up to 128 tokens of generated code.
inputs = tokenizer("def hello_world():", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
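
Infill sampling follows the sentinel-token format described in the model cards. The sketch below is a minimal illustration assuming the <mask_1>, <sep>, and <eom> conventions documented for CodeGen2; consult the CodeGen2.5 model cards for the exact format.

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen25-7b-mono")

# Mark the span to be filled with a sentinel token, then prompt the model to generate it.
# The sentinel names used here (<mask_1>, <|endoftext|>, <sep>, <eom>) are taken from the
# CodeGen2 model card and are assumed to carry over to CodeGen2.5.
prefix = "def hello_world():\n    "
suffix = "\n    return name"
prompt = prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)

# Strip the echoed prompt and keep only the infilled span (terminated by <eom>).
completion = tokenizer.decode(sample[0])[len(prompt):]
print(completion.split("<eom>")[0])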
Please cite the CodeGen2 paper:
@article{Nijkamp2023codegen2,
  title={CodeGen2: Lessons for Training LLMs on Programming and Natural Languages},
  author={Nijkamp, Erik and Hayashi, Hiroaki and Xiong, Caiming and Savarese, Silvio and Zhou, Yingbo},
  journal={ICLR},
  year={2023}
}