
Environment Setup

To set up the environment, ensure you have the following versions installed:

  • CUDA: 12.1
  • Python: 3.10

To install the required Python dependencies, run the following command:

pip install -r requirements.txt
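
To verify that the installed PyTorch build matches the expected CUDA runtime, a quick check such as the one below can help. This is a minimal sketch; it assumes requirements.txt installs a CUDA-enabled PyTorch build.

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version the build was compiled against (expect 12.1)
print(torch.cuda.is_available())  # True if a GPU is visible to PyTorch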

Download Data

  • English Data
  • Chinese Data

Data Preprocessing

You need to preprocess the datasets into a unified format that is compatible with the training pipeline.

Required Format

Ensure that the data is structured in the same format as the provided examples:

  • data/epo_data_sample.json
  • data/sft_data_example.json

Additionally, you will need to modify the data/dataset_info.json file to match the specifics of your dataset configuration.
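
As a rough illustration, the sketch below converts a raw parallel GEC corpus (one source sentence and one corrected sentence per line) into an instruction-style JSON file and shows how such a file might be registered. The file paths, field names, and the dataset_info.json entry are assumptions for illustration only; the authoritative format is whatever data/sft_data_example.json, data/epo_data_sample.json, and the existing data/dataset_info.json entries use, so mirror those instead.

import json

# Hypothetical raw inputs: parallel files with one sentence per line.
SRC_FILE = "raw/train.src"         # ungrammatical source sentences (assumed path)
TGT_FILE = "raw/train.tgt"         # corrected target sentences (assumed path)
OUT_FILE = "data/my_gec_sft.json"  # output file in the unified format (assumed name)

records = []
with open(SRC_FILE, encoding="utf-8") as src, open(TGT_FILE, encoding="utf-8") as tgt:
    for source, target in zip(src, tgt):
        records.append({
            # Assumed instruction-tuning fields; copy the keys used in data/sft_data_example.json.
            "instruction": "Correct the grammatical errors in the following sentence.",
            "input": source.strip(),
            "output": target.strip(),
        })

with open(OUT_FILE, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# data/dataset_info.json then needs an entry pointing at the new file, for example
# (adjust the keys to match the existing entries in this repository):
#
# "my_gec_sft": {
#     "file_name": "my_gec_sft.json",
#     "columns": {"prompt": "instruction", "query": "input", "response": "output"}
# }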

Training Pipeline

SFT Stage 1:

bash bash/train_gec_sft_stage1.sh
bash bash/export_model.sh  # merge the LoRA weights

SFT Stage 2:

bash bash/train_gec_sft_stage2.sh

Sampling

bash bash/gec_pairwise_sampling.sh  # generate pairwise samples

EPO Training

bash bash/train_gec_epo.sh

Note: For Chinese GEC, you can find the corresponding scripts in the bash directory.

Evaluation

bash bash/gec_eval.sh  # for English GEC model
bash bash/cgec_eval.sh  # for Chinese GEC model

Acknowledgements

This project is built upon LLaMA-Factory and uses several external tools for evaluation.

We are grateful for their contributions.