Classification with BERT

data_preprocess.py extracts the queries from the datasets and saves them separately.
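
As a rough illustration of this extraction step, the sketch below pulls one field out of each record and writes the results to a JSON Lines file. The field name `query`, the helpers `extract_queries` and `save_queries`, and the output format are assumptions made for illustration; they are not taken from data_preprocess.py.

```python
import json
from pathlib import Path


def extract_queries(records, field="query"):
    """Return the non-empty query strings found in an iterable of dict records.

    The field name "query" is an assumption; adjust it to the dataset's schema.
    """
    queries = []
    for record in records:
        text = record.get(field)
        # Skip records with a missing, non-string, or blank query.
        if isinstance(text, str) and text.strip():
            queries.append(text.strip())
    return queries


def save_queries(queries, path):
    """Write one JSON object per line, each holding a single query."""
    with Path(path).open("w", encoding="utf-8") as f:
        for q in queries:
            f.write(json.dumps({"query": q}, ensure_ascii=False) + "\n")
```

Writing with `ensure_ascii=False` keeps non-ASCII queries readable in the saved file.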

generate_qa_one_answer.py runs inference on the dataset to obtain the answers.

By default, the same 10,000 queries are used for training and 1,000 queries for testing.
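
One way to obtain a fixed split of this size is to shuffle the queries with a fixed seed and then slice. This is only a sketch under assumptions: the function name `split_queries`, the seed value, and the choice to make the train and test sets disjoint are illustrative and may differ from what the project's scripts actually do.

```python
import random


def split_queries(queries, n_train=10000, n_test=1000, seed=0):
    """Shuffle with a fixed seed, then take disjoint train and test slices.

    The default sizes follow the README; the seed is an arbitrary assumption.
    """
    if len(queries) < n_train + n_test:
        raise ValueError("not enough queries for the requested split")
    shuffled = list(queries)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    return train, test
```

Reusing the same seed keeps the selected queries identical from one run to the next.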

The answers generated by the qwen model may contain formatting errors and similar problems; these are corrected in qwen_add.py.

To run the classification code: bash generate_and_BERT.sh

llama-3-8b-instruct.py, mistral……, and qwen2-7b…… are the scripts that run inference for each model individually.

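Note: some Python versions cannot import the mistral_inference package, so this project extracts the parts of that package it needs and includes them directly.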
