[Bug]: Training with neural_search/recall/in_batch_negative fails with TypeError: __init__() got an unexpected keyword argument 'enable_recompute' #9026

Open
1 task done
liuzhipengchd opened this issue Aug 28, 2024 · 5 comments
Labels: bug (Something isn't working), stale

liuzhipengchd commented Aug 28, 2024

Software environment

- paddlepaddle:
- paddlepaddle-gpu: tried both 2.6.1 and 3.0
- paddlenlp: tried both 2.8 and 2.9

Currently installed versions
paddle-bfloat                     0.1.7
paddle2onnx                       1.2.7
paddlefsl                         1.1.0
paddlenlp                         2.8.1
paddleocr                         2.8.0
paddlepaddle-gpu                  2.6.1.post11

Duplicate issues

  • I have searched the existing issues

Error description

File "train_batch_neg.py", line 348, in do_train
    pretrained_model = AutoModel.from_pretrained(args.model_name_or_path, enable_recompute=args.use_recompute)
  File "/root/wxp/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 456, in from_pretrained
    return cls._from_pretrained(pretrained_model_name_or_path, task, *model_args, **kwargs)
  File "/root/wxp/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 320, in _from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
  File "/root/wxp/PaddleNLP/paddlenlp/transformers/model_utils.py", line 2306, in from_pretrained
    model = cls(config, *init_args, **model_kwargs)
  File "/root/wxp/PaddleNLP/paddlenlp/transformers/utils.py", line 280, in __impl__
    init_func(self, *args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'enable_recompute'
I0828 13:48:53.111547 13264 process_group_nccl.cc:132] ProcessGroupNCCL destruct 
I0828 13:48:53.161453 13362 tcp_store.cc:289] receive shutdown event and so quit from MasterDaemon run loop
LAUNCH INFO 2024-08-28 13:48:54,184 Exit code 1

Steps to reproduce & code

'''Command executed
python3 -u -m paddle.distributed.launch --gpus "1,3" \
    train_batch_neg.py \
    --device gpu \
    --save_dir ./checkpoints_medicine/ \
    --batch_size 64 \
    --learning_rate 5E-5 \
    --epochs 3 \
    --output_emb_size 1024 \
    --model_name_or_path ernie-3.0-base-zh \
    --save_steps 10 \
    --max_seq_length 64 \
    --margin 0.2 \
    --train_set_file /root/train_data/medicine/train_supervised.csv \
    --recall_result_dir "recall_result_dir" \
    --recall_result_file "recall_result.txt" \
    --hnsw_m 100 \
    --hnsw_ef 100 \
    --recall_num 50 \
    --similar_text_pair_file "/root/train_data/search/supervised/dev.csv" \
    --corpus_file "/root/train_data/search/supervised/corpus.csv"
'''

@liuzhipengchd liuzhipengchd added the bug Something isn't working label Aug 28, 2024
@liuzhipengchd
Author

self.use_task_id = use_task_id

self.enable_recompute = enable_recompute needs to be defined here, with enable_recompute defaulting to False.
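
The failure and the proposed change can be reproduced without PaddleNLP. The following is a minimal sketch using hypothetical stand-in classes (`EncoderWithoutFlag`, `EncoderWithFlag`, and the helper `build` are illustrative names, not PaddleNLP code). It only mirrors the pattern visible in the traceback, where `from_pretrained` forwards extra keyword arguments to the model's `__init__`.

```python
class EncoderWithoutFlag:
    """Hypothetical model whose __init__ does not accept enable_recompute."""

    def __init__(self, config, use_task_id=False):
        self.config = config
        self.use_task_id = use_task_id


class EncoderWithFlag:
    """Same hypothetical model with the proposed change: accept and store the flag."""

    def __init__(self, config, use_task_id=False, enable_recompute=False):
        self.config = config
        self.use_task_id = use_task_id
        self.enable_recompute = enable_recompute


def build(model_class, config, **model_kwargs):
    # Mirrors `model = cls(config, *init_args, **model_kwargs)` from the traceback.
    return model_class(config, **model_kwargs)


# Passing an unknown keyword raises the same kind of TypeError as in the report.
try:
    build(EncoderWithoutFlag, {}, enable_recompute=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With the parameter declared, the call succeeds and the default stays False.
fixed = build(EncoderWithFlag, {}, enable_recompute=True)
default = build(EncoderWithFlag, {})
```

Whether the real model can actually honor the flag is a separate question; this sketch only shows why the keyword is rejected and how declaring it with a default avoids the error.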

@wawltor
Collaborator

wawltor commented Aug 28, 2024

ae02a3c We fixed this issue in this commit. Because some models cannot use the recompute strategy, we disabled it.

@liuzhipengchd
Author

liuzhipengchd commented Aug 29, 2024

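Hello, I have another question. When using ranking/cross_encoder, this single-tower model is overly sensitive to the order of the two texts: for the same text pair, swapping the order changes the computed score considerably. Is there a way to address this? (Would a dual-tower model work?)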

@wawltor
Collaborator

wawltor commented Sep 5, 2024

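This comes from the nature of the model itself: during training the two inputs are treated as the query and the document respectively, so the learned parameters are specialized to each position. You could try a dual-tower model such as SimCSE.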
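
To illustrate why a dual-tower setup is insensitive to input order: each text is encoded independently and the score is a symmetric similarity such as cosine. The sketch below uses a toy `encode` function (character counts) purely as a hypothetical stand-in for a real sentence encoder, and `order_sensitive_score` is an invented stand-in for a single-tower scorer. The `symmetrized` wrapper shows one possible alternative, averaging both orders; that is an assumption for illustration and was not suggested in this thread.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a sentence encoder: character counts as a sparse vector."""
    return Counter(text)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dual_tower_score(a, b):
    # Each text is encoded on its own, so swapping a and b cannot change the score.
    return cosine(encode(a), encode(b))


def order_sensitive_score(first, second):
    """Invented asymmetric scorer: share of `first`'s characters found in `second`."""
    first_chars, second_chars = set(first), set(second)
    return len(first_chars & second_chars) / len(first_chars) if first_chars else 0.0


def symmetrized(score_fn):
    """Wrap a scorer so that both input orders are averaged."""
    def wrapped(a, b):
        return 0.5 * (score_fn(a, b) + score_fn(b, a))
    return wrapped


query, document = "abc", "abcdef"
forward = order_sensitive_score(query, document)
backward = order_sensitive_score(document, query)
averaged = symmetrized(order_sensitive_score)
```

Averaging both orders doubles the inference cost of a single-tower model, whereas a dual-tower model gets order independence from its architecture.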

github-actions bot commented Nov 5, 2024

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale label Nov 5, 2024
4 participants
@wawltor @liuzhipengchd @KB-Ding and others