Refactor RoFormer Model #3049
Conversation
The cases in the PR comments are quite practical; there should be an entry point so users can discover this use case. Could it be used for data augmentation?

That is exactly why this was added: the plan is to build a generation-based data augmentation strategy on top of it.
if "int" in convert_dtype(attention_mask.dtype): | ||
attention_mask = (1.0 - attention_mask) * -1e4 | ||
attention_mask = attention_mask.unsqueeze([1, 2]).expand( | ||
(-1, -1, attention_mask.shape[-1], -1)) |
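To see why the `-1e4` form works: the converted mask is added to the raw attention logits, so masked positions are pushed far below every real logit and receive (numerically) zero probability after softmax. A quick NumPy check of the same arithmetic:

```python
import numpy as np

# 0/1 padding mask as it arrives at the snippet above: 1 = token, 0 = pad.
mask = np.array([1, 1, 1, 0, 0], dtype=np.int64)
additive = (1.0 - mask) * -1e4          # [0, 0, 0, -1e4, -1e4]

# Adding the mask to uniform logits zeroes out the padded positions.
logits = np.zeros(5) + additive
probs = np.exp(logits) / np.exp(logits).sum()
print(probs.round(4))                   # [0.3333 0.3333 0.3333 0. 0.]
```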
So the override of `update_model_kwargs_for_generation` here mainly removes the `position_ids` update and changes how `attention_mask` is updated, right? Is the modified `attention_mask` update specific to RoFormer?
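For readers following along, here is a minimal standalone sketch of the kind of override being discussed. The method name comes from the comment above; the exact signature, the 4-D additive mask layout, and the concatenation step are assumptions, not the PR's actual implementation:

```python
import paddle

def update_model_kwargs_for_generation(outputs, model_kwargs):
    # position_ids are deliberately NOT advanced here: RoFormer injects
    # position information through rotary embeddings computed from the
    # sequence itself, so generation does not track a position_ids input.
    attention_mask = model_kwargs.get("attention_mask")
    if attention_mask is not None:
        # Assumed layout: an additive float mask of shape
        # [batch, 1, q_len, kv_len] (0.0 = visible, -1e4 = masked).
        # Append one all-visible column for the token just generated.
        batch, heads, q_len, _ = attention_mask.shape
        new_col = paddle.zeros([batch, heads, q_len, 1],
                               dtype=attention_mask.dtype)
        model_kwargs["attention_mask"] = paddle.concat(
            [attention_mask, new_col], axis=-1)
    return model_kwargs
```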
This can be merged first; later we can look at unifying the attention mask so that a 0/1 representation is used as the model input everywhere, especially in generation.
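A sketch of the unification being proposed (the helper below is hypothetical; the eventual refactor may look different): the public model API would always take a 0/1 mask, and the conversion to the additive form would happen at a single point inside the model, so generation code never has to special-case the mask dtype.

```python
import paddle

def prepare_additive_mask(attention_mask, dtype="float32"):
    # Hypothetical single conversion point: callers always pass a 0/1 mask
    # (1 = attend, 0 = masked); only the model internals see -1e4 values.
    mask = paddle.cast(attention_mask, dtype)
    mask = (1.0 - mask) * -1e4
    # Broadcast [batch, seq_len] -> [batch, 1, 1, seq_len] for the heads.
    return mask.unsqueeze([1, 2])
```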
if "int" in convert_dtype(attention_mask.dtype): | ||
attention_mask = (1.0 - attention_mask) * -1e4 | ||
attention_mask = attention_mask.unsqueeze([1, 2]).expand( | ||
(-1, -1, attention_mask.shape[-1], -1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可以先行合入,后面再看看attention mask如何统一调整为使用0/1表示来作为模型输入,尤其是生成这里
Also, following #3013, please update the RoFormer unit tests to cover the newly added `output_attentions` and `output_hidden_states` options.
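As a rough illustration (not the actual test from the PR), such a test could assert the lengths of the returned tuples; the checkpoint name, output field names, and keyword arguments below are assumptions patterned on the refactor:

```python
import unittest
import paddle
from paddlenlp.transformers import RoFormerModel

class RoFormerNewOutputsTest(unittest.TestCase):
    def test_output_attentions_and_hidden_states(self):
        # Checkpoint name is illustrative; any RoFormer weights would do.
        model = RoFormerModel.from_pretrained("roformer-chinese-small")
        model.eval()
        input_ids = paddle.randint(low=1, high=100, shape=[2, 8])
        with paddle.no_grad():
            outputs = model(input_ids,
                            output_attentions=True,
                            output_hidden_states=True,
                            return_dict=True)
        # One attention map per layer, plus one extra hidden state for the
        # embedding output at the front of hidden_states.
        self.assertGreater(len(outputs.attentions), 0)
        self.assertEqual(len(outputs.hidden_states),
                         len(outputs.attentions) + 1)

if __name__ == "__main__":
    unittest.main()
```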
PR types
Others
PR changes
Models
Description