Add text semantic matching for taskflow #3003

Merged: 14 commits merged into PaddlePaddle:develop on Aug 19, 2022

Conversation

Contributor

@w5688414 w5688414 commented Aug 9, 2022

PR types

  • New features

PR changes

  • Models

Description

  • Add text semantic matching for Taskflow

@@ -1324,6 +1325,32 @@ from paddlenlp import Taskflow
* `output_scores`: whether to output decoding scores; defaults to False.
</div></details>

### Text semantic similarity
Contributor

Could this be merged with the existing text similarity task, so that a different backbone is selected via Taskflow("text_similarity", model="XXX")?

Contributor Author

Merged.

"models": {
"rocketqa-zh-dureader-cross-encoder": {
"task_class": SemanticMatchingTask,
"task_flag": 'semantic_matching-cross-encoder',
Contributor

I suggest aligning the model name in task_flag with the model name.

Contributor Author

Done.

"rocketqa-zh-dureader-cross-encoder": {
"task_class": TextSimilarityTask,
"task_flag": 'rocketqa-zh-dureader-cross-encoder',
},
},
"default": {
"model": "simbert-base-chinese"
Contributor

The default model here could be changed to rocketqa-zh-dureader-cross-encoder.

Contributor Author

Done.
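
The default-model change discussed above can be pictured with a minimal sketch. The registry layout mirrors the dictionary quoted in the diff, but `TASKS` and `resolve_model` are hypothetical names used only for illustration, not PaddleNLP's actual lookup code, and task classes are represented by plain strings so the sketch runs without PaddleNLP installed.

```python
# Hypothetical stand-in for the Taskflow task registry quoted in the diff.
# Task classes are plain strings here so the sketch runs without PaddleNLP.
TASKS = {
    "text_similarity": {
        "models": {
            "simbert-base-chinese": {"task_class": "TextSimilarityTask"},
            "rocketqa-zh-dureader-cross-encoder": {"task_class": "TextSimilarityTask"},
        },
        # Default model as requested in the review.
        "default": {"model": "rocketqa-zh-dureader-cross-encoder"},
    }
}


def resolve_model(task, model=None):
    """Return the explicitly requested model name, else the task's default."""
    entry = TASKS[task]
    name = model if model is not None else entry["default"]["model"]
    if name not in entry["models"]:
        raise ValueError(f"Unknown model {name!r} for task {task!r}")
    return name
```

Under this assumption, a call that omits `model` resolves to the registered default, while an explicit `model` argument still selects any other registered backbone.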

@@ -192,6 +192,10 @@
"task_class": TextSimilarityTask,
"task_flag": "text_similarity-simbert-base-chinese"
},
"rocketqa-zh-dureader-cross-encoder": {
"task_class": TextSimilarityTask,
"task_flag": 'rocketqa-zh-dureader-cross-encoder',
Contributor

'rocketqa-zh-dureader-cross-encoder' -> 'text_similarity-rocketqa-zh-dureader-cross-encoder'

Contributor Author

Done.
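
The renaming requested above suggests a `<task>-<model>` pattern for `task_flag`, which is an inference from the two entries quoted in the diff rather than a documented rule. A small check along these lines could catch entries that drift from it; `expected_task_flag` and `misnamed_flags` are hypothetical helpers, and the `models` dictionary below reproduces the state before the fix.

```python
def expected_task_flag(task, model):
    """Assumed convention: task name, a hyphen, then the model name."""
    return f"{task}-{model}"


def misnamed_flags(task, models):
    """List model names whose task_flag does not follow the assumed convention."""
    return [
        name
        for name, cfg in models.items()
        if cfg["task_flag"] != expected_task_flag(task, name)
    ]


# Entries as quoted in the diff, before the reviewer's suggestion was applied.
models = {
    "simbert-base-chinese": {
        "task_flag": "text_similarity-simbert-base-chinese",
    },
    "rocketqa-zh-dureader-cross-encoder": {
        "task_flag": "rocketqa-zh-dureader-cross-encoder",
    },
}

bad = misnamed_flags("text_similarity", models)
```

Here `bad` contains only the RocketQA entry, which is the one the reviewer flagged.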


#### 单条输入

```python
>>> from paddlenlp import Taskflow
>>> similarity = Taskflow("text_similarity")
>>> similarity = Taskflow("text_similarity",model="rocketqa-zh-dureader-cross-encoder")
```
Contributor

I suggest simplifying the default call by omitting the model parameter, i.e. similarity = Taskflow("text_similarity"), and then providing a model selection table along the lines of the UIE documentation.

Contributor Author

Done.

@tianxin1860 tianxin1860 left a comment

Comment on lines 68 to 72
if ('rocketqa' not in model):
    self._check_task_files()
    self._get_inference_model()
else:
    self._construct_model(model)

Why doesn't the RocketQA cross-encoder model support high-performance inference after dynamic-to-static conversion?

Contributor Author

Added.
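
The branch quoted in this thread picks a loading path by substring match on the model name. A minimal, runnable sketch of that control flow follows. The class `LoadPathSketch` and the recorded strings are hypothetical, and the method bodies are placeholders rather than PaddleNLP's implementations; only the method names and the condition come from the quoted lines. As the reviewer's question implies, the first branch is the one that leads to the converted static-graph inference model.

```python
class LoadPathSketch:
    """Stub that records which loading path the quoted branch takes."""

    def __init__(self, model):
        self.calls = []
        # Same condition as the quoted lines 68 to 72.
        if 'rocketqa' not in model:
            self._check_task_files()
            self._get_inference_model()
        else:
            self._construct_model(model)

    # Placeholder bodies: each one only records that it was called.
    def _check_task_files(self):
        self.calls.append("check_task_files")

    def _get_inference_model(self):
        self.calls.append("get_inference_model")

    def _construct_model(self, model):
        self.calls.append("construct_model")


simbert_path = LoadPathSketch("simbert-base-chinese").calls
rocketqa_path = LoadPathSketch("rocketqa-zh-dureader-cross-encoder").calls
```

With this condition, any model whose name contains `rocketqa` skips the inference-model path entirely, which is what prompted the question.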

@tianxin1860 tianxin1860 left a comment

LGTM

@w5688414 w5688414 merged commit 4bbf1b0 into PaddlePaddle:develop Aug 19, 2022
3 participants