Update neural search readme and Add Paddle Serving Support #1558
Conversation
Please confirm that the submitted code was auto-formatted with yapf.
@@ -447,6 +451,58 @@ sh deploy.sh
[0.959269642829895, 0.04725276678800583]
``` | |||

### Paddle Serving部署
首先把PaddleInference转换成Serving的格式:

The phrase "把PaddleInference转换" (convert Paddle Inference) is inaccurate. It should say that the static graph model is converted to the Serving format.

Fixed.

@@ -0,0 +1,32 @@
#worker_num, 最大并发数。当build_dag_each_worker=True时, 框架会创建worker_num个进程,每个进程内构建grpcSever和DAG

Are these Chinese comments ones we added ourselves, or do they ship with Paddle Serving?

The Chinese comments come from the Serving examples; I only changed a few of the parameters.

rpc_port: 9998
op:
    bert:
        #并发数,is_thread_op=True时,为线程并发;否则为进程并发

The # that starts a comment should be followed by a space.

Fixed.

def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
    new_dict = {}
    new_dict["elementwise_div_1"] = str(fetch_dict["elementwise_div_1"].tolist())

Having to work with odd variable names like these really does make for a poor developer experience.

Basically, unless we document this, developers have no way of figuring it out on their own.

Fixed. The name has to be changed in the exported Serving files, or specified when exporting to the Serving format.

): fn(samples)
input_ids, segment_ids = batchify_fn(examples)
feed_dict = {}
feed_dict['input_ids']=input_ids

Was your code actually run through yapf automatically before you submitted it? The formatting is not right.

Corrected; the code is now formatted.
Leave some comment

|—— deploy
|—— python
|—— predict.py # PaddleInference
|—— deploy.sh # Paddle Inference部署脚本
|—— inference.py # 动态图抽取向量
|—— export_to_serving.py # 静态图转Serving

Let's discuss the overall directory structure again. Would it be semantically clearer to put the serving-related code under the deploy directory?

Adjusted.

|—— export_to_serving.py # 静态图转Serving
|—— rpc_client.py # Paddle Serving的Client端
|—— web_service.py # Paddle Serving的 Serving端

Please make the spacing between Chinese and English characters consistent.

Adjusted.

--params_filename "inference.get_pooled_embedding.pdiparams" \
--server_path "./serving_server" \
--client_path "./serving_client" \
--fetch_alias_names "output_embed"

Please add an explanation of what each export_to_serving.py parameter means here, so that users can understand it more easily.

Fixed.

启动客户端调用 Server。
首先修改需要预测的样本,并把它放入到 feed 字典中:

What does "修改需要预测的样本" (modify the samples to be predicted) mean? What is being modified?

Fixed.

parser.add_argument("--model_filename", type=str, required=True,
    default='inference.get_pooled_embedding.pdmodel', help="The path to model parameters to be loaded.")
parser.add_argument("--params_filename", type=str, required=True,
    default='inference.get_pooled_embedding.pdiparams', help="The path to model parameters to be loaded.")

The help descriptions for these two parameters are wrong, aren't they?

Fixed.
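
For reference, one way the corrected arguments could look. This is only a sketch: the help strings below are assumed wording, not the text that was merged, and the sketch also drops `required=True`, since a required argument never falls back to its default.

```python
import argparse


def build_parser():
    """Build the two filename arguments with distinct help texts."""
    parser = argparse.ArgumentParser()
    # Assumed wording: describes the model (graph) file, not the parameters.
    parser.add_argument(
        "--model_filename",
        type=str,
        default="inference.get_pooled_embedding.pdmodel",
        help="File name of the exported static graph model.")
    # Assumed wording: describes the parameter (weights) file.
    parser.add_argument(
        "--params_filename",
        type=str,
        default="inference.get_pooled_embedding.pdiparams",
        help="File name of the exported model parameters.")
    return parser


# With no command-line overrides, both defaults apply.
args = build_parser().parse_args([])
print(args.model_filename, args.params_filename)
```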

feed["0"] = "国有企业引入非国有资本对创新绩效的影响——基于制造业国有上市公司的经验证据"
feed["1"] = "试论翻译过程中的文化差异与语言空缺翻译过程,文化差异,语言空缺,文化对比"
print(feed)
ret = client.predict(feed_dict=feed, fetch=["res"])

Why does the data sent by the client have to be a dict rather than a List[String]?

The code has been adjusted.
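
A minimal sketch of how a caller could keep working with a plain list of strings while the client still sends a dict. It mirrors the `feed[str(i)] = item` pattern that appears in a later revision of the diff. `build_feed` is a hypothetical helper name, and whether the pipeline client strictly requires a dict is not settled in this thread.

```python
def build_feed(texts):
    """Map each input string to a string index key: {"0": ..., "1": ...}."""
    return {str(i): text for i, text in enumerate(texts)}


# The caller passes a list; the dict is built just before the request is sent.
feed = build_feed(["first query", "second query"])
print(feed)
```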

    return result
class BertOp(Op):

The class name BertOp probably needs to be changed.

Fixed.
Leave some comments

    feed[str(i)] = item
print(feed)
ret = client.predict(feed_dict=feed, fetch=["res"])

Where is the name res defined?

Removed. I tested it, and the call works without it.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_server.web_service import WebService, Op

Third-party imports should come after the standard-library imports, separated by a blank line.

Fixed.

self.tokenizer = ppnlp.transformers.ErnieTokenizer.from_pretrained(
    'ernie-1.0')
def preprocess(self, input_dicts, data_id, log_id):

data_id and log_id appear to be unused?

preprocess takes input_dicts, the data from the preceding Channel. This variable (one sample) is a dict whose keys are the names of the preceding OPs and whose values are the outputs of those OPs.
process takes fetch_dict_list, the input variable of the Paddle Serving Client prediction interface (a list of the return values of preprocess). This variable (one batch) is a list whose elements are dicts with feed_name as the key and the corresponding data in ndarray format as the value. typical_logid is the logid passed through to PaddleServingService.
postprocess takes input_dicts and fetch_dict. input_dicts is the same as the preprocess parameter, and fetch_dict (one sample) is one sample from the batch returned by process (if process was not executed, it is the return value of preprocess).
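
The data flow described above can be sketched in plain Python. This is a stand-in, not the paddle_serving_server API: `ToyOp` and `run_once` are hypothetical names, the hook signatures are copied from the diff, the return conventions are simplified, and lists stand in for ndarrays.

```python
class ToyOp:
    """Plain-Python stand-in for an Op with the three hooks."""

    def preprocess(self, input_dicts, data_id, log_id):
        # input_dicts: {predecessor_op_name: that op's output} for one sample.
        request = next(iter(input_dicts.values()))
        # Return one feed dict keyed by feed name (text lengths as fake ids).
        return {"input_ids": [len(text) for text in request.values()]}

    def process(self, feed_batch, typical_logid):
        # feed_batch: list of feed dicts (one batch); returns a list of fetch dicts.
        return [{"output_embed": [float(x) for x in feed["input_ids"]]}
                for feed in feed_batch]

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        # fetch_dict: one sample taken from the batch that process returned.
        return {"output_embed": str(fetch_dict["output_embed"])}


def run_once(op, input_dicts, data_id=0, log_id=0):
    """Chain the hooks for a single sample: preprocess -> process -> postprocess."""
    feed = op.preprocess(input_dicts, data_id, log_id)
    fetch = op.process([feed], typical_logid=log_id)[0]
    return op.postprocess(input_dicts, fetch, data_id, log_id)


result = run_once(ToyOp(), {"read": {"0": "hello", "1": "hi"}})
print(result)
```

In this sketch data_id and log_id are only passed through, which matches the observation that the hooks in the diff do not use them.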

def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
    new_dict = {}
    new_dict["output_embed"] = str(fetch_dict["output_embed"].tolist())

Where is the variable name output_embed defined?

It is specified as an input to export_to_serving.py, or you can keep the default. See the README for details.
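
To illustrate the renaming: the raw fetch variable in the earlier revision was `elementwise_div_1`, and the alias passed via `--fetch_alias_names` is `output_embed`. The sketch below shows the idea of mapping aliases onto the original names and falling back to the originals when no alias is given. `resolve_fetch_aliases` is a hypothetical helper, and the comma-separated format is an assumption, not the documented behavior of export_to_serving.py.

```python
def resolve_fetch_aliases(fetch_var_names, fetch_alias_names=""):
    """Return {alias: original_name}; with no aliases, names map to themselves."""
    if not fetch_alias_names:
        return {name: name for name in fetch_var_names}
    # Assumed format: one alias per fetch variable, separated by commas.
    aliases = [alias.strip() for alias in fetch_alias_names.split(",")]
    if len(aliases) != len(fetch_var_names):
        raise ValueError("expected one alias per fetch variable")
    return dict(zip(aliases, fetch_var_names))


aliased = resolve_fetch_aliases(["elementwise_div_1"], "output_embed")
default = resolve_fetch_aliases(["elementwise_div_1"])
print(aliased, default)
```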

    'ernie-1.0')
def preprocess(self, input_dicts, data_id, log_id):
    from paddlenlp.data import Stack, Tuple, Pad

Why is the import placed here?

Multi-process testing ran into problems. This code runs in a child process, some paddlenlp operations call Paddle, and Paddle has to be disabled in the child process. Importing inside the function avoids the multi-process problem.

class ErnieService(WebService):
    def get_pipeline_response(self, read_op):

Is this an interface function that paddle_serving requires every service class to provide?

WebService is the base class. It provides the preprocess interface, which converts the HTTP request received from the user into RPC input, and the postprocess interface, which post-processes the result returned by the RPC request. Subclasses of WebService can define member functions of any kind. WebService is started with the same API as an ordinary RPC service; you only need to override preprocess and postprocess to implement the processing before and after model prediction.
LGTM
LGTM
PR types
New features and Bug fixes
PR changes
Docs
Description
Add Paddle Serving Support
Update readme