Paddle-based LSTM-CRF entity recognition model trains very poorly? #5459

Closed
utopiar opened this issue Nov 8, 2017 · 8 comments


utopiar commented Nov 8, 2017

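Has anyone compared the LSTM-CRF NER model implemented in Paddle with a TensorFlow implementation? I trained the sequence_tagging_for_ner demo from the Paddle models repo on the CoNLL-2003 dataset for 500 passes and got an F1 of only a little over 50% (precision above 95%, but recall very low, around 40%). The TensorFlow implementation reaches state-of-the-art results after a single pass (precision and recall both above 90%). Is there something wrong with the model as I implemented it in Paddle? Has anyone on the Paddle team reached the current best results on the NER task?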
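The reported figures are at least internally consistent: F1 is the harmonic mean of precision and recall, so precision near 0.95 with recall near 0.40 lands in the mid-50s. A minimal check, where `f1_score` is an illustrative helper and not part of Paddle or TensorFlow, and the inputs are the approximate values quoted above:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Approximate precision and recall quoted for the Paddle run.
paddle_f1 = f1_score(0.95, 0.40)
print(f"{paddle_f1:.3f}")  # 0.563
```

Because the harmonic mean is dominated by the smaller value, the low recall is what needs explaining here, not the high precision.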

typhoonzero added the User (used to tag user questions) label Nov 8, 2017
@guoshengCS
Contributor

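I ran this a while ago, but the chunk evaluator was not usable yet at the time, so I used the sum evaluator (token-level prediction accuracy). I dug up the old log: after 500 passes the token-level prediction error rate was 0.0008054178906604648.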

Contributor

guoshengCS commented Nov 8, 2017

Although 0.0008054178906604648 is measured per token, I think it would still be fairly low in chunk terms. Also, could you post your Paddle and TensorFlow implementations?
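Token-level error and chunk-level scores can diverge, because one wrong tag inside an entity invalidates the whole chunk. The sketch below illustrates this on made-up IOB2 tags; `extract_chunks`, `chunk_scores`, and `token_error` are hypothetical helpers written for this illustration, not the Paddle chunk evaluator or conlleval.

```python
def extract_chunks(tags):
    """Return the set of (start, end, type) spans in an IOB2 tag list (end exclusive)."""
    chunks = set()
    start, ctype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last chunk
        # Close the open chunk unless this tag continues it.
        if start is not None and not (tag.startswith("I-") and tag[2:] == ctype):
            chunks.add((start, i, ctype))
            start, ctype = None, None
        # Open a chunk on B-, or on an I- that has nothing to continue.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, ctype = i, tag[2:]
    return chunks


def chunk_scores(gold, pred):
    """Exact-match chunk precision and recall."""
    gold_chunks, pred_chunks = extract_chunks(gold), extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    return precision, recall


def token_error(gold, pred):
    """Fraction of positions where the predicted tag differs from the gold tag."""
    return sum(g != p for g, p in zip(gold, pred)) / len(gold)


# Illustrative data only: one wrong tag out of ten.
gold = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O", "O", "O"]
pred = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O", "O", "O"]

print(token_error(gold, pred))   # 0.1
print(chunk_scores(gold, pred))  # (0.5, 0.5)
```

A single mislabeled token costs 10% at the token level but halves both chunk precision and chunk recall, so the two metrics should not be compared directly.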

Contributor

lcy-seso commented Nov 8, 2017

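Let's look into this in more detail. What I can say for certain is that Paddle's NER model can reach state-of-the-art results; we have trained models that do. Some detail is probably off.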

Author

utopiar commented Nov 8, 2017

The TensorFlow version follows this:
https://github.com/guillaumegenthial/sequence_tagging

The Paddle version follows this:
https://github.com/PaddlePaddle/models/tree/develop/sequence_tagging_for_ner

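Both were run on the same public dataset, CoNLL-2003, with GloVe pretrained word embeddings. TensorFlow gives decent results after one training pass, whereas after 500 passes with Paddle I still have not reached state-of-the-art results (recall is very low). Even if it eventually gets there, I suspect there is a convergence problem. Both use the Adam optimizer.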

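I also ran experiments on a dataset from my own domain and saw the same pattern: the Paddle model has very low recall, while the TensorFlow model converges quickly and does well on the test set.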

I used the same chunk-based evaluation for both:
https://github.com/spyysalo/conlleval.py
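Scoring both frameworks with one script means dumping each model's predictions into the same column format first. The sketch below assumes the usual conlleval layout (one token per line, whitespace-separated columns, gold tag and predicted tag as the last two columns, a blank line between sentences). `to_conlleval_lines` is a hypothetical helper and the sample sentence is made up for illustration.

```python
def to_conlleval_lines(sentences):
    """Yield conlleval-style lines from (tokens, gold_tags, pred_tags) triples."""
    for tokens, gold, pred in sentences:
        if not (len(tokens) == len(gold) == len(pred)):
            raise ValueError("tokens, gold and pred must have equal length")
        for tok, g, p in zip(tokens, gold, pred):
            yield f"{tok} {g} {p}"
        yield ""  # blank line marks the sentence boundary


# Illustrative sentence: the model misses the LOC entity.
sample = [
    (["John", "lives", "in", "Paris"],
     ["B-PER", "O", "O", "B-LOC"],
     ["B-PER", "O", "O", "O"]),
]

print("\n".join(to_conlleval_lines(sample)))
```

Writing the output of both models through one such function removes tag-format differences as a possible cause of the gap.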

Contributor

lcy-seso commented Nov 8, 2017

I took a look, and the two models differ in their structural details. Let us run the CoNLL-2003 NER task ourselves and see.

Author

utopiar commented Nov 8, 2017

OK, thank you very much @lcy-seso. Please share the results once you have them.

Author

utopiar commented Nov 9, 2017

@lcy-seso Have you got results on the CoNLL-2003 NER task yet?

Contributor

guoshengCS commented Nov 29, 2017

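@utopiar The NER model in models has been adjusted since then and is still being tuned. You can try this version first: PaddlePaddle/models#504. It should give roughly the following results: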

2133:Test with Pass 37, {'ner_chunk.precision': 0.8020243644714355, 'ner_chunk.F1-score': 0.7956012487411499, 'ner_chunk.recall': 0.7892802357673645, 'error': 0.0543203130364418}
2189:Test with Pass 38, {'ner_chunk.precision': 0.8018958568572998, 'ner_chunk.F1-score': 0.7971611022949219, 'ner_chunk.recall': 0.7924818992614746, 'error': 0.053794633597135544}
2245:Test with Pass 39, {'ner_chunk.precision': 0.8152366280555725, 'ner_chunk.F1-score': 0.7887358665466309, 'ner_chunk.recall': 0.7639037370681763, 'error': 0.05622833967208862}
2301:Test with Pass 40, {'ner_chunk.precision': 0.8162758946418762, 'ner_chunk.F1-score': 0.7978658676147461, 'ner_chunk.recall': 0.7802680134773254, 'error': 0.05402826890349388}
2357:Test with Pass 41, {'ner_chunk.precision': 0.8108761310577393, 'ner_chunk.F1-score': 0.80320805311203, 'ner_chunk.recall': 0.7956836223602295, 'error': 0.05241229012608528}
2413:Test with Pass 42, {'ner_chunk.precision': 0.8113412261009216, 'ner_chunk.F1-score': 0.8051854968070984, 'ner_chunk.recall': 0.7991225123405457, 'error': 0.05167244374752045}
2469:Test with Pass 43, {'ner_chunk.precision': 0.8263999223709106, 'ner_chunk.F1-score': 0.8146722912788391, 'ner_chunk.recall': 0.8032728433609009, 'error': 0.050406914204359055}
2525:Test with Pass 44, {'ner_chunk.precision': 0.8281795382499695, 'ner_chunk.F1-score': 0.8073907494544983, 'ner_chunk.recall': 0.7876200675964355, 'error': 0.05147774517536163}
2581:Test with Pass 45, {'ner_chunk.precision': 0.8293296694755554, 'ner_chunk.F1-score': 0.804931640625, 'ner_chunk.recall': 0.7819281220436096, 'error': 0.051886606961488724}
2637:Test with Pass 46, {'ner_chunk.precision': 0.8324659466743469, 'ner_chunk.F1-score': 0.8141549825668335, 'ner_chunk.recall': 0.7966322898864746, 'error': 0.05009540170431137}
2693:Test with Pass 47, {'ner_chunk.precision': 0.8302139043807983, 'ner_chunk.F1-score': 0.8199988007545471, 'ner_chunk.recall': 0.8100320100784302, 'error': 0.04953078180551529}
2749:Test with Pass 48, {'ner_chunk.precision': 0.8399901390075684, 'ner_chunk.F1-score': 0.8249849081039429, 'ner_chunk.recall': 0.8105063438415527, 'error': 0.0472528338432312}
2805:Test with Pass 49, {'ner_chunk.precision': 0.8354787230491638, 'ner_chunk.F1-score': 0.8215792775154114, 'ner_chunk.recall': 0.8081347346305847, 'error': 0.04890775308012962}
2861:Test with Pass 50, {'ner_chunk.precision': 0.832402229309082, 'ner_chunk.F1-score': 0.8224635720252991, 'ner_chunk.recall': 0.8127593994140625, 'error': 0.04847941920161247}
2917:Test with Pass 51, {'ner_chunk.precision': 0.8398981690406799, 'ner_chunk.F1-score': 0.8307360410690308, 'ner_chunk.recall': 0.8217716217041016, 'error': 0.045928895473480225}
2973:Test with Pass 52, {'ner_chunk.precision': 0.8304197788238525, 'ner_chunk.F1-score': 0.8292364478111267, 'ner_chunk.recall': 0.8280564546585083, 'error': 0.04678555950522423}
3029:Test with Pass 53, {'ner_chunk.precision': 0.8262006640434265, 'ner_chunk.F1-score': 0.8251706957817078, 'ner_chunk.recall': 0.8241432309150696, 'error': 0.046629805117845535}
3085:Test with Pass 54, {'ner_chunk.precision': 0.835774302482605, 'ner_chunk.F1-score': 0.8306388854980469, 'ner_chunk.recall': 0.8255662322044373, 'error': 0.04686344042420387}
3141:Test with Pass 55, {'ner_chunk.precision': 0.8486769795417786, 'ner_chunk.F1-score': 0.8368402123451233, 'ner_chunk.recall': 0.825329065322876, 'error': 0.04425450786948204}
3197:Test with Pass 56, {'ner_chunk.precision': 0.8340579867362976, 'ner_chunk.F1-score': 0.8264225721359253, 'ner_chunk.recall': 0.818925678730011, 'error': 0.046902380883693695}
3253:Test with Pass 57, {'ner_chunk.precision': 0.8324041962623596, 'ner_chunk.F1-score': 0.8323054909706116, 'ner_chunk.recall': 0.8322067856788635, 'error': 0.0452863983809948}
3309:Test with Pass 58, {'ner_chunk.precision': 0.8400906324386597, 'ner_chunk.F1-score': 0.8377430438995361, 'ner_chunk.recall': 0.8354085087776184, 'error': 0.04370935633778572}
3365:Test with Pass 59, {'ner_chunk.precision': 0.8475913405418396, 'ner_chunk.F1-score': 0.8378313779830933, 'ner_chunk.recall': 0.8282936215400696, 'error': 0.04409874975681305}
3421:Test with Pass 60, {'ner_chunk.precision': 0.8471562266349792, 'ner_chunk.F1-score': 0.8358567357063293, 'ner_chunk.recall': 0.8248547315597534, 'error': 0.043884582817554474}
3477:Test with Pass 61, {'ner_chunk.precision': 0.8372315168380737, 'ner_chunk.F1-score': 0.8345922827720642, 'ner_chunk.recall': 0.8319696187973022, 'error': 0.0447801873087883}
3533:Test with Pass 62, {'ner_chunk.precision': 0.8519247174263, 'ner_chunk.F1-score': 0.8363921642303467, 'ner_chunk.recall': 0.8214158415794373, 'error': 0.044040340930223465}
3589:Test with Pass 63, {'ner_chunk.precision': 0.8643108606338501, 'ner_chunk.F1-score': 0.8389380574226379, 'ner_chunk.recall': 0.8150124549865723, 'error': 0.04386511445045471}
3645:Test with Pass 64, {'ner_chunk.precision': 0.8598718047142029, 'ner_chunk.F1-score': 0.8432948589324951, 'ner_chunk.recall': 0.8273449540138245, 'error': 0.04261905699968338}
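To pick the best pass out of a log like this, the per-pass dictionaries can be parsed directly, since each one is a valid Python literal. This is a minimal sketch assuming the line shape shown above; `parse_test_lines` and `best_pass` are hypothetical helpers, and the two sample lines are copied from the log.

```python
import ast
import re

# Matches "Test with Pass N, {...}" anywhere in the line, so a leading
# "2133:"-style prefix is ignored.
LINE_RE = re.compile(r"Test with Pass (\d+), (\{.*\})")


def parse_test_lines(lines):
    """Return a list of (pass_id, metrics_dict) for every matching line."""
    results = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            results.append((int(match.group(1)), ast.literal_eval(match.group(2))))
    return results


def best_pass(results, key="ner_chunk.F1-score"):
    """Return the (pass_id, metrics_dict) pair with the highest value for key."""
    return max(results, key=lambda item: item[1][key])


log_lines = [
    "3589:Test with Pass 63, {'ner_chunk.precision': 0.8643108606338501, 'ner_chunk.F1-score': 0.8389380574226379, 'ner_chunk.recall': 0.8150124549865723, 'error': 0.04386511445045471}",
    "3645:Test with Pass 64, {'ner_chunk.precision': 0.8598718047142029, 'ner_chunk.F1-score': 0.8432948589324951, 'ner_chunk.recall': 0.8273449540138245, 'error': 0.04261905699968338}",
]

pass_id, metrics = best_pass(parse_test_lines(log_lines))
print(pass_id, metrics["ner_chunk.F1-score"])
```

`ast.literal_eval` is used instead of `eval` so that only literals are accepted from the log text.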
