FastNLP v0.2 #108

FengZiYjun · 2018-12-07T11:25:59Z

We are excited to announce the release of FastNLP 0.2.0! 🎈 🎈

Please follow our latest tutorials for a detailed introduction.

Documentation and more tutorials are under construction.

Thanks to @yhcc, @choosewhatulike, and @xuyige for their magnificent contributions!

* remove torchvision in requirements.txt

* refine interface of set_target & set_input
* rename DataSet.Instance into DataSet.DataSetIter
* remove unused methods in DataSet.DataSetIter
* remove __setattr__ in DataSet; It is dangerous.
* comment adjustment

* In init, detect content type to be Python int, float, or str.
* In append(), check type consistency.
* In init & append(), int will be cast into float if they occur together.
* Map Python type into numpy dtype.
* Raise error if type detection fails.
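The type-detection rules listed above can be sketched roughly as follows. This is a minimal illustration, not fastNLP's actual code: `detect_dtype` and `TypedField` are hypothetical names, and the choice of `np.int64`, `np.float64`, and `np.str_` as target dtypes is an assumption.

```python
import numpy as np


def detect_dtype(values):
    """Infer a numpy dtype for a list of Python int, float, or str values.

    Hypothetical helper: mixed int/float content is promoted to float,
    and any other mixture raises a TypeError.
    """
    kinds = {type(v) for v in values}
    if not kinds:
        raise TypeError("cannot detect the type of an empty field")
    if kinds == {int}:
        return np.int64
    if kinds <= {int, float}:
        return np.float64  # ints are cast to float when both occur
    if kinds == {str}:
        return np.str_
    raise TypeError(f"unsupported or inconsistent content types: {kinds}")


class TypedField:
    """Toy stand-in for a field that re-checks its type on append."""

    def __init__(self, content):
        self.content = list(content)
        self.dtype = detect_dtype(self.content)

    def append(self, value):
        # Re-detect with the new value so int + float promotes to float
        # and str + number is rejected before the content is modified.
        self.dtype = detect_dtype(self.content + [value])
        self.content.append(value)

    def to_array(self):
        return np.array(self.content, dtype=self.dtype)
```

Appending a float to an int-only field promotes the whole field to float, while appending a str to a numeric field raises before the content changes.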

Update Vocabulary:
* More words can be added after the vocabulary is built.
* Lazy update: the vocabulary is rebuilt automatically when it is next used.
* Print a warning when newly added words push the vocabulary past its maximum size.
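A rough sketch of the lazy-update idea described for the vocabulary, under stated assumptions: `LazyVocab`, its method names, and the `<unk>` token are hypothetical, and here the warning is raised at rebuild time, which may differ from where fastNLP emits it.

```python
import warnings
from collections import Counter


class LazyVocab:
    """Toy vocabulary that rebuilds its index only when it is used."""

    def __init__(self, max_size=None, unknown="<unk>"):
        self.max_size = max_size
        self.unknown = unknown
        self.counts = Counter()
        self.word2idx = None  # None means "needs a rebuild"

    def add(self, word):
        self.counts[word] += 1
        self.word2idx = None  # invalidate; do not rebuild yet

    def _build(self):
        words = [w for w, _ in self.counts.most_common()]
        if self.max_size is not None and len(words) > self.max_size:
            warnings.warn("vocabulary exceeds max_size; rare words are dropped")
            words = words[: self.max_size]
        self.word2idx = {self.unknown: 0}
        for w in words:
            self.word2idx[w] = len(self.word2idx)

    def to_index(self, word):
        if self.word2idx is None:  # lazy rebuild on first use
            self._build()
        return self.word2idx.get(word, self.word2idx[self.unknown])

    def __len__(self):
        if self.word2idx is None:
            self._build()
        return len(self.word2idx)
```

Adding a word only marks the index as stale; the cost of rebuilding is paid once, on the next lookup, no matter how many words were added in between.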

Update embed_loader:
* add fast_load_embedding method, to index pre-trained embeddings with the words in Vocab
* If words in Vocab do not exist in the pre-trained embeddings, sample them from a normal distribution computed from the current embeddings
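The behaviour described for fast_load_embedding might look roughly like the sketch below. It is not the library's implementation: `index_pretrained` is a hypothetical function, the pre-trained embeddings are assumed to be a plain dict of numpy vectors, and a single scalar mean and standard deviation over all found vectors is an assumption about how the normal distribution is computed.

```python
import numpy as np


def index_pretrained(vocab_words, pretrained, seed=0):
    """Build an embedding matrix whose rows follow the order of vocab_words.

    vocab_words: list of words; row i of the result belongs to vocab_words[i].
    pretrained:  dict mapping word -> 1-D numpy vector (all the same length).
    Words missing from `pretrained` are sampled from a normal distribution
    whose mean and standard deviation come from the vectors that were found.
    """
    found = [w for w in vocab_words if w in pretrained]
    if not found:
        raise ValueError("no vocabulary word appears in the pre-trained embeddings")
    known = np.stack([pretrained[w] for w in found])
    mean, std = known.mean(), known.std()

    # Start from random rows, then overwrite the rows that have a real vector.
    rng = np.random.default_rng(seed)
    matrix = rng.normal(mean, std, size=(len(vocab_words), known.shape[1]))
    for i, word in enumerate(vocab_words):
        if word in pretrained:
            matrix[i] = pretrained[word]
    return matrix
```

Sampling the missing rows from the statistics of the found vectors keeps them on the same scale as the pre-trained ones, rather than leaving them at zero.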

…HEAD

…check

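* Add the Trainer parameter metric_key, which names the metric used for model selection
* Add logic to Trainer that processes the evaluation metrics returned by the tester and selects the best model so far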
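Model selection by a named metric could be sketched as below. The helper names `is_better` and `select_best` are hypothetical, the evaluation results are assumed to be a flat dict, and larger values are assumed to be better; none of this is confirmed by the commit message.

```python
def is_better(eval_results, best_value, metric_key):
    """Return (improved, value) for the metric named by metric_key.

    eval_results is assumed to be a flat dict such as {"acc": 0.91, "f1": 0.88},
    and larger values are assumed to be better.
    """
    if metric_key not in eval_results:
        raise KeyError(f"metric_key {metric_key!r} not found in {sorted(eval_results)}")
    value = eval_results[metric_key]
    return (best_value is None or value > best_value), value


def select_best(history, metric_key):
    """Walk per-epoch evaluation results and return the best epoch and value."""
    best_epoch, best_value = None, None
    for epoch, results in enumerate(history):
        improved, value = is_better(results, best_value, metric_key)
        if improved:
            # A trainer would save the model checkpoint at this point.
            best_epoch, best_value = epoch, value
    return best_epoch, best_value
```

With several metrics reported per epoch, the key decides which one drives checkpointing, so the same history can yield different best epochs.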

# Conflicts:
# fastNLP/core/trainer.py

optimizer.SGD(lr=xxx): if no parameters are passed in, the Trainer adds the model parameters for it

…check

…is to concat all data before calculation.

Update Optimizer: multiple ways to initialize
1. SGD()
2. SGD(0.01)
3. SGD(lr=0.01)
4. SGD(lr=0.01, momentum=0.9)
5. SGD(model.parameters(), lr=0.1, momentum=0.9)
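One way to accept all five call forms is to inspect the first positional argument. The sketch below is only an illustration: `SGDConfig` and `attach` are hypothetical names, the default values are assumptions, and the real fastNLP wrapper delegates to PyTorch rather than storing settings in a dict.

```python
class SGDConfig:
    """Toy optimizer wrapper that accepts several call forms.

    The first positional argument may be a learning rate (a number) or an
    iterable of parameters; parameters may also be attached later.
    """

    def __init__(self, *args, lr=0.01, momentum=0.0):
        if len(args) > 1:
            raise TypeError("expected at most one positional argument")
        self.params = None
        self.settings = {"lr": lr, "momentum": momentum}
        if args:
            first = args[0]
            if isinstance(first, (int, float)):
                self.settings["lr"] = float(first)  # SGD(0.01)
            else:
                self.params = list(first)  # SGD(model.parameters(), ...)

    def attach(self, params):
        """Fill in parameters if none were given, as a trainer might do."""
        if self.params is None:
            self.params = list(params)
        return self
```

Deferring the parameters lets a user describe the optimizer before the model exists; the trainer can then supply the model's parameters without overriding ones given explicitly.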

* Add comments for initialization
* Extract the metric-checking logic from _better_eval_result into the _check_eval_results function

* remove unused code in metrics.py
* add tests for DataSet
* add tests for FieldArray
* add tests for metrics.py
* fix predictor, add tests for predictor
* fix bucket sampler, add tests for bucket sampler

…True

* clean up unused code

* add code comments
* merge *_saver.py & *_loader.py in io/
* (ancient code) rename Loss into LossFromTorch

* refine code style
* fix tests
* add a new tutorial

* add DataSet.get_field(), to fetch a FieldArray based on its name
* remove old tutorials & add new tutorials

* remove conflicts
* all tests passed

codecov-io · 2018-12-07T11:28:50Z

Codecov Report

Merging #108 into master will increase coverage by 23.17%.
The diff coverage is 83.99%.

@@             Coverage Diff             @@
##           master     #108       +/-   ##
===========================================
+ Coverage   40.66%   63.83%   +23.17%     
===========================================
  Files          76       79        +3     
  Lines        4267     5262      +995     
===========================================
+ Hits         1735     3359     +1624     
+ Misses       2532     1903      -629

Impacted Files	Coverage Δ
setup.py	`0% <ø> (ø)`	⬆️
fastNLP/api/model_zoo.py	`0% <0%> (ø)`	⬆️
fastNLP/io/model_io.py	`0% <0%> (ø)`
test/core/test_instance.py	`100% <100%> (ø)`	⬆️
fastNLP/modules/encoder/char_embedding.py	`100% <100%> (ø)`	⬆️
fastNLP/core/instance.py	`92.3% <100%> (+7.69%)`	⬆️
test/api/test_processor.py	`100% <100%> (ø)`
fastNLP/core/predictor.py	`95.65% <100%> (+95.65%)`	⬆️
test/test_tutorial.py	`100% <100%> (ø)`
test/io/test_embed_loader.py	`100% <100%> (ø)`
... and 53 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 15262bd...db0a789. Read the comment docs.

FengZiYjun and others added 30 commits November 29, 2018 23:27

* update README.md

117b12a

* remove torchvision in requirements.txt

* DataSet __getitem__ returns copy of Instance

da901ed

* refine interface of set_target & set_input
* rename DataSet.Instance into DataSet.DataSetIter
* remove unused methods in DataSet.DataSetIter
* remove __setattr__ in DataSet; It is dangerous.
* comment adjustment

add interface of Loss

07e227a

Trainer iteration

3d91f2f

Merge branch 'trainer' of github.com:FengZiYjun/fastNLP into trainer

fe0f99b

update LossBase class

d8a80ad

Merge branch 'trainer' of https://github.com/FengZiYjun/fastNLP into …

fc505b8

…HEAD

add metric

ad0a8c1

update LossBase class

37e282d

Merge branch 'trainer' of github.com:FengZiYjun/fastNLP into trainer

7c439e7

Merge branch 'trainer' of https://github.com/FengZiYjun/fastNLP into …

84eb50a

…check

add _method_function

2c8bd95

CheckError add function

0d4720b

Update Trainer:

e5e7f29

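* Add the Trainer parameter metric_key, which names the metric used for model selection
* Add logic to Trainer that processes the evaluation metrics returned by the tester and selects the best model so far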

Merge remote-tracking branch 'FengZiYjun/trainer' into trainer

08375d5

# Conflicts:
# fastNLP/core/trainer.py

Update Optimizer:

8a7077f

optimizer.SGD(lr=xxx): if no parameters are passed in, the Trainer adds the model parameters for it

update LossBase class

6d36190

Merge branch 'trainer' of https://github.com/FengZiYjun/fastNLP into …

ba7b176

…check

trainer and tester change check_code

3a4a729

LossInForward update

3daa889

conflict in trainer solved

1b961f1

change the calculation of metric to batch by batch. The older design …

f24fca1

…is to concat all data before calculation.

Implement AccuracyMetric in metrics, and change metric computation from one pass over all data to batch by batch

bd94dd2

The _prepare_metric function now checks for the evaluate and get_metric methods

84024aa
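The batch-by-batch design mentioned in these commits, with an evaluate step per batch and a final get_metric call, might be sketched as follows. `RunningAccuracy` and `check_metric` are hypothetical names, and the plain-list inputs and the `reset` flag are assumptions; the real AccuracyMetric works on tensors.

```python
class RunningAccuracy:
    """Toy accuracy metric that accumulates counts one batch at a time."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def evaluate(self, pred, target):
        """Update the running counts with one batch of labels."""
        if len(pred) != len(target):
            raise ValueError("pred and target must have the same length")
        self.correct += sum(p == t for p, t in zip(pred, target))
        self.total += len(target)

    def get_metric(self, reset=True):
        """Return the accuracy over all batches seen so far."""
        acc = self.correct / self.total if self.total else 0.0
        if reset:
            self.correct = self.total = 0
        return {"acc": acc}


def check_metric(metric):
    """Reject objects that lack the two methods the evaluation loop calls."""
    for name in ("evaluate", "get_metric"):
        if not callable(getattr(metric, name, None)):
            raise TypeError(f"{type(metric).__name__} must implement {name}()")
    return metric
```

Keeping only running counts means memory use stays constant in the number of batches, instead of growing with the concatenated predictions.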

fix bug in Trainer about metric_key

fb5215a

Update Optimizer: multiple ways to initialize
1. SGD()
2. SGD(0.01)
3. SGD(lr=0.01)
4. SGD(lr=0.01, momentum=0.9)
5. SGD(model.parameters(), lr=0.1, momentum=0.9)

Trainer Update:

d74901e

* Add comments for initialization
* Extract the metric-checking logic from _better_eval_result into the _check_eval_results function

yunfan and others added 23 commits December 4, 2018 15:54

fix bugs

5edd9de

FieldArray only check type when is_input or is_target is set.

27833d0

test loss

62c63f1

conflict solved

79ae387

fix bugs in vocab

52b1b18

。

87e5d44

Merge branch 'trainer' of github.com:FengZiYjun/fastNLP into trainer

7c261fa

* Update the tutorials, now placed in ./tutorial

f26f116

* remove unused code in metrics.py
* add tests for DataSet
* add tests for FieldArray
* add tests for metrics.py
* fix predictor, add tests for predictor
* fix bucket sampler, add tests for bucket sampler

Change the places in losses that use F.cross_entropy directly, because the signature of these functions is (input, target)

4dff3ec

fix FieldArray bug: do type check only when is_target or is_input is …

5855adb

…True

conflict fix

e779409

Merge branch 'trainer' of github.com:FengZiYjun/fastNLP into trainer

f7c29b8

1. Improve the error messages from the Trainer's check-code step

1158556

1. Rename losser to loss in Trainer

aea9318

* fix tests

6129a31

* clean up unused code

bug fix in LossInForward

cd83866

Merge branch 'trainer' of github.com:FengZiYjun/fastNLP into trainer

306eee9

* fix processor.py

27e9453

* add code comments
* merge *_saver.py & *_loader.py in io/
* (ancient code) rename Loss into LossFromTorch

Adjust the parameter order in optimizer initialization

72877c6

* remove unused code in losses.py & metrics.py

447746d

* refine code style
* fix tests
* add a new tutorial

* rename DataSet.get_fields() into get_all_fields()

720a264

* add DataSet.get_field(), to fetch a FieldArray based on its name
* remove old tutorials & add new tutorials

add dataloader register

267baec

* final clean up

db0a789

* remove conflicts
* all tests passed

FengZiYjun requested review from xpqiu, xuyige, choosewhatulike and yhcc December 7, 2018 11:25

xpqiu approved these changes Dec 7, 2018

xpqiu merged commit 1b477a9 into fastnlp:master Dec 7, 2018
