FastNLP v0.2 #108

Merged
merged 87 commits into from
Dec 7, 2018
FengZiYjun
Contributor

We are excited to announce the release of FastNLP version 0.2.0! 🎈 🎈

Please follow our latest tutorials for a detailed introduction.

Documentation and more tutorials are under construction.

Thanks to @yhcc, @choosewhatulike, and @xuyige for their magnificent contributions!

FengZiYjun and others added 30 commits November 29, 2018 23:27
* remove torchvision in requirements.txt
* refine interface of set_target & set_input
* rename DataSet.Instance into DataSet.DataSetIter
* remove unused methods in DataSet.DataSetIter
* remove __setattr__ in DataSet; it is dangerous.
* comment adjustment
* In init, detect whether the content type is Python int, float, or str.
* In append(), check type consistency.
* In init & append(), ints are cast to float if both types occur together.
* Map Python types to numpy dtypes.
* Raise an error if type detection fails.
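
The type-detection rules listed above can be sketched roughly as follows. This is only an illustration of the stated behaviour, not the code in this pull request: `detect_dtype` and `TypedField` are hypothetical names, and the dtype is returned as a string name that `numpy.dtype(...)` accepts so that the sketch needs no dependencies.

```python
# Names accepted by numpy.dtype(...); kept as strings so the sketch is dependency-free.
PY_TO_NUMPY = {int: "int64", float: "float64", str: "str"}


def detect_dtype(values):
    """Infer one numpy dtype name for a list of Python int, float, or str values."""
    kinds = {type(v) for v in values}
    if not kinds:
        raise ValueError("cannot detect a type from empty content")
    if kinds == {int, float}:
        kinds = {float}  # int and float together are promoted to float
    if len(kinds) != 1:
        names = sorted(k.__name__ for k in kinds)
        raise TypeError(f"inconsistent content types: {names}")
    (kind,) = kinds
    if kind not in PY_TO_NUMPY:
        raise TypeError(f"unsupported content type: {kind.__name__}")
    return PY_TO_NUMPY[kind]


class TypedField:
    """Hypothetical stand-in for a typed field container (not fastNLP's FieldArray)."""

    def __init__(self, content):
        self.content = list(content)
        self.dtype = detect_dtype(self.content)

    def append(self, value):
        # Re-run detection on old content plus the new value: a float appended
        # to an int field promotes it, while a str is rejected before storing.
        self.dtype = detect_dtype(self.content + [value])
        self.content.append(value)
```

Re-checking the whole content on every `append()` is linear in the field size; a real container would more likely compare only the new value's type against the stored one.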

Update Vocabulary:
* Words can be added incrementally, including after the vocabulary has been built.
* Lazy update: the vocabulary is rebuilt automatically, and only when it is used.
* Print a warning when newly added words push the vocabulary past its max size.
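
A minimal sketch of the lazy-rebuild idea described above, under several assumptions: `LazyVocab` is a hypothetical class rather than fastNLP's `Vocabulary`, the unknown token sits at index 0, and the size warning is raised at rebuild time rather than at the moment a word is added.

```python
import warnings
from collections import Counter


class LazyVocab:
    """Hypothetical lazily rebuilt vocabulary (not fastNLP's Vocabulary)."""

    def __init__(self, max_size=None, unknown="<unk>"):
        self.max_size = max_size
        self.unknown = unknown
        self.counts = Counter()
        self._word2idx = None  # None marks the index as stale

    def add(self, word):
        self.counts[word] += 1
        self._word2idx = None  # invalidate; the rebuild happens on next use

    def _build(self):
        # Most frequent words first; ties keep insertion order.
        words = [w for w, _ in self.counts.most_common()]
        if self.max_size is not None and len(words) > self.max_size:
            warnings.warn(f"vocabulary exceeds max_size={self.max_size}; rare words are dropped")
            words = words[: self.max_size]
        self._word2idx = {self.unknown: 0}
        for w in words:
            if w not in self._word2idx:
                self._word2idx[w] = len(self._word2idx)

    def __getitem__(self, word):
        if self._word2idx is None:
            self._build()
        return self._word2idx.get(word, self._word2idx[self.unknown])

    def __len__(self):
        if self._word2idx is None:
            self._build()
        return len(self._word2idx)
```

Any number of `add()` calls costs only a counter update; the index is rebuilt once, on the next lookup or `len()` call.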

Update embed_loader:
* add fast_load_embedding method, which indexes the pre-trained embeddings with the words in Vocab
* If a word in Vocab does not exist in the pre-trained embeddings, its vector is sampled from a normal distribution computed from the existing embeddings
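
The embedding-loading behaviour described above might look like the following sketch. It is not the `fast_load_embedding` implementation: `index_pretrained` is a hypothetical helper, plain lists stand in for arrays to avoid dependencies, and the mean and standard deviation are assumed to be taken over all components of the vectors that were found.

```python
import random
import statistics


def index_pretrained(vocab_words, pretrained, seed=0):
    """Build one vector per vocabulary word, in vocabulary order.

    vocab_words: list of words; position i is taken as the word's index (assumption).
    pretrained: dict mapping word -> list of floats, all of the same length.
    """
    rng = random.Random(seed)
    found = [pretrained[w] for w in vocab_words if w in pretrained]
    if not found:
        raise ValueError("no vocabulary word has a pre-trained vector")
    # Mean and standard deviation over every component of the vectors found.
    flat = [x for vec in found for x in vec]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat)
    dim = len(found[0])
    matrix = []
    for w in vocab_words:
        if w in pretrained:
            matrix.append(list(pretrained[w]))
        else:
            # Missing word: draw each component from N(mean, std).
            matrix.append([rng.gauss(mean, std) for _ in range(dim)])
    return matrix
```

Words that appear only in the pre-trained file are never copied, so the resulting matrix has exactly one row per vocabulary word.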
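* Add a Trainer parameter, metric_key, that names the metric used for model selection
* Add logic in Trainer to process the evaluation metrics returned by the tester and select the best model so far
optimizer.SGD(lr=xxx): if no parameters are passed in, the trainer adds them for the optimizer
Update Optimizer: multiple ways to initialize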
1. SGD()
2. SGD(0.01)
3. SGD(lr=0.01)
4. SGD(lr=0.01, momentum=0.9)
5. SGD(model.parameters(), lr=0.1, momentum=0.9)
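
One way to accept all five call forms above is to record the settings first and let the trainer supply parameters later. The sketch below is an assumption about the approach, not the code merged here: this `SGD` is a standalone hypothetical class, `attach` is an invented method name, and the default values are illustrative only.

```python
class SGD:
    """Hypothetical wrapper that records settings; parameters may arrive later."""

    def __init__(self, *args, **kwargs):
        self.model_params = None
        self.settings = {"lr": 0.01, "momentum": 0.0}  # illustrative defaults
        if len(args) > 2:
            raise TypeError("expected at most (parameters, lr) positionally")
        for arg in args:
            if isinstance(arg, (int, float)):
                self.settings["lr"] = float(arg)  # form 2: SGD(0.01)
            else:
                self.model_params = list(arg)  # form 5: SGD(params, ...)
        unknown = set(kwargs) - set(self.settings)
        if unknown:
            raise TypeError(f"unexpected settings: {sorted(unknown)}")
        self.settings.update(kwargs)  # forms 3 to 5: keyword settings

    def attach(self, params):
        """Called by the trainer; fills in parameters only if none were given."""
        if self.model_params is None:
            self.model_params = list(params)
        return self
```

With this shape, a trainer can call `optimizer.attach(model.parameters())` unconditionally: parameters passed explicitly at construction time are left untouched.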
* Add comments on initialization
* Extract the metric-checking logic from _better_eval_result into the _check_eval_results function
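
The metric_key model-selection logic mentioned in this group of commits could be sketched as below. The names echo `_check_eval_results` and `_better_eval_result` from the commit messages, but the bodies are guesses: the tester is assumed to return a flat dict of metric name to number, larger values are assumed to be better, and the first metric is used when no key is given.

```python
def check_eval_results(results, metric_key=None):
    """Return (name, value) of the metric used for model selection.

    results: dict of metric name -> number, as a tester might return (assumed shape).
    """
    if not isinstance(results, dict) or not results:
        raise TypeError("expected a non-empty dict of metric name -> value")
    if metric_key is None:
        metric_key = next(iter(results))  # fall back to the first metric
    if metric_key not in results:
        raise KeyError(f"metric_key {metric_key!r} not found in {sorted(results)}")
    return metric_key, results[metric_key]


class BestModelTracker:
    """Hypothetical helper that remembers the best value seen so far."""

    def __init__(self, metric_key=None):
        self.metric_key = metric_key
        self.best = None

    def better(self, results):
        # True means "this evaluation beat the previous best; save the model".
        _, value = check_eval_results(results, self.metric_key)
        if self.best is None or value > self.best:
            self.best = value
            return True
        return False
```

Keeping the validation in its own function means a misspelled metric_key fails loudly on the first evaluation instead of silently selecting on the wrong metric.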
yunfan and others added 23 commits December 4, 2018 15:54
* remove unused code in metrics.py
* add tests for DataSet
* add tests for FieldArray
* add tests for metrics.py
* fix predictor, add tests for predictor
* fix bucket sampler, add tests for bucket sampler
* clean up unused code
* add code comments
* merge *_saver.py & *_loader.py in io/
* (legacy code) rename Loss to LossFromTorch
* refine code style
* fix tests
* add a new tutorial
* add DataSet.get_field(), to fetch a FieldArray based on its name
* remove old tutorials & add new tutorials
* remove conflicts
* all tests passed
@codecov-io

Codecov Report

Merging #108 into master will increase coverage by 23.17%.
The diff coverage is 83.99%.

@@             Coverage Diff             @@
##           master     #108       +/-   ##
===========================================
+ Coverage   40.66%   63.83%   +23.17%     
===========================================
  Files          76       79        +3     
  Lines        4267     5262      +995     
===========================================
+ Hits         1735     3359     +1624     
+ Misses       2532     1903      -629
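
The figures in the diff above are consistent with coverage being hits divided by lines. The short check below reproduces them with integer arithmetic, assuming percentages are truncated to two decimals rather than rounded, which is what the reported 63.83% suggests; `coverage_bp` is a hypothetical helper name.

```python
def coverage_bp(hits, lines):
    """Coverage in basis points (hundredths of a percent), truncated."""
    return hits * 10000 // lines


before = coverage_bp(1735, 4267)  # master
after = coverage_bp(3359, 5262)   # after merging #108
print(f"{before / 100:.2f}% -> {after / 100:.2f}% (+{(after - before) / 100:.2f}%)")
# prints: 40.66% -> 63.83% (+23.17%)
```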
| Impacted Files | Coverage Δ | |
|---|---|---|
| setup.py | 0% <ø> (ø) | ⬆️ |
| fastNLP/api/model_zoo.py | 0% <0%> (ø) | ⬆️ |
| fastNLP/io/model_io.py | 0% <0%> (ø) | |
| test/core/test_instance.py | 100% <100%> (ø) | ⬆️ |
| fastNLP/modules/encoder/char_embedding.py | 100% <100%> (ø) | ⬆️ |
| fastNLP/core/instance.py | 92.3% <100%> (+7.69%) | ⬆️ |
| test/api/test_processor.py | 100% <100%> (ø) | |
| fastNLP/core/predictor.py | 95.65% <100%> (+95.65%) | ⬆️ |
| test/test_tutorial.py | 100% <100%> (ø) | |
| test/io/test_embed_loader.py | 100% <100%> (ø) | |

... and 53 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 15262bd...db0a789.

@xpqiu merged commit 1b477a9 into fastnlp:master on Dec 7, 2018
5 participants