Skip to content

Remove processors in forte-wrappers from Forte #454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 25, 2021
Merged

Conversation

jzpang
Copy link
Collaborator

@jzpang jzpang commented May 20, 2021

This PR partially fixes #453.

Description of changes

Removed some third party processors to forte-wrapper repo, and corresponding test cases. Imports were also changed.

Possible influences of this PR.

Describe what are the possible side-effects of the code change.

Test Conducted

Describe what test cases are included for the PR.

@jzpang jzpang requested a review from hunterhector May 20, 2021 23:39
@jzpang jzpang self-assigned this May 20, 2021
@jzpang
Copy link
Collaborator Author

jzpang commented May 20, 2021

1>
NLTK prcessor was not removed in current version, the path is:
forte/processors/third_party/nltk_processors.py

Due to the effect on the following files:
tests/forte/processors/base/data_augment_replacement_processor_test.py
tests/forte/processors/data_augment/algorithms/eda_processors_test.py
tests/forte/processors/nltk_processors_test.py
tests/forte/processors/pretrained_encoder_processors_test.py
tests/forte/processors/writer_test.py

Also need to change the following import in future:
forte/processors/third_party/init.py
examples/data_augmentation/data_select/data_select_and_augment_example.py
examples/biobert_ner/ner_predict_example.py
examples/chatbot/chatbot_example.py
examples/pipelines/process_dataset_example.py
examples/pipelines/process_string_example.py
examples/serialization/serialize_example.py

2>
ElasticSearchIndexProcessor was not removed in current version, the path is:
forte/processors/ir/elastic_search_index_processor.py

Due to the effect on the following files:
forte/processors/data_augment/selector_index_processor.py

3>
TextGenerationProcessor was not removed in current version, the path is:
forte/processors/third_party/text_generation_processor.py

Due to the effect on the following files(will cause test case failure):
forte/tests/examples/gpt2_test.py

Also need to change the following files in future:
forte/examples/gpt2/multipack_pipeline_gpt2.py
forte/examples/gpt2/sample_multipack_pipeline_gpt.yml

@codecov
Copy link

codecov bot commented May 20, 2021

Codecov Report

Merging #454 (58c7ca5) into master (9278330) will decrease coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #454      +/-   ##
==========================================
- Coverage   82.77%   82.73%   -0.04%     
==========================================
  Files         214      204      -10     
  Lines       14693    14128     -565     
==========================================
- Hits        12162    11689     -473     
+ Misses       2531     2439      -92     
Impacted Files Coverage Δ
forte/processors/ir/__init__.py 100.00% <100.00%> (ø)
forte/processors/third_party/__init__.py 100.00% <100.00%> (ø)
forte/utils/utils_processor.py 95.45% <100.00%> (+31.45%) ⬆️
tests/forte/pipeline_test.py 97.99% <100.00%> (+0.03%) ⬆️
forte/data/base_pack.py 79.14% <0.00%> (-0.48%) ⬇️
forte/data/data_pack.py 79.71% <0.00%> (-0.42%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9278330...58c7ca5. Read the comment docs.

Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for removing the code and also update the examples.

One thing is this causes two test coverage decrease:

  1. https://app.codecov.io/gh/asyml/forte/compare/454/changes#D1L327
    Here we probably have a test case the throws a ProcessorConfigError. Can we add a small test case back to cover this part?

  2. https://app.codecov.io/gh/asyml/forte/compare/454/changes#D4L46
    This part is because we shouldn't have used any allennlp results in the main repo now. So we should check whether we can remove these code. There are two functions about allennlp in this file, can you check if we can remove both of them?

Btw, thanks for listing out all the processors that we plan to remove, can you open some issues to track them? For each of them we can decide a strategy to remove them, and each of them can be a separate issue.

@jzpang
Copy link
Collaborator Author

jzpang commented May 22, 2021

Thanks for removing the code and also update the examples.

One thing is this causes two test coverage decrease:

  1. https://app.codecov.io/gh/asyml/forte/compare/454/changes#D1L327
    Here we probably have a test case the throws a ProcessorConfigError. Can we add a small test case back to cover this part?

I have checked the previous test cases, these lines were covered by Allennlp processor. It has a config check in initialize function, for example, https://github.com/asyml/forte/blob/master/forte/processors/third_party/allennlp_processors.py#L53.

I checked several processors that have configs in current forte repo, and they don't have similar checks. So even if we wrote similar test cases, the same error could not be raised. So, if we want to cover these lines, we need to add similar checks in the current forte processors, which means to change the current forte processors. Do we want to do this at the current step?

  1. https://app.codecov.io/gh/asyml/forte/compare/454/changes#D4L46
    This part is because we shouldn't have used any allennlp results in the main repo now. So we should check whether we can remove these code. There are two functions about allennlp in this file, can you check if we can remove both of them?

Yes I will removed them.

Btw, thanks for listing out all the processors that we plan to remove, can you open some issues to track them? For each of them we can decide a strategy to remove them, and each of them can be a separate issue.

@hunterhector
Copy link
Member

I checked several processors that have configs in current forte repo, and they don't have similar checks. So even if we wrote similar test cases, the same error could not be raised. So, if we want to cover these lines, we need to add similar checks in the current forte processors, which means to change the current forte processors. Do we want to do this at the current step?

So we now learn that it is a bad practice to test framework functions using specific implementations. In many of our other test cases, we create "dummy processors" that help us check the behavior, example like below:

https://github.com/asyml/forte/blob/master/tests/forte/pipeline_test.py#L255

@jzpang
Copy link
Collaborator Author

jzpang commented May 24, 2021

I checked several processors that have configs in current forte repo, and they don't have similar checks. So even if we wrote similar test cases, the same error could not be raised. So, if we want to cover these lines, we need to add similar checks in the current forte processors, which means to change the current forte processors. Do we want to do this at the current step?

So we now learn that it is a bad practice to test framework functions using specific implementations. In many of our other test cases, we create "dummy processors" that help us check the behavior, example like below:

https://github.com/asyml/forte/blob/master/tests/forte/pipeline_test.py#L255

Thanks for pointing this file. I added a new test for config check, it now can cover those lines.

@hunterhector hunterhector merged commit 2f49222 into master May 25, 2021
@hunterhector hunterhector deleted the remove_processors branch June 2, 2021 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Removing forte-wrapper processors from the Forte repo.
2 participants