MultiPack to SinglePack boxer #564

Merged
merged 41 commits into from
Feb 1, 2022

VincentYaoMBZUAI
Collaborator

@VincentYaoMBZUAI VincentYaoMBZUAI commented Nov 28, 2021

This PR fixes #561

Description of changes

This PR adds a new class, "DataPackBoxer", that casts a MultiPack to a DataPack when that DataPack is the only content of the original MultiPack, indexed by the attribute pack_name. It unboxes the MultiPack by getting the pack with that name and returns an iterator that produces the resulting DataPack.
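
To make the idea concrete, here is a minimal, self-contained sketch of the unboxing step. The `DataPack` and `MultiPack` classes below are simplified stand-ins written for this example, not Forte's real classes, and the `DataPackBoxer` shown here (including its constructor and the `consume` helper) is a hypothetical illustration rather than the code merged in this PR.

```python
from typing import Dict, Iterable, Iterator


class DataPack:
    """Stand-in for a single-document pack (NOT Forte's real DataPack)."""

    def __init__(self, pack_name: str, text: str = "") -> None:
        self.pack_name = pack_name
        self.text = text


class MultiPack:
    """Stand-in for a container of named DataPacks (NOT Forte's real MultiPack)."""

    def __init__(self) -> None:
        self._packs: Dict[str, DataPack] = {}

    def add_pack(self, pack: DataPack) -> None:
        # Index each pack by its name so it can be looked up later.
        self._packs[pack.pack_name] = pack

    def get_pack(self, name: str) -> DataPack:
        return self._packs[name]


class DataPackBoxer:
    """Hypothetical sketch: pull out the DataPack stored under `pack_name`."""

    def __init__(self, pack_name: str) -> None:
        self.pack_name = pack_name

    def cast(self, pack: MultiPack) -> DataPack:
        # The cast is just a lookup of the named pack.
        return pack.get_pack(self.pack_name)

    def consume(self, multi_packs: Iterable[MultiPack]) -> Iterator[DataPack]:
        # Lazily yield one unboxed DataPack per incoming MultiPack.
        for multi_pack in multi_packs:
            yield self.cast(multi_pack)


# Usage: wrap one DataPack in a MultiPack, then unbox it again.
inner = DataPack("default", "some text")
outer = MultiPack()
outer.add_pack(inner)
unboxed = list(DataPackBoxer("default").consume([outer]))
```

In this sketch the unboxed object is the same DataPack instance that was placed into the MultiPack, so no data is copied.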

Possible influences of this PR.

Forte already has a MultiPackBoxer that casts a DataPack into a MultiPack. This PR adds a new caster that performs the opposite conversion, casting a MultiPack to a DataPack.

Test Conducted

A test case, "test_datapack_boxer()", is added in "data_type_infer_test.py". It tests the caster by checking that output_pack_type is DataPack while input_pack_type is MultiPack.
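
A rough sketch of what such a type check can look like is shown below. Everything here is an assumption made for illustration: the empty `DataPack` and `MultiPack` classes are placeholders, and the `input_pack_type` / `output_pack_type` class methods follow the names mentioned above but are not copied from Forte or from the test file in this PR.

```python
import unittest


class DataPack:
    """Placeholder type for this example only."""


class MultiPack:
    """Placeholder type for this example only."""


class DataPackBoxer:
    """Hypothetical caster that declares its input and output pack types."""

    @classmethod
    def input_pack_type(cls):
        return MultiPack

    @classmethod
    def output_pack_type(cls):
        return DataPack


class DataPackBoxerTypeTest(unittest.TestCase):
    def test_datapack_boxer(self):
        boxer = DataPackBoxer()
        # The caster should accept a MultiPack and produce a DataPack.
        self.assertIs(boxer.input_pack_type(), MultiPack)
        self.assertIs(boxer.output_pack_type(), DataPack)
```

The test case can be run with the standard `unittest` runner, for example via `python -m unittest` on the file that contains it.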

@codecov

codecov bot commented Nov 28, 2021

Codecov Report

Merging #564 (bbc763b) into master (4bb8fa5) will increase coverage by 0.05%.
The diff coverage is 96.49%.

@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
+ Coverage   80.62%   80.68%   +0.05%     
==========================================
  Files         237      238       +1     
  Lines       16930    16986      +56     
==========================================
+ Hits        13650    13705      +55     
- Misses       3280     3281       +1     

| Impacted Files | Coverage Δ |
|---|---|
| forte/data/base_pack.py | 77.77% <ø> (ø) |
| forte/data/caster.py | 91.66% <93.33%> (+0.75%) ⬆️ |
| tests/forte/datapack_boxer_test.py | 97.22% <97.22%> (ø) |
| tests/forte/data/datapack_type_infer_test.py | 100.00% <100.00%> (ø) |
| forte/data/multi_pack.py | 78.54% <0.00%> (+0.42%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4bb8fa5...bbc763b.

@hunterhector
Member

The PR message needs to be updated as follows:

This PR fixes [https://github.com/asyml/forte/issues/561].

This PR fixes https://github.com/asyml/forte/issues/561.

@hunterhector changed the title from "Bug561fix" to "MultiPack to SinglePack boxer" on Dec 7, 2021
@hunterhector
Member

The correct way to write the PR message is as follows:

This PR fixes https://github.com/asyml/forte/issues/561

You need to remove the square brackets.

@@ -0,0 +1,80 @@
#begin document (bn/abc/00/abc_0039); part 000
Member

I think we can reuse the current dataset without adding a new file

Member

we should remove this file if it is no longer needed

Member

@hunterhector hunterhector left a comment

Overall this looks fine to me. I think we just need to remove the redundant file and fix the CI to merge this.

@hunterhector
Member

btw, besides removing the redundant ontonotes/00_1, we should also update the branch to make it up to date.

@hunterhector hunterhector merged commit b46cea9 into asyml:master Feb 1, 2022
Development

Successfully merging this pull request may close these issues.

Add DataPackBoxer in Caster
3 participants