Allow ref_to_token_annotations to handle more general cases #7
Previously this command would only work in the splitting scenario of converting reference spans (BI, BE, IE, II) to token spans (b-r, i-r, e-r, o). This commit allows it to also be used for parsing spans, where a reference span (author) can be converted to a series of token spans (author).
66958a0 to 115ada2
* Improve documentation.
* Output documents which have no annotations, instead of removing them.
labels can be applied directly to the individual tokens contained within
these multi-token spans; for each token in the multi-token span, a span
is created with the same label. Symbolically:
* [author author author] becomes [author][author][author]
this is not unique to author right?
exactly 👍
nsorros left a comment
LGTM 👍
Previously `ref_to_token_annotations` would only work in the splitting scenario of converting reference spans (`BI`, `BE`, `IE`, `II`) to token spans (`b-r`, `i-r`, `e-r`, `o`). This PR allows it to also be used for parsing spans, where a reference span (`author`) can be converted to a series of token spans (`author`). Relevant tests are added, along with improved documentation for the command itself.
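
The parsing case described above (one multi-token span becoming one span per token, each with the same label) can be illustrated with a minimal sketch. This is not the code from this PR: the function name `spans_to_token_spans` is hypothetical, and the token and span dictionaries are assumed to carry character offsets (`start`, `end`), a token `id`, and a `label`. As noted in the review, nothing here is specific to `author`; any label is carried over unchanged.

```python
from typing import Dict, List


def spans_to_token_spans(tokens: List[Dict], spans: List[Dict]) -> List[Dict]:
    """Create one single-token span per token inside each multi-token span.

    Hypothetical helper: the dictionary keys used here are assumptions,
    not the actual data model of ref_to_token_annotations.
    """
    token_spans = []
    for span in spans:
        for token in tokens:
            # A token belongs to the span if it lies fully inside its character range.
            if token["start"] >= span["start"] and token["end"] <= span["end"]:
                token_spans.append(
                    {
                        "start": token["start"],
                        "end": token["end"],
                        "token_start": token["id"],
                        "token_end": token["id"],
                        "label": span["label"],  # same label for every token
                    }
                )
    return token_spans


# Example text: "Smith, J 2019" with one author span covering "Smith, J".
tokens = [
    {"text": "Smith", "start": 0, "end": 5, "id": 0},
    {"text": ",", "start": 5, "end": 6, "id": 1},
    {"text": "J", "start": 7, "end": 8, "id": 2},
    {"text": "2019", "start": 9, "end": 13, "id": 3},
]
spans = [{"start": 0, "end": 8, "label": "author"}]

result = spans_to_token_spans(tokens, spans)
labels = [s["label"] for s in result]
```

Here `[author author author]` becomes `[author][author][author]`: the three tokens inside the span each receive their own `author` span, while `2019` falls outside the span and gets none.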