Bug fix and enhancements for CoNLLGenerator annotator #13053

maziyarpanahi
Member

Description

Add a test for non-int metadata values in CoNLLGenerator
Fix the bug in CoNLLGenerator triggered by non-int metadata values
Escape values when writing the CSV so that tokens made of special characters are preserved
Remove an unnecessary option from the CSV write
Add a metadata sentence key parameter that selects which metadata field is used as the sentence identifier during CoNLL generation
Minor formatting in Scala; Python refactoring (several bug fixes to support the overloaded Scala methods); Python tests for the 2- and 3-argument forms
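
For readers unfamiliar with the failure mode, the following is a minimal, self-contained sketch in plain Python. It is not the code from this PR and does not use the Spark NLP API: the names `group_by_sentence`, `write_conll`, and `sentence_key`, the two-column output, and the `"O"` placeholder label are all hypothetical and chosen only for illustration. It assumes each token carries a metadata map and shows two ideas from the list above: grouping tokens by the raw value of a configurable metadata field instead of casting that value to an int, and letting a CSV writer quote special-character tokens so they survive a round trip.

```python
import csv
import io
from collections import OrderedDict


def group_by_sentence(tokens, sentence_key="sentence"):
    """Group (text, metadata) pairs by the raw value of metadata[sentence_key].

    Hypothetical helper: the value is kept as a string, because calling
    int() on an id such as "doc1-s0" would raise ValueError.
    """
    groups = OrderedDict()
    for text, metadata in tokens:
        groups.setdefault(metadata[sentence_key], []).append(text)
    return list(groups.values())


def write_conll(sentences, out):
    """Write one 'token label' row per token, with a blank line between sentences.

    Hypothetical layout: the label column is a constant "O" placeholder.
    Quoting is left to the csv module so a token such as '"' is preserved.
    """
    writer = csv.writer(
        out,
        delimiter=" ",
        quotechar='"',
        escapechar="\\",
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    for sentence in sentences:
        for token in sentence:
            writer.writerow([token, "O"])
        out.write("\n")


# Sentence ids are deliberately non-numeric; '"' is a special-character token.
tokens = [
    ("Hello", {"sentence": "doc1-s0", "paragraph": "p0"}),
    ("world", {"sentence": "doc1-s0", "paragraph": "p0"}),
    ('"', {"sentence": "doc1-s1", "paragraph": "p0"}),
]

buffer = io.StringIO()
write_conll(group_by_sentence(tokens), buffer)

# Read the rows back with the same dialect, skipping the blank separator lines.
reader = csv.reader(
    io.StringIO(buffer.getvalue()), delimiter=" ", quotechar='"', escapechar="\\"
)
round_tripped = [row[0] for row in reader if row]
```

Passing `sentence_key="paragraph"` to the same helper would group all three tokens together, which is the kind of choice the new parameter is described as enabling.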

Motivation and Context

Issue 13004

How Has This Been Tested?

Tested existing projects using the 2-argument form (the only one supported so far)
Added new tests covering the new functionality

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • Reported failing Scala tests that predate this PR and are not caused by it.
  • All new and existing tests passed.

maziyarpanahi and others added 2 commits November 8, 2022 14:30
…3051)

* Test for non-int metadata values in CoNLLGenerator

* Fix for non-int metadata values bug in CoNLLGenerator

* Include escaping when writing csv in order to preserve special char tokens

* Remove unnecessary option from csv write

* Adding metadata sentence key parameter in order to select which metadata field to use as sentence for CoNLL Generation

* Minor formatting in scala, Python refactorization (several bug fixes supporting scala overloaded methods), Python Tests for 2 and 3 arguments
@maziyarpanahi maziyarpanahi self-assigned this Nov 9, 2022
@maziyarpanahi maziyarpanahi changed the base branch from master to release/423-release-candidate November 9, 2022 09:48
@maziyarpanahi maziyarpanahi merged commit 2a382ce into release/423-release-candidate Nov 10, 2022
@KshitizGIT KshitizGIT deleted the SPARKNLP-645-fix-a-bug-in-co-nll-generator-annotator branch March 2, 2023 11:18

Successfully merging this pull request may close these issues.

exportConllFiles from CoNLLGenerator fails when the token has non-int metadata
2 participants