I'm using spacy-llm with GPT-3.5 Turbo 16k for NER (spacy.NER.v2). The pipeline usually works as expected and populates doc.ents, but sometimes doc.ents is empty even though entities are present in the model's raw output (I've set save_io = true). This seems to happen when the entities in the raw output are separated by hyphens instead of commas.
Examples of Issue:
Incorrectly Parsed Output:
DOC ENTS: ()
Component: llm
Response: Medical Condition:
Correctly Parsed Output:
DOC ENTS: (coughs, eczema, coughs, coughs, coughs, diabetes)
[ ('coughs', 'Medical Condition'), ('eczema', 'Medical Condition'), ('coughs', 'Medical Condition'), ('coughs', 'Medical Condition'), ('coughs', 'Medical Condition'), ('diabetes', 'Medical Condition')]
Component: llm
Response: Medical Condition: coughs, eczema, diabetes
The parser does not seem to handle model outputs formatted with entities listed under a category and separated by hyphens.
Where can I find and modify the parser in the spacy-llm pipeline to account for variations in entity formatting in the model's output? Specifically, how can it be adjusted to parse entities separated by hyphens as well as those separated by commas? Do you have any suggestions?
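To make the question concrete, here is a minimal sketch of the kind of normalisation I have in mind, written as a standalone function and not as a patch to spacy-llm's actual parser. It assumes the hyphen-separated responses look like either a label line followed by "- entity" bullet lines, or a single line with entities separated by " - ". The name normalize_response and both regular expressions are hypothetical; how such a step would be wired into spacy-llm (for example through a custom task) is left open.

```python
import re

# Assumed layout of a label line: "<Label>: <optional inline entities>"
LABEL_LINE = re.compile(r"^\s*([^:\n]+):\s*(.*)$")
# Assumed layout of a bullet line: "- entity" (also accepts "*" or "•")
BULLET_LINE = re.compile(r"^\s*[-*•]\s+(.*\S)\s*$")


def normalize_response(response: str) -> str:
    """Rewrite hyphen-separated entity lists as 'Label: a, b, c' lines.

    Hypothetical helper: it only reshapes the raw text so that a
    comma-based parser can read it; it does not touch spacy-llm itself.
    """
    labels = []    # labels in order of first appearance
    entities = {}  # label -> list of entity strings
    current = None

    for line in response.splitlines():
        if not line.strip():
            continue

        # A bullet line belongs to the most recent label.
        bullet = BULLET_LINE.match(line)
        if bullet and current is not None:
            entities[current].append(bullet.group(1))
            continue

        header = LABEL_LINE.match(line)
        if header:
            current = header.group(1).strip()
            if current not in entities:
                labels.append(current)
                entities[current] = []
            # Drop a leading bullet marker, then split on commas or on
            # hyphens surrounded by whitespace, so that hyphenated
            # entities such as "post-nasal drip" stay intact.
            inline = re.sub(r"^[-*•]\s+", "", header.group(2))
            parts = re.split(r"\s*,\s*|\s+-\s+", inline)
            entities[current].extend(p.strip() for p in parts if p.strip())

    return "\n".join(
        f"{label}: {', '.join(entities[label])}" for label in labels
    )


if __name__ == "__main__":
    raw = "Medical Condition:\n- coughs\n- eczema\n- diabetes"
    print(normalize_response(raw))
```

Responses that are already comma-separated pass through unchanged, so a step like this could run on every response before parsing. Lines that are neither a label line nor a bullet are dropped, which may or may not be acceptable depending on what else the model returns.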