Self-contradictory summary in spacy debug data #8035
I want to train a textcat model, and I got this message when I run the `spacy debug data` command.
There are two things that seem strange to me:
I was converting the Prodigy annotations like this:

```python
doc_bin = DocBin()
for file_name in file_names:
    for eg in srsly.read_jsonl(os.path.join(input_path, file_name)):
        doc = nlp.make_doc(eg["text"])
        label = eg["label"]
        score = eg.get("score")
        if score is None:
            if eg.get("answer") == "accept":
                score = 1.0
            else:
                score = 0.0
        doc.cats = {label: score}
        doc_bin.add(doc)
```

All the data in the JSONL files has the single label `NEGATION`. I don't understand why this is not working. Did I do something wrong in the conversion, or is it an error in spaCy?

spaCy version: 3.0.6

Thanks for your response.
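For reference, here is a self-contained version of the conversion above; the imports, the blank pipeline, the input directory, and the output filename are assumptions added for completeness, not part of the original snippet:

```python
import os

import spacy
import srsly
from spacy.tokens import DocBin

nlp = spacy.blank("en")           # assumed: any blank pipeline works for make_doc
input_path = "prodigy_exports"    # assumed: directory containing the .jsonl files
file_names = os.listdir(input_path)

doc_bin = DocBin()
for file_name in file_names:
    for eg in srsly.read_jsonl(os.path.join(input_path, file_name)):
        doc = nlp.make_doc(eg["text"])
        label = eg["label"]
        score = eg.get("score")
        if score is None:
            # Map Prodigy accept/reject answers to 1.0/0.0
            score = 1.0 if eg.get("answer") == "accept" else 0.0
        doc.cats = {label: score}
        doc_bin.add(doc)

# Write the annotations to disk so they can be passed to `spacy debug data` / `spacy train`
doc_bin.to_disk("train.spacy")
```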
Thanks for the note, we'll have another look at the debug messages.

In this case, the warning message is correct given the format described above. If you have a binary classification task, you can use two labels with `textcat` (one label is 1.0 for each instance and one label is 0.0, so `NEGATION` and `NOT_NEGATION`), or you can use one label with `textcat_multilabel` (just `NEGATION` as 1.0 or 0.0 as you have above).

The `textcat_multilabel` labels can be 0.0 or 1.0 for each label individually, but for `textcat` there should always be exactly one label per document with a score of 1.0 and all other labels should be 0.0. The `textcat` model always predicts scores that sum to 1.0 over all labels, so with only a single label the predicted score would always be 1.0.

If there's not already a warning here, we should consider adding one when you start training with just one label, since the model is not going to be useful. Is this the problem you were running into?
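A minimal sketch of the two options, assuming the same kind of data as the conversion snippet above (the blank pipeline and the `is_negation` flag are placeholders for illustration):

```python
import spacy

nlp = spacy.blank("en")  # assumed blank pipeline; only used to create the Doc
doc = nlp.make_doc("There is no problem here.")
is_negation = True  # placeholder for whatever the annotation says

# Option 1: exclusive categories for `textcat` -- every doc carries both labels,
# exactly one of them is 1.0 and the other is 0.0.
doc.cats = {
    "NEGATION": 1.0 if is_negation else 0.0,
    "NOT_NEGATION": 0.0 if is_negation else 1.0,
}

# Option 2: a single, independent label for `textcat_multilabel` -- each label is
# scored on its own, so one label set to 1.0 or 0.0 per doc is fine.
doc.cats = {"NEGATION": 1.0 if is_negation else 0.0}
```

Either way, the component you train has to match the annotation format: exclusive scores that sum to 1.0 go with `textcat`, independent per-label scores go with `textcat_multilabel`.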