This repository was archived by the owner on Oct 31, 2023. It is now read-only.
Since Google Scholar alerted me to the citation and I was curious, I checked this preprint of the Deep-EOS paper. It says there:
Further high-performers such as Elephant (Evang et al., 2013) or Cutter (Graën et al., 2018) follow a sequence labeling approach. However, they require a prior language-dependent tokenization of the input text.
I interpret this as saying that the input to Elephant is already tokenized text, on which sentence boundary detection is then performed. That is not the case: Elephant performs tokenization and sentence boundary detection jointly. It is true that this setup requires tokenized training data, but Elephant could also be trained and used on data that is not tokenized and only sentence-segmented.