Token as returned by pipe() is relative to the sentence boundaries #53

Closed
@drahnr

// Token<'_>
    pub char_span: (usize, usize),
    pub byte_span: (usize, usize),

Calling fn pipe() returns a set of tokens whose spans are relative to the containing sentence, but there seems to be no trivial way to retrieve the spans relative to the original text passed to pipe().

Suggestion: Use a Range<usize> instead of a tuple for the byte and character spans, which is easier to work with, and make both relative to the input text.

For single-sentence input this does not change the semantics; for multi-sentence input it does.
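
To illustrate the suggestion, here is a minimal sketch of the translation a caller would otherwise have to do by hand. `to_text_range` is a hypothetical helper, not part of the library, and it assumes the tuple spans are half-open (start inclusive, end exclusive) and that the sentence's start offset in the input text is already known.

```rust
use std::ops::Range;

/// Hypothetical helper (not a library API): shift a sentence-relative span
/// by the sentence's start offset to get a range into the full input text.
/// Assumes the span is half-open: (start inclusive, end exclusive).
fn to_text_range(span: (usize, usize), sentence_start: usize) -> Range<usize> {
    (sentence_start + span.0)..(sentence_start + span.1)
}

fn main() {
    let text = "Hello world. Second one here.";

    // Assumed for this example: the second sentence starts at byte 13.
    let sentence_start = 13;
    let sentence = &text[sentence_start..];

    // A byte span covering "Second", relative to its sentence.
    let span = (0, 6);
    assert_eq!(&sentence[span.0..span.1], "Second");

    // The same token, addressed relative to the original input text.
    let range = to_text_range(span, sentence_start);
    assert_eq!(&text[range], "Second");

    // With a single sentence the offset is 0, so nothing changes.
    assert_eq!(to_text_range((2, 5), 0), 2..5);
}
```

The same shift applies to char spans, provided the sentence's start is given in chars rather than bytes.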

It would also make sense to add the sentence's bounds in bytes and chars within the input text (or to replace the sentence field with those bounds entirely).

pub sentence: &'t str,
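
As a rough sketch of how such sentence bounds could be derived today, the following assumes that `sentence` is a subslice borrowed from the text passed to pipe(), which the `'t` lifetime suggests but the issue does not confirm. Both helper names are hypothetical and not part of the library.

```rust
use std::ops::Range;

/// Hypothetical helper: byte range of `sentence` within `text`.
/// Assumes `sentence` borrows from `text`; returns None if it does not.
fn sentence_byte_range(text: &str, sentence: &str) -> Option<Range<usize>> {
    let text_start = text.as_ptr() as usize;
    let sentence_start = sentence.as_ptr() as usize;
    // Distance between the two start pointers is the byte offset.
    let start = sentence_start.checked_sub(text_start)?;
    let end = start.checked_add(sentence.len())?;
    if end <= text.len() {
        Some(start..end)
    } else {
        None
    }
}

/// Hypothetical helper: convert a byte range in `text` to a char range.
/// Panics if the byte range does not fall on char boundaries.
fn byte_to_char_range(text: &str, bytes: Range<usize>) -> Range<usize> {
    let start = text[..bytes.start].chars().count();
    let len = text[bytes].chars().count();
    start..start + len
}

fn main() {
    // Non-ASCII input, so byte and char offsets differ.
    let text = "Füße wandern. Zweiter Satz.";
    let sentence = &text[16..];

    let bytes = sentence_byte_range(text, sentence).expect("sentence borrows from text");
    assert_eq!(&text[bytes.clone()], "Zweiter Satz.");

    let chars = byte_to_char_range(text, bytes.clone());
    assert_eq!(bytes, 16..29);
    assert_eq!(chars, 14..27);
}
```

Counting chars from the start of the text is linear in the text length, so exposing the bounds directly on the token would spare callers this work for every sentence.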

Related cargo-spellcheck issue: drahnr/cargo-spellcheck#162
