Feature request: Stream decoding with "stop" as the error behavior #498

Closed
chris-martin opened this issue Jan 27, 2023 · 2 comments · Fixed by #448

chris-martin commented Jan 27, 2023

I am considering a situation where I have a ByteString stream that may be valid UTF-8 only up to some unknown point. I'd like to do a streaming decode of as much Text as possible for as long as the input is valid, then stop at the first sign of trouble, obtaining both the decoded Text and the non-UTF-8 ByteString remainder.

I envision something like this:

streamDecodeUtf8' :: ByteString -> Decoding'

data Decoding' = Some'
    Text -- ^ What was decoded
    ByteString -- ^ Remainder that was not decoded
    (Maybe UnicodeException)
        -- ^ 'Just' an exception if the remainder is non-empty
        -- because it begins with invalid input.
        -- 'Nothing' if the remainder is empty or is non-empty
        -- but could become valid with more input.
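
To make the three cases in that last field concrete, here is a minimal sketch of how a caller might interpret a result of this shape. The `Outcome` type and the `outcome` function are hypothetical names introduced only for illustration; they are not part of the `text` library or of the proposal itself.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding.Error (UnicodeException (..))

-- The proposed result type, copied from the sketch above.
data Decoding' = Some'
    T.Text                    -- what was decoded
    ByteString                -- remainder that was not decoded
    (Maybe UnicodeException)  -- 'Just' only if the remainder starts with invalid input

-- Hypothetical summary of what the caller can do next.
data Outcome
    = Finished                      -- everything was decoded
    | NeedsMoreInput                -- remainder is an incomplete code point
    | InvalidInput UnicodeException -- remainder begins with invalid bytes
    deriving (Eq, Show)

-- Classify a result according to the invariants stated in the comments above.
outcome :: Decoding' -> Outcome
outcome (Some' _ _ (Just e)) = InvalidInput e
outcome (Some' _ rest Nothing)
    | B.null rest = Finished
    | otherwise   = NeedsMoreInput
```

The point of the sketch is that `Nothing` alone is ambiguous: the caller has to look at whether the remainder is empty to tell "done" from "waiting for more bytes".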

Lysxia commented Jan 27, 2023

This is being worked on in #448.

The API there is more complicated, because (1) returning a Text forces you to do a copy and (2) returning the remainder as a ByteString forces you to append it to the next chunk to resume. But I think it's still possible to make it look closer to what you are proposing while leaving the user in control of how the copying to Text is done.


chris-martin commented Jan 27, 2023

> returning the remainder as a ByteString forces you to append to the next chunk to resume

Yes, the existing stream API gives you, in addition to the ByteString remainder, a function that lets you continue without having to concatenate, and there's no reason I should have proposed changing that aspect. A better attempt would be:

streamDecodeUtf8Strict :: ByteString -> StrictDecoding

data StrictDecoding = StrictDecoding
    Text -- ^ What was decoded
    ByteString -- ^ Remainder that was not decoded
    (Either UnicodeException (ByteString -> StrictDecoding))
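
For illustration only, here is a rough sketch of how a function with this signature could behave. It is not the implementation from #448 and not an existing `text` API: the helpers `trailing` and `validPrefix` are hypothetical, the validator is hand-rolled from the standard table of well-formed UTF-8 byte sequences, and the sketch naively prepends the pending bytes to the next chunk (the copy that a real implementation would want to avoid).

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException (..))
import Data.Word (Word8)

data StrictDecoding = StrictDecoding
    Text        -- what was decoded
    ByteString  -- remainder that was not decoded
    (Either UnicodeException (ByteString -> StrictDecoding))

-- Allowed ranges for the bytes that must follow a lead byte,
-- or Nothing if the byte cannot start a well-formed sequence.
trailing :: Word8 -> Maybe [(Word8, Word8)]
trailing b
    | b <= 0x7F = Just []
    | b <  0xC2 = Nothing               -- stray continuation or overlong lead
    | b <= 0xDF = Just [c]
    | b == 0xE0 = Just [(0xA0, 0xBF), c]
    | b == 0xED = Just [(0x80, 0x9F), c] -- excludes surrogates
    | b <= 0xEF = Just [c, c]
    | b == 0xF0 = Just [(0x90, 0xBF), c, c]
    | b <= 0xF3 = Just [c, c, c]
    | b == 0xF4 = Just [(0x80, 0x8F), c, c]
    | otherwise = Nothing
  where
    c = (0x80, 0xBF)

-- Length of the longest well-formed prefix, and whether the bytes
-- after it are definitely invalid (True) or merely incomplete (False).
validPrefix :: ByteString -> (Int, Bool)
validPrefix bs = go 0
  where
    n = B.length bs
    go i
        | i >= n = (i, False)
        | otherwise = case trailing (B.index bs i) of
            Nothing     -> (i, True)
            Just ranges -> check i (i + 1) ranges
    check _ j [] = go j
    check start j ((lo, hi) : rest)
        | j >= n             = (start, False) -- input ended mid-sequence
        | x >= lo && x <= hi = check start (j + 1) rest
        | otherwise          = (start, True)
      where
        x = B.index bs j

-- Decode as much as possible, then either stop with an error
-- or offer a continuation that remembers the pending bytes.
streamDecodeUtf8Strict :: ByteString -> StrictDecoding
streamDecodeUtf8Strict = go B.empty
  where
    go pending chunk = StrictDecoding (TE.decodeUtf8 good) rest next
      where
        input        = pending <> chunk -- naive: copies the chunk
        (k, bad)     = validPrefix input
        (good, rest) = B.splitAt k input
        next
            | bad = Left (DecodeError "streamDecodeUtf8Strict: invalid UTF-8"
                                      (Just (B.head rest)))
            | otherwise = Right (go rest)
```

In this sketch, a `Left` means the remainder begins with invalid input and decoding has stopped for good, while a `Right` continuation carries any incomplete trailing bytes into the next chunk so the caller never has to concatenate them by hand.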

Lysxia linked a pull request Feb 4, 2023 that will close this issue