Determining dead state without using next_eoi_state() for streaming #1317
-
|
I am trying to handle the case of matching regexes while streaming text input, and outputting matches as early as possible, with the possibility that more input might be added. To accomplish this, I am using regex-automata and iterating through my input character-by-character. However because of the one-byte-behind nature of the DFA I cannot easily distinguish determine if the DFA is in a terminal state after running through all currently available input. For instance, suppose I have a DFA based on I created a playground/test case to illustrate what I mean: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=5e109d8c06def6f4403a573cb184d5fc Hoping this was clear enough, I've only been using this crate for a couple days so apologies if I misunderstood how to use the API. |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 9 replies
-
|
The delay just means the match can't be reported until the next byte is fed to it. I don't understand the problem you're having? The program you submitted seems to be working as intended. Please consider providing a program that doesn't work as intended. Perhaps with a failing assertion or a clear explanation of expected output vs actual output. To be clear, the EOI transition should only be followed when the input is finished and no more bytes will be fed to the state machine. |
Beta Was this translation helpful? Give feedback.
It is 100% not a bug. I'm confused as to where you are confused. You already mentioned the 1 byte delay, so you know about that. And that is why there is a delay here. If you extend the input by one more byte, you can see it enter a dead state: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=5a35c4fd4a94eb64799d392d1cadbac2
Maybe the thing you are confused by is that you as a human can see that the regex can never match. But the engine itself cannot because the 1 byte delay is built into the finite state machine, even when a pattern doesn't have any look around assertions.