Address infinite loops in Transliterator runtime #3974
Comments
Are the first two cases checked at datagen time? I guess not terminating is like panicking, because eventually we'll probably run out of memory and cause an allocator panic. So we have to guard against it even at runtime.
No. Here are my thoughts:
Regarding |
Another thought: Thinking about this some more (in particular how to avoid infinite loops in recursive transliterators), I changed my mind and I do think we should just disallow any form of cyclic recursion with transliterators, even if a specific case might not cause an infinite loop. This allows us to not have to do recursion checking during transliteration, but during construction as with |
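The construction-time check described above could be sketched roughly as follows. This is only an illustration, not ICU4X code: the `Deps` map and the `has_cycle` function are hypothetical names, and the sketch assumes the IDs referenced by each transliterator (through `:: Other-ID ;` rules or `&Other-ID(...)` calls) have already been collected.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical dependency map: transliterator ID -> IDs it references,
/// whether through `:: Other-ID ;` rules or `&Other-ID(...)` function calls.
type Deps = HashMap<String, Vec<String>>;

/// Returns true if a dependency cycle is reachable from `root`.
/// Any cycle is rejected, even one that might terminate at runtime.
fn has_cycle(root: &str, deps: &Deps) -> bool {
    fn visit<'a>(
        id: &'a str,
        deps: &'a Deps,
        on_path: &mut HashSet<&'a str>,
        done: &mut HashSet<&'a str>,
    ) -> bool {
        if done.contains(id) {
            return false; // already fully explored, no cycle through here
        }
        if !on_path.insert(id) {
            return true; // `id` is already on the current path: cycle found
        }
        if let Some(children) = deps.get(id) {
            for child in children {
                if visit(child, deps, on_path, done) {
                    return true;
                }
            }
        }
        on_path.remove(id);
        done.insert(id);
        false
    }
    visit(root, deps, &mut HashSet::new(), &mut HashSet::new())
}
```

With a check like this at construction, the transliteration loop itself would not need to track recursion depth, at the cost of rejecting some rule sets that would in fact terminate.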
Even if we were able to throw away all non-terminating transliterators during datagen (we can't), corrupted/malicious data could still cause infinite loops in the runtime.
I know of these ways that could happen:
- `a > a|a ;`. ICU4C/J solves this by applying a maximum target length of 16x the input (see here).
- `Any-Example` contains the rule `:: Any-Example ;` (or calls a nested transliterator that contains that rule). This might be more tricky to catch at runtime.
- `(a) > &Any-Example($1);` - this is not guaranteed to be an infinite loop, though, as the recursive call only happens if the LHS is matched.
- `VarTable` lookup - corrupted data could contain a cyclic `VarTable`, i.e., `$a = $b ; $b = $a`, if that were to parse without errors (I catch this during datagen).
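To illustrate the first case, here is a minimal sketch of a length budget in the spirit of the 16x limit mentioned above. It is a toy model, not the ICU4C/J or ICU4X implementation: `transliterate`, `GrowthLimitExceeded`, and the single-rule representation (`from > before|after`, with the cursor placed after `before`) are all assumptions made for the example.

```rust
/// Hypothetical error returned when the output outgrows its budget.
#[derive(Debug, PartialEq)]
struct GrowthLimitExceeded;

/// Toy rewrite loop for one rule of the form `from > before|after`.
/// The cursor lands after `before`, so `after` is scanned again.
fn transliterate(
    input: &str,
    from: char,
    before: &str,
    after: &str,
    max_factor: usize,
) -> Result<String, GrowthLimitExceeded> {
    // Budget in characters: `max_factor` times the input length.
    let limit = input.chars().count().saturating_mul(max_factor);
    let mut buf: Vec<char> = input.chars().collect();
    let mut cursor = 0;
    while cursor < buf.len() {
        if buf[cursor] == from {
            // Replace the matched character with `before` followed by `after`.
            let replacement: Vec<char> = before.chars().chain(after.chars()).collect();
            let _removed: Vec<char> = buf.splice(cursor..cursor + 1, replacement).collect();
            cursor += before.chars().count();
            if buf.len() > limit {
                return Err(GrowthLimitExceeded); // abort instead of growing forever
            }
        } else {
            cursor += 1;
        }
    }
    Ok(buf.into_iter().collect())
}
```

With `max_factor` set to 16, the rule `a > a|a ;` applied to `"a"` returns `Err(GrowthLimitExceeded)` once the buffer exceeds 16 characters, while a terminating rule runs to completion. Note that in this toy model a rule that rescans without growing the text would not be caught by a length budget alone.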