Skip to content

Fix panic in Unicode wildcard matching #17753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 29, 2020
Merged

Conversation

khwilliamson
Copy link
Contributor

The reason this bug occurs is that wildcard matching changes the anchor
assertions \A, \Z, and \z, without corresponding changes in regexec.c.

We earlier noticed that all these were being marked SIMPLE, and a
zero-width construct shouldn't really be. But it was considered too
late in the development cycle to make that change. So the plan was to
live with this bug in an experimental feature in 5.32.

But I eventually realized that the change could be effected for just the
wildcard versions, and this commit does that. If there is some issue
with making these non-SIMPLE, it will affect only the wildcard feature,
and those potential bugs are better than a known bug. I also seems
unlikely that this will introduce any bug. What removing SIMPLE does is
merely remove potential optimizations in the handling. The most general
case should work.�; it's doing an improper optimization that gets one
into trouble.

This fixes #17677

The reason this bug occurs is that wildcard matching changes the anchor
assertions \A, \Z, and \z, without corresponding changes in regexec.c.

We earlier noticed that all these were being marked SIMPLE, and a
zero-width construct shouldn't really be.  But it was considered too
late in the development cycle to make that change.  So the plan was to
live with this bug in an experimental feature in 5.32.

But I eventually realized that the change could be effected for just the
wildcard versions, and this commit does that.  If there is some issue
with making these non-SIMPLE, it will affect only the wildcard feature,
and those potential bugs are better than a known bug.  I also seems
unlikely that this will introduce any bug.  What removing SIMPLE does is
merely remove potential optimizations in the handling.  The most general
case should work.�; it's doing an improper optimization that gets one
into trouble.

This fixes #17677
@hvds
Copy link
Contributor

hvds commented Apr 28, 2020

This looks good to me, as an alternative to fixing the general case in 5.32.

@khwilliamson khwilliamson requested a review from xsawyerx April 29, 2020 04:26
@khwilliamson
Copy link
Contributor Author

I think we should postpone the general solution to 5.33 and put this fix in 5.32. It is specifically tailored to not affect anything else.

@khwilliamson khwilliamson merged commit 5efad79 into blead Apr 29, 2020
@khwilliamson khwilliamson deleted the smoke-me/khw-17677 branch April 29, 2020 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic: regrepeat() called with unrecognized node type 3
3 participants