For UTF-7, flag unnecessary extra trailing byte in Base64 section as error #9977

alexdowad · 2022-11-19T19:02:48Z

This bug was found when I was fuzzing a patch related to mb_strpos. In some cases, the legacy text conversion code for UTF-7 (and UTF7-IMAP) would correctly recognize an error for a Base64-encoded section which was not correctly padded with zero bits, but the new (and faster) text conversion code would not.

Specifically, if the input string ended abruptly after the 4th or 7th byte of a Base64-encoded section, the new conversion code would confirm that the trailing padding bits from the previous byte (3rd or 6th) were zeroes, but would not check whether the 4th or 7th byte itself encoded any non-zero bits. The legacy conversion code did perform this check and would treat the input string as invalid.

Actually, even if the 4th or 7th byte does encode only (padding) zero bits, this is still a problem, because there is no reason to have a 4th (or 7th) byte in that case. The UTF-7 string should have ended on the previous byte instead.
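To illustrate the rule described above, here is a rough Python sketch. It is not the actual mbstring C code, and the function name `base64_section_tail_ok` is made up for this example. It assumes the section has already been isolated (no leading `+`, no trailing `-`) and that it encodes UTF-16 code units of 16 bits each, at 6 bits per Base64 character. A section ends cleanly only if fewer than 6 bits are left over after the last complete code unit and all of them are zero. Six or more leftover bits mean that at least one whole trailing byte was unnecessary.

```python
import string

# Standard Base64 alphabet; each character carries 6 bits.
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_section_tail_ok(section: str) -> bool:
    """Return True if a UTF-7 Base64 section ends cleanly.

    `section` holds only the Base64 characters of one section,
    without the leading '+' or the optional trailing '-'.
    Hypothetical helper for illustration only.
    """
    bits = 0   # leftover bits not yet consumed by a 16-bit code unit
    nbits = 0  # how many leftover bits are currently held in `bits`
    for ch in section:
        idx = B64.find(ch)
        if idx < 0:
            raise ValueError(f"not a Base64 character: {ch!r}")
        bits = (bits << 6) | idx
        nbits += 6
        if nbits >= 16:
            # A full UTF-16 code unit is available; drop it and keep the rest.
            nbits -= 16
            bits &= (1 << nbits) - 1
    # Fewer than 6 leftover bits: no whole byte was wasted.
    # All leftover bits zero: the padding is valid.
    return nbits < 6 and bits == 0


print(base64_section_tail_ok("AGE"))   # 3 bytes, zero padding bits
print(base64_section_tail_ok("AGF"))   # 3 bytes, non-zero padding bits
print(base64_section_tail_ok("AGEA"))  # 4th byte carries only zero bits
print(base64_section_tail_ok("AGEB"))  # 4th byte carries non-zero bits
```

Under these assumptions, only the first call reports a clean ending. The third case is the one described in the paragraph above: the 4th byte contributes nothing but zero bits, yet it is still rejected because the section should have ended one byte earlier.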

Apply the same fix for both UTF-7 and UTF7-IMAP. Also add regression test cases.

FYA @cmb69 @Girgias @nikic

alexdowad · 2022-11-19T19:03:32Z

Hmm, I actually intend to merge this into PHP-8.2 and then down into master, but forgot to set the target branch when opening the PR. Sigh.


cmb69

Thank you!

> Hmm, I actually intend to merge this into PHP-8.2 and then down into master, but forgot to set the target branch when opening the PR.

You could still change the base branch (Edit button beside the PR title), but you would need to force-push as well (otherwise AppVeyor CI may not work). However, I think this is good to be applied to PHP-8.2 right away.

alexdowad · 2022-11-21T12:50:34Z

Thanks so much for the review. Merged.

github-actions bot added the Extension: mbstring label Nov 19, 2022

alexdowad force-pushed the utf7_fix branch from 8eb1726 to 70d5ad8 Compare November 20, 2022 04:37

cmb69 approved these changes Nov 21, 2022


Girgias approved these changes Nov 21, 2022


alexdowad closed this Nov 21, 2022

alexdowad deleted the utf7_fix branch December 14, 2022 12:02
