PKCS#12 PBE doesn't convert to UTF-16BE correctly #337

magnumripper · 2019-12-22T10:12:18Z

The PKCS#12 PBE function cheats when "converting" to UTF-16BE. It simply casts char -> short, or "inserts every other zero". While this happens to be correct for ISO-8859-1, it's not for the remaining 99.8% of the Unicode charset or 99.5% of UCS-2.

Actually, as far as I can see, these functions can't ever take a "string" as input unless they also take a parameter describing the encoding of it (that, or requiring a certain encoding - which should probably be UTF-8 for guaranteeing full functionality) because otherwise you simply can't know how to convert it properly. Most other functions take binary data and length and then the caller can decide to convert [correctly] to UTF-16BE before calling them.

A super-quick "workaround" would be to clearly document that the function only work correctly for ASCII and ISO-8859-1. But then you wouldn't be compatible with libs that do it right.

This bug is particularly bad in that if you, say, encrypt a certificate on a system using mbedTLS and then try to decrypt it on a system without this bug, your password won't work. Or vice versa. We are seeing examples of this.

magnumripper · 2019-12-22T10:18:54Z

(See below) ~~Please note that I haven't yet checked how OpenSSL handles this. I wouldn't be too surprised if they do the same, and in that case you are bug compatible with them.~~

The software we've seen so far that does the correct conversion happens to be running on Windows.

ciarmcom · 2019-12-22T13:25:15Z

Internal Jira reference: https://jira.arm.com/browse/IOTCRYPT-1032

magnumripper · 2019-12-22T18:33:36Z

Apparently OpenSSL doesn't have this bug if used correctly: It has functions like OPENSSL_asc2uni() which works just like your code but it also has OPENSSL_utf82uni() and as long as the latter is used, caller is bug free in this aspect. The UTF-8 functions were added in November 2015.

Edit:
https://mta.openssl.org/pipermail/openssl-dev/2016-August/008371.html

gilles-peskine-arm · 2020-01-30T19:40:54Z

Mbed TLS mostly focuses on embedded devices where code size is at a premium and non-ASCII character encodings are rarely used, and we try to avoid breaking backward compatibility. So I favor:

Documenting that pkcs12_pbe_derive_key_iv expects its argument pwd to be encoded in latin1.
Adding a new function that supports Unicode. I don't know what the argument type should be: wchar_t? (Would create a dependency on that type, which freestanding C implementations might not have.) uint16_t? (Would this be useful in practice?) And should it be documented as taking a BMPString per RFC 7292 or UTF-16 (allowing characters outside the basic plane)?

ciarmcom added type: bug Jira status: OPEN labels Dec 22, 2019

ciarmcom added the mirrored label Dec 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PKCS#12 PBE doesn't convert to UTF-16BE correctly #337

PKCS#12 PBE doesn't convert to UTF-16BE correctly #337

magnumripper commented Dec 22, 2019 •

edited

Loading

magnumripper commented Dec 22, 2019 •

edited

Loading

ciarmcom commented Dec 22, 2019

magnumripper commented Dec 22, 2019 •

edited

Loading

gilles-peskine-arm commented Jan 30, 2020

PKCS#12 PBE doesn't convert to UTF-16BE correctly #337

PKCS#12 PBE doesn't convert to UTF-16BE correctly #337

Comments

magnumripper commented Dec 22, 2019 • edited Loading

magnumripper commented Dec 22, 2019 • edited Loading

ciarmcom commented Dec 22, 2019

magnumripper commented Dec 22, 2019 • edited Loading

gilles-peskine-arm commented Jan 30, 2020

magnumripper commented Dec 22, 2019 •

edited

Loading

magnumripper commented Dec 22, 2019 •

edited

Loading

magnumripper commented Dec 22, 2019 •

edited

Loading