Incorrect parsing of indefinite length CBOR strings. #961

Closed
@johnfb

Description

I found this library, like a lot about it, and decided to browse the source code a little. I am thinking of using it in my next project. While browsing I found the following problem:

case 0x7F: // UTF-8 string (indefinite length)

According to section 2.2.2 of RFC 7049 (CBOR), an indefinite-length string is a sequence of definite-length strings terminated by the "break" byte 0xFF. As such, the following code snippet should print the same string twice.

    using json = nlohmann::json;
    std::vector<uint8_t> v_cbor = {
        0x7F,
            0x64,
                'a', 'b', 'c', 'd',
            0x63,
                '1', '2', '3',
            0xFF
    };
    json j = json::from_cbor(v_cbor);
    std::cout << "\"abcd123\"\n";
    std::cout <<  j << std::endl;

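But as of version 3.1.0 it prints the following, with the chunk header bytes 0x64 ('d') and 0x63 ('c') copied into the result as if they were string content: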

"abcd123"
"dabcdc123"

Changing that case to something like:

            case 0x7F: // UTF-8 string (indefinite length)
            {
                std::vector<string_t> result_list;
                std::size_t size = 0;
                while (get() != 0xFF)
                {
                    unexpect_eof();
                    result_list.push_back(get_cbor_string());
                    size += result_list.back().size();
                }
                string_t result;
                result.reserve(size);
                for (const auto& r : result_list) { result.append(r); }
                return result;
            }

Fixes it. I'm not sure if this is the way you want to do it though.
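
For reference, here is a minimal standalone sketch of the chunked decoding that section 2.2.2 describes, independent of the library internals. `decode_indefinite_text` is a hypothetical helper, not part of nlohmann::json, and it assumes each chunk uses one of the short length forms (0x60..0x77, or 0x78 followed by a one-byte length); longer length encodings are rejected rather than handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a nlohmann::json API): decodes a single CBOR
// indefinite-length text string (RFC 7049, section 2.2.2) held in `in`.
// Assumption: chunk lengths use only the inline form (0x60..0x77) or the
// one-byte form (0x78 nn); everything else is treated as an error.
std::string decode_indefinite_text(const std::vector<std::uint8_t>& in)
{
    std::size_t pos = 0;

    // Returns the next byte, or throws if the input ends early.
    auto next = [&]() -> std::uint8_t {
        if (pos >= in.size()) {
            throw std::runtime_error("unexpected end of input");
        }
        return in[pos++];
    };

    if (next() != 0x7F) {
        throw std::runtime_error("expected 0x7F (indefinite-length text string)");
    }

    std::string result;
    for (;;) {
        const std::uint8_t head = next();
        if (head == 0xFF) {
            return result;  // the "break" byte terminates the string
        }

        std::size_t len = 0;
        if (head >= 0x60 && head <= 0x77) {
            len = head & 0x1F;  // length is stored in the low 5 bits
        } else if (head == 0x78) {
            len = next();       // length is stored in the following byte
        } else {
            // Includes a nested 0x7F, which the RFC does not allow in a chunk.
            throw std::runtime_error("unsupported or invalid chunk header");
        }

        if (len > in.size() - pos) {
            throw std::runtime_error("chunk runs past end of input");
        }
        // Copy only the chunk payload; the header byte is never appended.
        result.append(reinterpret_cast<const char*>(in.data() + pos), len);
        pos += len;
    }
}
```

Passing the `v_cbor` bytes from the snippet above to this helper yields `abcd123`, because each chunk header is consumed as a length rather than appended to the output.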
