Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String type change breaks C++ type matching #2649

Open
1 of 3 tasks
sirzooro opened this issue Feb 18, 2021 · 17 comments
Open
1 of 3 tasks

String type change breaks C++ type matching #2649

sirzooro opened this issue Feb 18, 2021 · 17 comments
Labels
kind: bug state: help needed the issue needs help to proceed

Comments

@sirzooro
Copy link

Usually when one of template parameters is changed, reference to new type cannot be implicitly converted to reference to old type. However for some reason this is not true for nlohmann::json. I tried to change std::string to my custom string class, and it turned out that references to both types are compatible for compiler. Here is example code which demonstrates this problem. It uses std::wstring instead of std::string:

#include <json.hpp>

template <
    template <typename U, typename V, typename... Args> class ObjectType = std::map,
    template <typename U, typename... Args> class ArrayType              = std::vector,
    class StringType = std::wstring, class BooleanType = bool,
    class NumberIntegerType = std::int64_t, class NumberUnsignedType = std::uint64_t,
    class NumberFloatType = double, template <typename U> class AllocatorType = std::allocator,
    template <typename T, typename SFINAE = void> class JSONSerializer = nlohmann::adl_serializer>
using MyBasicJson = nlohmann::basic_json<
    ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType,
    NumberFloatType, AllocatorType, JSONSerializer>;

using MyJson = MyBasicJson<>;

void load(const nlohmann::json& json);
//void load(const MyJson& json);

void test(MyJson* json)
{
    load(*json);
}

During compilation I should get error that parameter passed to load() has incorrect type. However this code compiles cleanly. I tried to use gcc 10.2, clang 10.0, both compiles this code without any complain. I use nlohmann::json 3.9.1. I compiled it on CentOS and Ubuntu using following command:

g++ -c -o dupa.o test.cpp -O3 -Wall -Wextra -Werror -std=c++11 -I.

I played a bit with this code trying to create minimum example. I found that when removed include and copied forward declarations from json.hpp to my file, g++ reported error as expected. So it looks that something is wrong in other part of json.hpp.

Initially I thought that this may be some gcc error, but clang also does not complain, so this is unlikely.

Which compiler and operating system are you using?

gcc 10.2 on CentOS7 (installed from RedHat SCL packages), clang 10.0 on Ubuntu Ubuntu 20.04.2 LTS

Which version of the library did you use?

  • latest release version 3.9.1
  • other release - please state the version: ___
  • the develop branch
@gregmarr
Copy link
Contributor

Discussed recently at #2628 (reply in thread)

@sirzooro
Copy link
Author

One more thing to consider: in my case my custom string class cannot be implicitly converted to std::string. I only have opposite conversion available, from std::string to my string class. When parameter is passed to function as a const reference, compiler may create temporary object as a result of implicit type conversion. However this should not happen in my case, as implicit conversion type conversion is not possible. At some point compiler should start complaining about incorrect types.

@agauniyal
Copy link

This has caused numerous issues for my case as well. I use same std::string but with a custom allocator and then override nlohmann::json class by passing custom std::string and keeping rest containers same. Doing so causes runtime crashes wherever strings are mismatched, we fix it by calling .c_str() method.

@agauniyal
Copy link

@nlohmann we define JSON_USE_IMPLICIT_CONVERSIONS as 0 and still lots of such mistakes go unnoticed. These now amount to more than half the crashes in our library. Could you help out with a patch or small code change that'd enable us catch all such ill-formed conversions?

@nlohmann
Copy link
Owner

@nlohmann we define JSON_USE_IMPLICIT_CONVERSIONS as 0 and still lots of such mistakes go unnoticed. These now amount to more than half the crashes in our library. Could you help out with a patch or small code change that'd enable us catch all such ill-formed conversions?

Can you provide examples for this?

@agauniyal
Copy link

@nlohmann

We've a custom secure_allocator for certain purposes and have to use it throughout the application. The code that makes custom string and json looks like follows -

template<typename T> using secure_basic_string = std::basic_string<T,std::char_traits<T>,secure_allocator<T>>;
using secure_string = secure_basic_string<char>;
using secure_json = nlohmann::basic_json<std::map, std::vector, secure_string>;

Now any conversion between std::string and secure_string is to be done via .c_str() methods only, otherwise compiler gives error - 'No viable conversion between ...' -

std::string a = "Hello";
secure_string b = a;  // compile error
secure_string c = a.c_str();  // compiles
std::string d = c;  // compile error
std::string e = c.str();  // compiles

But if I interchangly use secure_json, std::string and json and secure_string, also json and secure_json, then there are no compile time errors, instead the program crashes and sometimes when it doesn't, the data is complete garbage -

secure_string ss = "ss";
std::string s = "s";

json j1;
j1["k1"] = s;  // works fine
j1["k2"] = ss;  // no compile error, 99% time crashes, otherwise contains garbage
j1["k3"] = ss.c_str(); // works fine

secure_json j2;
j2["k1"] = ss;  // works fine
j2["k2"] = s;  // no compile error, 99% time crashes, otherwise contains garbage
j2["k3"] = s.c_str(); // works fine


std::string d1 = j1.dump();  // works fine
std::string d2 = j2.dump();   // no compile error, 99% time crashes, otherwise contains garbage
std::string d3 = j2.dump().c_str();  // works fine

secure_string d4 = j2.dump();  // works fine
secure_string d5 = j1.dump();  // no compile error, 99% time crashes, otherwise contains garbage
secure_string d6 = j1.dump().c_str();  // works fine


json j1 = json::parse(s);  // works fine
json j2 = json::parse(ss);  // no compile error, 99% time crashes, otherwise contains garbage
json j3 = json::parse(ss.c_str());  // works fine

secure_json j1 = secure_json::parse(ss);  // works fine
secure_json j2 = secure_json::parse(s);  // no compile error, 99% time crashes, otherwise contains garbage
secure_json j3 = secure_json::parse(s.c_str());  // works fine

@stale
Copy link

stale bot commented Jan 9, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jan 9, 2022
@sirzooro
Copy link
Author

sirzooro commented Jan 9, 2022

This still needs a fix.

@stale stale bot removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jan 9, 2022
@agauniyal
Copy link

ping @nlohmann

@nlohmann
Copy link
Owner

I am aware of the issue, but have not found the time to look into it.

@nlohmann nlohmann added the state: help needed the issue needs help to proceed label Jan 13, 2022
@kaptaanalt
Copy link

We're facing the same issue in our org, can this be solved please?

@falbrechtskirchinger
Copy link
Contributor

This might be related to #3425. I'll check if any of the provided examples trigger the assertions I've added to at least catch some problematic conversions between basic_json types.

And as an aside, std::wstring as StringType is not well supported. There're too many places where it's assumed that the underlying char type is 8bit wide. Parsing std::wstring on the other hand is well defined.

@falbrechtskirchinger
Copy link
Contributor

falbrechtskirchinger commented May 29, 2022

Ok. The people who commented on this matter seem to be experiencing related but different issues.

@sirzooro What you reported is working as designed. I agree, though, that the conversion between basic_json types should at least be explicit and trigger a compile error for your example.

@agauniyal Can you provide a minimal, self-contained example of a custom allocator that reproduces your specific problems? I have a suspicion that it's not entirely the library's fault. Also, assuming I'm inferring the purpose of secure_string correctly, I question the soundness of this approach as not all strings are allocated on the heap. Specifically, short strings may (will, in most implementations) be stored inside the string object instead of memory allocated by secure_allocator. I need to know more about secure_allocator to opine further.

@kaptaanalt Can you provide more details so I can properly assess your problem?

To summarize: The implicit conversion between different basic_json types is intentional but doesn't always work as expected. One of the worst-case outcomes is a different basic_json::type() after conversion (see #3425, where a string is converted into an array).

In any case, all these issues can currently be worked around by implementing a to_json function for the types involved and either implementing a proper conversion or throwing an exception, or aborting/terminating the process.

@agauniyal
Copy link

I'll try if can get you a mve for secure_allocator. Your understanding about it's secure nature is correct as well, I just want to add that it disables SSO by manually allocating bigger strings even in the case of small ones, such that nothing is stored on the stack.

@sirzooro
Copy link
Author

@falbrechtskirchinger my custom string class is something like @agauniyal's secure_string. I used std::wstring only as an example, to avoid pasting code of my string class here. It is something like below. With this approach I can securely allocate and clear memory used by std::string object too:

class secure_string
{
private:
  std::string<char, std::char_traits<char>, secure_allocator>* str;
public:
  // public interface which mimics std::string and (mostly) is a thin wrapper around `str`, e.g.
  size_t size() const { return str->size(); }
};

@falbrechtskirchinger
Copy link
Contributor

The workaround I mentioned before is still the only way to solve this I could think of so far. Any other solutions I considered are breaking changes and would have to wait until 4.0.

The caveat is, that even my workaround requires a minor modification of the library and I haven't done any significant testing to ensure it doesn't break anything else.

Here's the code to disable conversions between basic_json types with mismatched string types:
(alt_string is the alternative string type from the library unit test, it's a thin wrapper around std::string. Replace that with your string type.)

namespace nlohmann {
template<>
struct adl_serializer<alt_string> {
template<typename BasicJsonType, std::enable_if_t<std::is_same_v<typename BasicJsonType::string_t, alt_string>, int> = 0>
static void from_json(BasicJsonType&& j, alt_string& s) {
    assert(j.is_string());
    s = j.template get_ref<const alt_string &>();
}

template<typename BasicJsonType, std::enable_if_t<std::is_same_v<typename BasicJsonType::string_t, alt_string>, int> = 0>
static void to_json(BasicJsonType& j, const alt_string& s) {
    ::nlohmann::to_json(j, s);
}
};

template<>
struct adl_serializer<std::string> {
template<typename BasicJsonType, std::enable_if_t<std::is_same_v<typename BasicJsonType::string_t, std::string>, int> = 0>
static void from_json(BasicJsonType&& j, std::string& s) {
    assert(j.is_string());
    s = j.template get_ref<const std::string &>();
}

template<typename BasicJsonType, std::enable_if_t<std::is_same_v<typename BasicJsonType::string_t, std::string>, int> = 0>
static void to_json(BasicJsonType& j, const std::string& s) {
    ::nlohmann::to_json(j, s);
}
};
}  // namespace nlohmann

This works by SFINAE-disabling the from_json/to_json functions for mismatched string types.

In addition, the following change to include/nlohmann/json.hpp is required for the above code to compile.

  1. Find this member function of basic_json:
    /// @brief create a JSON value from compatible types
    /// @sa https://json.nlohmann.me/api/basic_json/basic_json/
    template < typename CompatibleType,
               typename U = detail::uncvref_t<CompatibleType>,
               detail::enable_if_t <
                   !detail::is_basic_json<U>::value && detail::is_compatible_type<basic_json_t, U>::value, int > = 0 >
    basic_json(CompatibleType && val) noexcept(noexcept( // NOLINT(bugprone-forwarding-reference-overload,bugprone-exception-escape)
                JSONSerializer<U>::to_json(std::declval<basic_json_t&>(),
                                           std::forward<CompatibleType>(val))))
    {
        JSONSerializer<U>::to_json(*this, std::forward<CompatibleType>(val));
        set_parents();
        assert_invariant();
    }
  2. Replace the template parameter U in detail::is_compatible_type<basic_json_t, U>::value with CompatibleType.

Again, I haven't done any further testing. Once I confirm that this modification doesn't break anything else, I'll submit a PR.

@falbrechtskirchinger
Copy link
Contributor

The patch to the CompatibleType constructor from my previous post breaks a lot of things and PR #3518 only covers the basic_json converting constructor.

Other than adding JSON_EXPLICIT to the CompatibleType constructor (which is already a breaking change), I'm out of ideas that are in scope of a 3.x release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: bug state: help needed the issue needs help to proceed
Projects
None yet
Development

No branches or pull requests

6 participants