System.Text.Json: add ability to do semantic comparisons of JSON values à la JToken.DeepEquals() #33388

Closed
@dbc2

Description

API proposal

```csharp
namespace System.Text.Json;

public partial struct JsonElement
{
    public static bool DeepEquals(JsonElement element1, JsonElement element2);
}

// Existing API
public partial class JsonNode
{
    public static bool DeepEquals(JsonNode? node1, JsonNode? node2);
}
```
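To illustrate the call shape, here is a minimal usage sketch of the existing `JsonNode.DeepEquals`, assuming .NET 8 or later. The `DeepEqualsUsage` class and its method names are hypothetical and exist only for this example. The proposed `JsonElement.DeepEquals` would presumably be called the same way, with `JsonElement` arguments.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical wrapper class, used only to illustrate the existing API.
public static class DeepEqualsUsage
{
    // Same JSON value, different whitespace between tokens.
    public static bool SameIgnoringWhitespace() =>
        JsonNode.DeepEquals(
            JsonNode.Parse("{\"a\":[1,2]}"),
            JsonNode.Parse("{ \"a\" : [ 1, 2 ] }"));

    // JsonNode.Parse returns a null reference for the JSON literal null,
    // which is why the JsonNode overload has to accept nullable arguments.
    public static bool JsonNullEqualsNullReference() =>
        JsonNode.DeepEquals(JsonNode.Parse("null"), null);

    // Same property name, different value.
    public static bool DifferentValues() =>
        JsonNode.DeepEquals(
            JsonNode.Parse("{\"a\":1}"),
            JsonNode.Parse("{\"a\":2}"));
}
```

The second method shows the null handling that motivates the static shape for `JsonNode`: an instance method could not be invoked on a null receiver.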

Alternative Designs

The static method mirrors the shape of JsonNode.DeepEquals. In that case a static is required because null is a valid JsonNode value (it represents JSON null). That concern does not apply to JsonElement, so an instance method could be used here instead.

In terms of implementation, property names are always compared using case-sensitive ordinal comparison; however, we might consider introducing case-insensitive comparison as well.

Original post

Json.NET has the ability to do a deep semantic comparison of two JSON tokens via [`JToken.DeepEquals()`](https://www.newtonsoft.com/json/help/html/M_Newtonsoft_Json_Linq_JToken_DeepEquals.htm):

> Compares the values of two tokens, including the values of all descendant tokens.

It also provides JTokenEqualityComparer for hashing and comparing of JSON tokens.

There does not appear to be equivalent functionality in System.Text.Json for comparing two JsonElement or JsonDocument objects. As this is a common requirement (e.g. when writing unit tests, or checking for differences between object versions), it would be useful to have.

Sample Stack Overflow questions:

Issues:

  1. Formatting should be ignored.

  2. Differences due to string escaping probably should be ignored. (E.g. "a" should be equivalent to "\u0061".)

  3. Differences due to trailing zeroes probably should be significant. When deserializing to decimal trailing zeroes are preserved; financial, engineering and scientific apps sometimes make use of this.

    Example Stack Overflow questions:

    Utf8JsonReader preserves the underlying character representation when parsing a number so you have the advantage over JsonTextReader here, as the latter discards the character representation after recognizing a token as a number.

  4. Differences due to ordering of unique property names should be ignored, since the original JSON proposal states: "An object is an unordered set of name/value pairs."

  5. Differences due to ordering of duplicate property names require some thought.

    I have noticed that, surprisingly, JsonDocument fully supports duplicate property names! I.e. it's perfectly happy to parse {"Value":"a", "Value" : "b"} and will store both key/value pairs inside the document. (Thanks for that, I guess.)

    A close reading of https://tools.ietf.org/html/rfc8259#section-4 seems to indicate that such objects are allowed but not recommended, and when they arise, interpretation of identically-named properties may be order-dependent. So I'd propose that relative order of identically-named properties should be significant while relative order of distinctly named properties should not. Such a proposal could be implemented by stably sorting the properties by name, then comparing the sorted property lists in order.

  6. Differences due to casing of property names should be significant by default, but possibly there should be a configuration argument to ignore property name casing. (JToken.DeepEquals() doesn't support this.)

One possible demo comparer can be found here, but better performance may be possible by using internal methods.
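The rules listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the linked demo comparer and not the eventual framework implementation. `JsonElementSketchComparer` is a hypothetical name. Numbers are compared by their raw text, which keeps trailing zeroes significant but also treats other spellings of the same value (such as `1e2` and `100`) as different. Properties are stably sorted by name with `OrderBy`, so the relative order of identically-named properties stays significant while the order of distinctly-named properties does not.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical helper for illustration; not part of System.Text.Json.
public static class JsonElementSketchComparer
{
    public static bool DeepEquals(JsonElement x, JsonElement y)
    {
        if (x.ValueKind != y.ValueKind)
            return false;

        switch (x.ValueKind)
        {
            case JsonValueKind.Object:
            {
                // OrderBy is a stable sort: duplicate names keep their document order.
                var xProps = x.EnumerateObject().OrderBy(p => p.Name, StringComparer.Ordinal).ToList();
                var yProps = y.EnumerateObject().OrderBy(p => p.Name, StringComparer.Ordinal).ToList();
                if (xProps.Count != yProps.Count)
                    return false;
                for (int i = 0; i < xProps.Count; i++)
                {
                    // Property names are compared case-sensitively (issue 6).
                    if (!string.Equals(xProps[i].Name, yProps[i].Name, StringComparison.Ordinal))
                        return false;
                    if (!DeepEquals(xProps[i].Value, yProps[i].Value))
                        return false;
                }
                return true;
            }
            case JsonValueKind.Array:
                // Array element order is always significant.
                return x.GetArrayLength() == y.GetArrayLength()
                    && x.EnumerateArray()
                        .Zip(y.EnumerateArray(), (a, b) => DeepEquals(a, b))
                        .All(equal => equal);
            case JsonValueKind.String:
                // GetString() unescapes, so "a" and "\u0061" compare equal (issue 2).
                return string.Equals(x.GetString(), y.GetString(), StringComparison.Ordinal);
            case JsonValueKind.Number:
                // Assumption: compare the raw characters so 1.0 and 1.00 differ (issue 3).
                return x.GetRawText() == y.GetRawText();
            default:
                // True, False, Null: equal whenever the kinds match.
                return true;
        }
    }
}
```

Sorting both property lists costs O(n log n) per object and allocates; as noted above, an implementation with access to internal methods could likely do better.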

Labels: api-approved, area-System.Text.Json, in-pr
