Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow deserialization when properties case doesn't match #251

Merged
merged 22 commits into from
Oct 20, 2022
Merged

Allow deserialization when properties case doesn't match #251

merged 22 commits into from
Oct 20, 2022

Conversation

torbacz
Copy link
Contributor

@torbacz torbacz commented Oct 3, 2022

Description

  • Rework setting values - replaced with IMemberResolver
  • Two implementation for IMemberResolver - CaseSensitive (old one, default), CaseInsensitive
  • Implemented few more converters and fixed others
  • Created new static class "Settings" for easier managing JSON serialization. Currently, Only switch for CaseSensitive
  • New unit tests to cover new areas

NOTE: Benchmarks without IntArray and ShortArray due to possible bug.

Benchmark name Method name Current master New New vs Master
ReferenceTypesSerializationBenchmark IntArray 146 140 -6
  ShortArray 142 134 -8
  String 64 62 -2
  NestedClass 160 146 -14
  ComplexObject 252 230 -22
  ComplexArrayObject 440 412 -28
  ArrayList 460 412 -48
ReferenceTypesDeserializationBenchmark IntArray 732 0 -732
  ArrayList 1192 1160 -32
  ComplexObjectAzureTwinPayload 572 584 12
  ShortArray 602 0 -602
  String 180 156 -24
  NestedClass 196 242 46
  ComplexArrayObject 734 890 156
ValueTypesDeserializationBenchmark Short 208 210 2
  TimeSpanT 348 350 2
  Float 224 224 0
  Double 238 234 -4
  DateTimeT 304 310 6
  Long 238 236 -2
  EnumBoxed 318 318 0
ValueTypesSerializationBenchmark Short 124 130 6
  TimeSpanT 138 148 10
  Float 176 174 -2
  Double 164 168 4
  DateTimeT 1562 1300 -262
  Long 130 136 6
  Enum 150 158 8
  EnumBoxed 650 590 -60

Motivation and Context

How Has This Been Tested?

Screenshots

Types of changes

  • Improvement (non-breaking change that improves a feature, code or algorithm)
  • Bug fix (non-breaking change which fixes an issue with code or algorithm)
  • New feature (non-breaking change which adds functionality to code)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Config and build (change in the configuration and build system, has no impact on code or features)
  • Dependencies (update dependencies and changes associated, has no impact on code or features)
  • Unit Tests (add new Unit Test(s) or improved existing one(s), has no impact on code or features)
  • Documentation (changes or updates in the documentation, has no impact on code or features)

Checklist:

  • My code follows the code style of this project (only if there are changes in source code).
  • My changes require an update to the documentation (there are changes that require the docs website to be updated).
  • I have updated the documentation accordingly (the changes require an update on the docs in this repo).
  • I have read the CONTRIBUTING document.
  • I have tested everything locally and all new and existing tests passed (only if there are changes in source code).
  • I have added new tests to cover my changes.

@nfbot nfbot added the Type: Bug Something isn't working label Oct 3, 2022
@torbacz torbacz changed the title Cleanup Allow serialization when properties case doesn't match. Oct 3, 2022
@nfbot nfbot changed the title Allow serialization when properties case doesn't match. Allow serialization when properties case doesn't match Oct 3, 2022
@torbacz torbacz changed the title Allow serialization when properties case doesn't match Allow deserialization when properties case doesn't match Oct 3, 2022
@torbacz
Copy link
Contributor Author

torbacz commented Oct 4, 2022

Each benchmark was 2500 times resolving the same property, each test class contained 5 fields/properties and the last one was the one which was resolved.

[IterationCount(10)]
public class ResolverFieldBenchmark
{
    private class ClassWithField
    {
        public string TestField1;
        public string TestField2;
        public string TestField3;
        public string TestField4;
        public string TestField5;
    }

    private CaseSensitiveResolver caseSensitiveResolver;
    private CaseInsensitiveResolver caseInsensitiveResolver;
    private Type typeToCheck;
    private const short ResolvingCount = 2500;

    [Setup]
    public void SetUp()
    {
        caseSensitiveResolver = new CaseSensitiveResolver();
        caseInsensitiveResolver = new CaseInsensitiveResolver();
        typeToCheck = typeof(ClassWithField);
    }

    [Benchmark]
    [Baseline]
    public void CaseSensitive_FieldResolve()
    {
        for (int i = 0; i < ResolvingCount; i++)
        {
            var field = caseSensitiveResolver.GetResolver(nameof(ClassWithField.TestField5), typeToCheck);
        }
    }

    [Benchmark]
    public void CaseInsensitive_FieldResolve()
    {
        for (int i = 0; i < ResolvingCount; i++)
        {
            var field = caseInsensitiveResolver.GetResolver(nameof(ClassWithField.TestField5), typeToCheck);
        }
    }
}

[IterationCount(10)]
public class ResolverPropertyBenchmark
{
    private class ClassWithProperty
    {
        public string TestField1 { get; set; }
        public string TestField2 { get; set; }
        public string TestField3 { get; set; }
        public string TestField4 { get; set; }
        public string TestField5 { get; set; }
    }

    private CaseSensitiveResolver caseSensitiveResolver;
    private CaseInsensitiveResolver caseInsensitiveResolver;
    private Type typeToCheck;
    private const short ResolvingCount = 2500;

    [Setup]
    public void SetUp()
    {
        caseSensitiveResolver = new CaseSensitiveResolver();
        caseInsensitiveResolver = new CaseInsensitiveResolver();
        typeToCheck = typeof(ClassWithProperty);
    }

    [Benchmark]
    [Baseline]
    public void CaseSensitive_PropertyResolve()
    {
        for (int i = 0; i < ResolvingCount; i++)
        {
            var field = caseSensitiveResolver.GetResolver(nameof(ClassWithProperty.TestField5), typeToCheck);
        }
    }

    [Benchmark]
    public void CaseInensitive_PropertyResolve()
    {
        for (int i = 0; i < ResolvingCount; i++)
        {
            var field = caseInsensitiveResolver.GetResolver(nameof(ClassWithProperty.TestField5), typeToCheck);
        }
    }
}
Console export: ResolverPropertyBenchmark benchmark class.
| MethodName                     | ItterationCount | Mean     | Ratio  | Min       | Max       |
| -------------------------------------------------------------------------------------------- |
| CaseSensitive_PropertyResolve  | 10              | 2891 ms  | 1.0    | 2,860 ms  | 2,920 ms  |
| CaseInensitive_PropertyResolve | 10              | 16589 ms | 5.7381 | 16,550 ms | 16,630 ms |


Console export: ResolverFieldBenchmark benchmark class.
| MethodName                   | ItterationCount | Mean    | Ratio  | Min      | Max      |
| --------------------------------------------------------------------------------------- |
| CaseSensitive_FieldResolve   | 10              | 1911 ms | 1.0    | 1,880 ms | 1,950 ms |
| CaseInsensitive_FieldResolve | 10              | 5825 ms | 3.0481 | 5,780 ms | 5,860 ms |

Like i thought, current implementation for insensitive members resolving is much slower. It's none default one, so it has no impact on standard mode.

@nfbot nfbot added the Type: documentation Improvements or additions to documentation label Oct 4, 2022
@torbacz torbacz marked this pull request as ready for review October 4, 2022 18:13
@Ellerbach
Copy link
Member

Like i thought, current implementation for insensitive members resolving is much slower. It's none default one, so it has no impact on standard mode.

That was indeed expected :-) Bad can we make a simple test to find a property and if nothing found then try the insensitive case?

So we would go the normal way and deserialize what can be. Once done, just checking is anything is left from both the json token and the class. If something left, let's try the insensitive case.

We'll still loose few cycles because of the minimum check of what has been deserialized and what's left but that's it for the main part. And then we won't lose as much on what's left because we know what's left!

What do you think?

@torbacz
Copy link
Contributor Author

torbacz commented Oct 4, 2022

It would mean:

  • Double iterate in worst case scenario
  • We need to track/remove already handled objects - memory impact when tracking or performance when removing. Also we don't want to remove objects from collection we are iterating through. It may lead to very nasty bugs.
  • Then we need to call the same code, but with case insensitive resolver
    I'm a little bit worried about about performance and memory usage.

I would rather go with similar approach as current, but instead of having two "Resolvers" just merge them into one. In short, if the "case sensitive" can't find member, then call "case insensitive". If "case insensitive" can't find member - then throw exception. It would also mean removing few lines of code and classes.
Any thoughts?

@josesimoes
Copy link
Member

I'm not seeing an easy alternative to tackle this. Any of the approaches that are possible with what we have available will incurr in a performance penalty.

Questions:

  1. Do we really need to add support for case insensitive deserialization? I mean, this is nano, we make compromises every day and with most APIs. This is one of those.
  2. Can look into adding support for case insensitive in the CLR. Then we would just have to add the case sensitive property to configure this, like with the full framework. Is it worth it? This hasn't been implemented since NETMF time. Probably for a good reason.

@Ellerbach
Copy link
Member

2. Can look into adding support for case insensitive in the CLR. Then we would just have to add the case sensitive property to configure this, like with the full framework. Is it worth it? This hasn't been implemented since NETMF time. Probably for a good reason.

Yes, we can definitely add an option for that. That will allow to add options for other scenarios like in the larger framework. And we then should be very very very clear on the performance impact. Still if we add the option, I do think we should try to minimize the perf impact. But 100% agree, there will be one!

@torbacz
Copy link
Contributor Author

torbacz commented Oct 5, 2022

@josesimoes
Well, in only nano we can work with case sensitive deserialization. But the problem is when we start integrating nano devices with .NET API which by default serialize into JSON with lowercase first letter in json property name. So the property "TestProperty" becomes "testProperty". Of course we can change configuration for .NET API, but it would require to change it in every new API. The use case for such mechanism is when we want to configure nano device from .NET application (colleague of mine implemented such UI with Blazor and nanoFramework. Announcement, soon 😄)

One more proposition from me.
https://github.com/nanoframework/nanoFramework.Json/blob/9ff4e29e4b9dbec7903618d3e0b64434c1f1925e/nanoFramework.Json/Resolvers/CaseSensitiveResolver.cs

If we can't find property with match case, we are calling case insensitive methods. If case insensitive fails, then throw exception.

@Ellerbach
Copy link
Member

If we can't find property with match case, we are calling case insensitive methods. If case insensitive fails, then throw exception.

We should not throw an exception, a property can be absent and kept with default value or null.

I'm fine with the following logic:

  • Add an option to have the possible check for case insensitive
  • Add an option to raise an exception if the property is not found
  • By default, case sensitive (like today), no exception raise (like today)
  • case insensitive and exception raised can be used together or separately

I would still love to see the impact on the performance, but with those options, I'm quite sure it's very minimum and very acceptable.

colleague of mine implemented such UI with Blazor and nanoFramework. Announcement, soon

Can't wait :-D

@josesimoes
Copy link
Member

I'll look into adding support for case sensitive comparison in mscorlib. Just can't promise it for tomorrow...

@torbacz
Copy link
Contributor Author

torbacz commented Oct 6, 2022

@Ellerbach
I've just pushed new files. Please check the tests for resolver, they describe all resolver logic.
https://github.com/torbacz/nanoFramework.Json/tree/cleanup/nanoFramework.Json.Test/Resolvers

@josesimoes
Take your time, for now we can just go with "ToLower" and update it later.

@Ellerbach
Copy link
Member

I've just pushed new files. Please check the tests for resolver, they describe all resolver logic.
https://github.com/torbacz/nanoFramework.Json/tree/cleanup/nanoFramework.Json.Test/Resolvers

Thanks. I have to say I like the logic. And for the settings, the throw if member does not exist should be false by default here https://github.com/torbacz/nanoFramework.Json/blob/cc521313b5869e784916ed4d3a827d69a6e37598/nanoFramework.Json/Configuration/Settings.cs#L18

So this will still involve a little bit of performance issue compare to the "past" use case where we had nothing at all. But with the recent improvements, we can for sure lose a little bit to be able to get this very nice addition.

What I also like is the possibility to have your own resolver. That can be very handy to map some enums or strings or whatever.

@torbacz
Copy link
Contributor Author

torbacz commented Oct 7, 2022

New benchmark results
image

Legend:

  • Green -> better
  • Red -> worse
  • Yellow -> no change/no benchmark on new code

I'm a little bit worried about about deserialization for "NestedClasss" and "ComplexArrayObject". I want to investigate it before merging.

Edit: Also, explicitly pointed out ThrowExceptionWhenPropertyNotFound is false by default.

@josesimoes
Copy link
Member

@torbacz this is looking good! I share your worries. Those are heavily used cases for json payloads...

@torbacz
Copy link
Contributor Author

torbacz commented Oct 12, 2022

@josesimoes I found it, the "problem" is in StringConverter, when I've changed it to "just return input object" the performance was pretty much the same.
image

But it needs to go through all the logic to escape all required characters in string object.

In my option we should:

  1. Create several issues based on findings:
  • Improve performance of StringConverter.ToType
  • Remove JsonValue DateTime conversion and change logic in DateTime converter (currently it's returning input object without conversion)
  • Try to remove conversion from private static JsonToken ParseValue(ref LexToken token) method (should be handled by converters)
  • Maybe something else?
  1. Merge this PR

Then someone (probably I'll take it) can work on further cleanup/performance in JSON.

@josesimoes
Copy link
Member

@torbacz nice finding!! 👏🏻 persistence pays off!! 😉

There's a project for the json library.

Please add any tasks and to-do's there. We can move those to issues in order to track progress as needed.
We shouldn't clutter the issue list with todo's and planned stuff, otherwise is becomes messy and hard to read, specially from outsiders looking at the project.

I would suggest that we discuss this on Discord. It's a more suitable and streamlined experience. (we can add the URL for the thread here for reference)

@torbacz
Copy link
Contributor Author

torbacz commented Oct 19, 2022

@josesimoes
Finally, I've got some time to finish this one.
I've added tasks in the project page. We can discus them later on discord.

Are we ready to merge this one?

Copy link
Member

@josesimoes josesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@josesimoes
Copy link
Member

@torbacz I'm good with these changes. OK for discussing on Discord. 👍🏻

@Ellerbach any other take on this?

@Ellerbach
Copy link
Member

@torbacz I'm good with these changes. OK for discussing on Discord. 👍🏻

@Ellerbach Laurent Ellerbach FTE any other take on this?

I'm good! Let's merge it!

@josesimoes josesimoes merged commit f727302 into nanoframework:main Oct 20, 2022
@torbacz torbacz deleted the cleanup branch October 20, 2022 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest-accepted Type: Bug Something isn't working Type: documentation Improvements or additions to documentation Type: enhancement Type: Unit Tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to deserialize property when property name case doesn't match
4 participants