Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3856: [C#] Hardcode MaxDepth property to 192 when using Newtonsoft JsonReader to load Avro schema up to 64 depth levels #2519

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tradercentric
Copy link

@tradercentric tradercentric commented Sep 25, 2023

What is the purpose of the change

I am an end-user/developer to use Confluent Kafka to consume a Avro message where the number of nested structures is around 20. The producer/vendor cannot reduce the depth level of the Avro schema. Confluent support has assisted me to open an issue with Apache Avro project. The issue is logged in https://issues.apache.org/jira/browse/AVRO-3856. I am in need of code fix so I can continue.

The current code in Schema.cs parsing Avro schema using JObject.Parse(json), JArray.Parse(json) do not have option to override max depth level. On JamesNK/Newtonsoft.Json#2904, Newtonsoft Author advised to use JObject.Load(Jsonreader), so we can set the MaxDepth in JsonReader once instantiated.

In doing so, I also noticed the Avro schema depth level is not counted the same as in JsonReader's. The default depth level limit of 64 in JsonReader can only support around 20 levels of Avro schema.

When putting break point in JsonReader Push method (called by Load method) with various Avro schema depth levels, here is my observation:

Avro Schema Depth | JsonReader Depth Count
4 | 11
16 | 44
32 | 92
64 | 188

To compensate, I had hardcoded JsonReader MaxDepth to 192 to support parsing of Avro Schema with 64 level depth.

Verifying this change

This change added tests and can be verified as follows:

I added 3 test cases in SchemaTests.cs:

Parse16DepthLevelSchemaTest
Parse32DepthLevelSchemaTest
Parse64DepthLevelSchemaTest

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

…3x by setting JsonReader MaxDepth to 192. Also I took the Newtonsoft author advice to use JObject.Load(JsonReader) and JArray.Load(JsonReader) instead of their Parse method. So I can set the JsonReader MaxDepth.
@github-actions github-actions bot added the C# label Sep 25, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Fixing Newtonsoft usage in Schema.cs to support up to 64 level depth in Avro schema AVRO-3856: [C#] Fixing Newtonsoft usage in Schema.cs to parse up to 64 level depth in Avro schema Sep 25, 2023
Copy link
Contributor

@KalleOlaviNiemitalo KalleOlaviNiemitalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Avro.IO.JsonDecoder also need to be changed? I mean, if Avro supports a 64-level record schema, then it should also be possible to take construct an instance of such a record, encode it to JSON, and decode it back.

lang/csharp/src/apache/main/Schema/Schema.cs Outdated Show resolved Hide resolved
lang/csharp/src/apache/main/Schema/Schema.cs Outdated Show resolved Hide resolved
lang/csharp/src/apache/main/Schema/Schema.cs Outdated Show resolved Hide resolved
lang/csharp/src/apache/main/Schema/Schema.cs Outdated Show resolved Hide resolved
2. Followed JObject.Parse and JArray.Parse on reading the remaining json for error
3. Added a real world testcase of a schema where parsing failed.  The Avro schema is around JsonReader defaulted 64 max depth level
@tradercentric tradercentric changed the title AVRO-3856: [C#] Fixing Newtonsoft usage in Schema.cs to parse up to 64 level depth in Avro schema AVRO-3856: [C#] Avro schema depth limit i Sep 27, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Avro schema depth limit i AVRO-3856: [C#] Avro schema depth limit cap is around 20 levels Sep 27, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Avro schema depth limit cap is around 20 levels AVRO-3856: [C#] Avro schema nested structures limit is around 20 which is not enough in some cases Sep 27, 2023
@tradercentric
Copy link
Author

Added comment of a workaround on end user/developer side in https://issues.apache.org/jira/browse/AVRO-3856.

@KalleOlaviNiemitalo
Copy link
Contributor

In my opinion, the code changes here are now OK to merge, but AVRO-3856 should either

  • be left open until applications can actually customize the max depth, or
  • be reworded to request only a larger constant max depth rather than the ability to customize.

@emasab
Copy link

emasab commented Sep 28, 2023

Thanks @tradercentric @KalleOlaviNiemitalo I've updated the issue so it corresponds to the fix.

@tradercentric
Copy link
Author

I noticed there is no deserialization error in Java Avro library with the same schema.
May I ask whose Confluent developers qualified in improving Apache.Avro C# library can lead the effort to allow customization of max depth please?

@tradercentric tradercentric changed the title AVRO-3856: [C#] Avro schema nested structures limit is around 20 which is not enough in some cases AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft Jsonreader to parse Avro schema up to 64 depth level Sep 28, 2023
@tradercentric
Copy link
Author

tradercentric commented Sep 28, 2023

Hi @KalleOlaviNiemitalo, I have reworded the title to reflect the wish to hardcode value of 192 to support parsing of Avro schema up to 64 depth levels. May I ask if this is okay to merge for this tactical fix please?

@tradercentric tradercentric changed the title AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft Jsonreader to parse Avro schema up to 64 depth level AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to parse Avro schema up to 64 depth level Sep 28, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to parse Avro schema up to 64 depth level AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to parse Avro schema up to 64 depth levels Sep 28, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to parse Avro schema up to 64 depth levels AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to load Avro schema up to 64 depth levels Sep 28, 2023
@tradercentric tradercentric changed the title AVRO-3856: [C#] Hardcode max depth to 192 when using Newtonsoft JsonReader to load Avro schema up to 64 depth levels AVRO-3856: [C#] Hardcode MaxDepth property to 192 when using Newtonsoft JsonReader to load Avro schema up to 64 depth levels Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants