[Internal] JSON Binary Encoding: Adds support for encoding uniform arrays (#4866)

## Description

Added full end-to-end support for writing and reading binary-encoded uniform number arrays, as well as nested arrays of uniform number arrays.

**Uniform Number Arrays**

A uniform number array is a JSON array in which all items share the same numeric type. The encoding supports the following numeric types:

- **Int8**: Signed 1-byte integer (-128 to 127)
- **UInt8**: Unsigned 1-byte integer (0 to 255)
- **Int16**: Signed 2-byte integer
- **Int32**: Signed 4-byte integer
- **Int64**: Signed 8-byte integer
- **Float16**: 2-byte floating-point value (currently unsupported)
- **Float32**: 4-byte floating-point value
- **Float64**: 8-byte floating-point value

Uniform number arrays are represented by these new type markers:

- **ArrNumC1**: Uniform number array with a 1-byte item count
- **ArrNumC2**: Uniform number array with a 2-byte item count

Both type markers are encoded as follows:

`| Type marker | Item type marker | Item count |`

To maintain backward compatibility, writing uniform number arrays is controlled by the `EnableNumberArrays` write option. When it is enabled, the writer checks at the end of writing an array whether all values are numeric. If they are, it identifies the smallest numeric type that fits all values and compares the encoded length of the uniform number array with that of the regular array. If the new length is less than or equal to the old one, the array is converted to a uniform number array.

**Arrays of Uniform Number Arrays**

This enhancement allows multiple uniform number arrays that share the same underlying numeric type and item count to be encoded as a single contiguous array of numbers. The items of all arrays are preceded by a prefix indicating the common array encoding and the number of encoded arrays.

Arrays of uniform number arrays are represented by these two new type markers:

- **ArrArrNumC1C1**: Array, with a 1-byte item count, of uniform number arrays that share a common 1-byte item count.
- **ArrArrNumC2C2**: Array, with a 2-byte item count, of uniform number arrays that share a common 2-byte item count.

Both type markers are encoded as follows:

`| Type marker | Array type marker | Number type marker | Number item count | Array item count |`

As with uniform number arrays, arrays of uniform number arrays are written only when the `EnableNumberArrays` write option is specified. This ensures backward compatibility with readers and navigators that do not yet support this encoding.

**JSON Serialization Testing**

- Introduced a new set of tests for both uniform number arrays and nested arrays of uniform number arrays.
- Enhanced the `JsonToken` class to support representation of uniform number array tokens.
- Updated `JsonWriterTest` to include additional validation. It now checks the expected output and also verifies round-trip consistency across different formats and write options for all three rewrite scenarios: JSON Navigator, JSON Reader - Write All, and JSON Reader - Write Current Token.

## Type of change

Please delete options that are not relevant.

- [ ] New feature (non-breaking change which adds functionality)

## Closing issues
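
For readers who want to see the two layouts from the description in concrete form, here is a minimal Python sketch. It is not the SDK's implementation. The marker byte values, the little-endian byte order, the Float32 exact round-trip rule, the `regular_length` parameter, and all function names are assumptions made for illustration. Only the field order of the two prefixes, the list of numeric types, and the "less than or equal" length rule come from the description.

```python
import struct
from typing import Optional, Sequence

# Hypothetical marker bytes, chosen for illustration only.
ARR_NUM_C1, ARR_NUM_C2 = 0xA0, 0xA1
ARR_ARR_NUM_C1C1, ARR_ARR_NUM_C2C2 = 0xB0, 0xB1

# Integer types from narrowest to widest: (name, min, max).
INT_TYPES = [
    ("Int8", -2**7, 2**7 - 1),
    ("UInt8", 0, 2**8 - 1),
    ("Int16", -2**15, 2**15 - 1),
    ("Int32", -2**31, 2**31 - 1),
    ("Int64", -2**63, 2**63 - 1),
]
# name -> (hypothetical item type marker, struct format character).
ITEM_TYPES = {
    "Int8": (0x01, "b"), "UInt8": (0x02, "B"), "Int16": (0x03, "h"),
    "Int32": (0x04, "i"), "Int64": (0x05, "q"),
    "Float32": (0x06, "f"), "Float64": (0x07, "d"),
}


def _fits_float32(value: float) -> bool:
    """Assumed rule: a value fits Float32 if it survives a round trip unchanged."""
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0] == value
    except OverflowError:
        return False


def smallest_item_type(values: Sequence[object]) -> Optional[str]:
    """Return the narrowest type holding every value, or None if not uniform."""
    if not values or any(
        isinstance(v, bool) or not isinstance(v, (int, float)) for v in values
    ):
        return None
    if all(isinstance(v, int) for v in values):
        lo, hi = min(values), max(values)
        for name, type_min, type_max in INT_TYPES:
            if type_min <= lo and hi <= type_max:
                return name
        return None  # outside the Int64 range
    try:
        floats = [float(v) for v in values]
    except OverflowError:
        return None
    return "Float32" if all(_fits_float32(v) for v in floats) else "Float64"


def _pack_items(item_type: str, values: Sequence[object]) -> bytes:
    fmt = ITEM_TYPES[item_type][1]
    if fmt in "fd":
        values = [float(v) for v in values]
    return struct.pack(f"<{len(values)}{fmt}", *values)


def encode_uniform_array(values, regular_length: Optional[int] = None) -> Optional[bytes]:
    """| Type marker | Item type marker | Item count | followed by the items."""
    item_type = smallest_item_type(values)
    if item_type is None or len(values) > 0xFFFF:
        return None
    marker = ITEM_TYPES[item_type][0]
    if len(values) <= 0xFF:
        header = struct.pack("<BBB", ARR_NUM_C1, marker, len(values))
    else:
        header = struct.pack("<BBH", ARR_NUM_C2, marker, len(values))
    encoded = header + _pack_items(item_type, values)
    # Convert only if the new length is less than or equal to the regular one.
    if regular_length is not None and len(encoded) > regular_length:
        return None
    return encoded


def encode_array_of_uniform_arrays(arrays) -> Optional[bytes]:
    """| Type marker | Array type marker | Number type marker |
    Number item count | Array item count | followed by all items."""
    if not arrays or len({len(a) for a in arrays}) != 1:
        return None  # every inner array must have the same item count
    flat = [v for a in arrays for v in a]
    item_type = smallest_item_type(flat)
    inner, outer = len(arrays[0]), len(arrays)
    if item_type is None or max(inner, outer) > 0xFFFF:
        return None
    marker = ITEM_TYPES[item_type][0]
    if max(inner, outer) <= 0xFF:
        header = struct.pack("<BBBBB", ARR_ARR_NUM_C1C1, ARR_NUM_C1, marker, inner, outer)
    else:
        header = struct.pack("<BBBHH", ARR_ARR_NUM_C2C2, ARR_NUM_C2, marker, inner, outer)
    return header + _pack_items(item_type, flat)
```

With these placeholder markers, `encode_uniform_array([1, 2, 3])` yields a 3-byte prefix followed by three 1-byte items, and `encode_array_of_uniform_arrays([[1, 2], [3, 4]])` yields a 5-byte prefix followed by the four items stored contiguously. Passing a `regular_length` smaller than the encoded size returns `None`, which mirrors the writer keeping the regular array encoding.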
Showing 22 changed files with 9,550 additions and 1,776 deletions.