-
Notifications
You must be signed in to change notification settings - Fork 667
Add CBOR ignoreUnknownKeys option #947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cb5e48a to
a4a477f
Compare
a4a477f to
e62191e
Compare
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
Test case was covered in: `testDecodeCborWithUnknownKeysInSealedClasses`
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
|
Small update: we are busy with releasing Kotlin 1.4.0 and release candidate of serialization right now. |
6f42ad1 to
84b99e4
Compare
This comment has been minimized.
This comment has been minimized.
|
@twyatt Sorry for not keeping in touch. I see you've rebased the branch, this should be enough of help. I'll try to return to you soon; I think this is an important feature for 1.0, so we'll try to put it there |
sandwwraith
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR and explanations — they were really helpful. I'd left some comments, mainly about exceptions, but the rest of PR is fine.
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Show resolved
Hide resolved
| } | ||
|
|
||
| fun skipElement() { | ||
| val lengthStack = mutableListOf<Int>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a little concern that MutableList<Int> may cause performance problems due to boxing, but this may be just a premature optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a multiplatform-friendly dynamically sizing stack I could use in its place? Or would I have to roll my own?
If needed I could try to build a basic stack implementation to avoid the auto-boxing but I'm not confident my implementation would be optimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, there's an IntArrayBuilder for deserializing IntArrays, but it is hidden for now. I think we can go with MutableList until we have a clear benchmarks that show if this is a real problem. @qwwdfsad WDYT?
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
|
The build check has failed — seems you have to run |
This comment has been minimized.
This comment has been minimized.
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Streams.kt
Outdated
Show resolved
Hide resolved
| val index = if (cbor.ignoreUnknownKeys) { | ||
| val knownIndex: Int | ||
| while (true) { | ||
| if (isDone()) return CompositeDecoder.DECODE_DONE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check could be lifted to the very first line of the method to simplify the control flow and make it slightly easier to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately decoder.skipElement() (line 248) will skip the current value element and we'll need to check isDone() on the next loop to keep the previous short-circuit behavior (when beginning to read the next element value).
I tried lifting isDone() line (just to be sure) and it does in fact cause a number of tests to fail.
Would it be better to remove some of the code that is duplicated in the top-most if branches by having it always enter the while loop?:
override fun decodeElementIndex(descriptor: SerialDescriptor): Int {
val index: Int
while (true) {
if (isDone()) return CompositeDecoder.DECODE_DONE
val elemName = decoder.nextString()
readProperties++
if (cbor.ignoreUnknownKeys) {
val elementIndex = descriptor.getElementIndex(elemName)
if (elementIndex == CompositeDecoder.UNKNOWN_NAME) {
decoder.skipElement()
} else {
index = elementIndex
break
}
} else {
index = descriptor.getElementIndexOrThrow(elemName)
break
}
}
decodeByteArrayAsByteString = descriptor.isByteString(index)
return index
}I could additionally create a SerialDescriptor.getElementIndexOrSkip(String) extension function, but it would need access to the decoder so could not live next to the existing getElementIndexOrThrow extension function (which is top-level)?
I agree that the control flow is a bit difficult to follow in this function, so any other suggestions/strategies to refactor and simplify it are appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather stick with the current version — in the simple case (when ignoreUnknownKeys=false) it does not bother you with infinite loop. I think we can live with 3 lines of code duplicated.
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Show resolved
Hide resolved
Co-authored-by: Leonid Startsev <sandwwraith@users.noreply.github.com>
Co-authored-by: Leonid Startsev <sandwwraith@users.noreply.github.com>
Co-authored-by: Leonid Startsev <sandwwraith@users.noreply.github.com>
`elementLength` returns length for specific headers, none of which would result in `readNumber` providing a negative number (length).
twyatt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I've addressed or responded to all the comments/feedback.
Additionally, added tests for non-happy-paths and executed/committed ./gradlew knit.
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Show resolved
Hide resolved
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
| val index = if (cbor.ignoreUnknownKeys) { | ||
| val knownIndex: Int | ||
| while (true) { | ||
| if (isDone()) return CompositeDecoder.DECODE_DONE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather stick with the current version — in the simple case (when ignoreUnknownKeys=false) it does not bother you with infinite loop. I think we can live with 3 lines of code duplicated.
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoding.kt
Outdated
Show resolved
Hide resolved
formats/cbor/commonTest/src/kotlinx/serialization/cbor/CborReaderTest.kt
Outdated
Show resolved
Hide resolved
|
Pulled in the latest I get the same failure when attempting the same command on |
|
@twyatt What version of JDK do you use? Apparently this method is available only since Java 11. |
sandwwraith
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the check for you with JDK 11 — it passes
|
Thank you for your contribution, hope it'll help a lot of users. It was a pleasure working with you! |
Add support for ignoring elements in encoded CBOR that are not present in the Kotlin class representation.
Closes #935
How it works
When an unknown element is encountered,
decoder.skipElement()is invoked until a known element is encountered.Skipping of an element begins at the first byte of the element's value.
Bytes are processed to determine element type (and corresponding length), to ultimately
determine how many bytes can be skipped. The general process is demonstrated by the following pseudo-code:
lengthStackrepresents stack of "number of elements" at each depth.Examples
The following examples show the hex notation of CBOR encoded payloads with the cooresponding
lengthStackafter processing the bytes in that row.Array of objects
The encoded CBOR
82A16161636B656BA16161636B656Bwould undergo the followingskipElement(decoding) process:lengthStack82[2]2(child count of82:A1,A1) on stackA1[2, 2]2(child count ofA1: "a", "kek") on stack61 61[2, 1]63 6B656B[1]A1, decrement stack itemsA1[1, 2]2(child count ofA1: "a", "kek") on stack61 61[1, 1]63 6B656B[]A1, decremented stack itemsArray with nested indefinite length array
The encoded CBOR
829FA16161636B656BFF6162would undergo the followingskipElement(decoding) process:lengthStack82[2]2(child count of82:array(*), "b") on stack9F[2, -1]A1[2, -1, 2]2(child count ofA1: "a", "kek") on stack61 61[2, -1, 1]63 6B656B[2, -1]A1, decrement stack items (from end to start; stopped at-1)FF[1]9F, decrement stack items61 62[]82, decremented stack items