-
Notifications
You must be signed in to change notification settings - Fork 909
GODRIVER-2252 Don't call UnmarshalBSON for pointer values if the BSON field value is empty. #833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GODRIVER-2252 Don't call UnmarshalBSON for pointer values if the BSON field value is empty. #833
Conversation
… field value is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting bug... Copying the behavior of encoding/json
seems good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done. The test cases with nil, zero, and non-zero pointers look like a good addition.
// field value is "nil", we set "nil" here and don't call UnmarshalBSON. This behavior matches | ||
// the behavior of the Go "encoding/json" unmarshaler when the target Go value is a pointer and | ||
// the JSON field value is "null". | ||
if val.Kind() == reflect.Ptr && len(src) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check on L1505 checks whether the target value can be changed. Should this be moved below that block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good idea. Will update.
return err | ||
} | ||
|
||
// If the target Go value is a pointer and the BSON field value is empty, set the value to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No suggested change, but noting for consideration.
The original example marshaled the value to BSON Null. This seems like the desired behavior when the Go value is a pointer type:
- nil marhals to BSON Null.
- BSON Null unmarshals to nil.
From https://bsonspec.org/spec.html there three values that would unmarshal with an empty value:
- BSON Null
- Min Key
- Max Key
I do not know if there is any use case for unmarshaling a Min Key or Max Key.
If Min Key or Max Key were unmarshaled to a Go value with a pointer type, setting the target to nil seems reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great find! I did some testing and found that if you define a custom type with base type primitive.MinKey
or primitive.MaxKey
, you lose the special BSON marshaler behavior for the primitive.MinKey
and primitive.MaxKey
types. That means it's not possible to define a custom UnmarshalBSON
function for the primitive.MinKey
or primitive.MaxKey
types while maintaining the special marshaling behavior (you can't define additional functions for the primitive.MinKey
or primitive.MaxKey
types outside of the primitive
package).
For example, if you encode a struct containing a *primitive.MinKey
, you get a BSON document with field type "\xff" (the "min key" field type):
type myStruct struct {
Min *primitive.MinKey
}
func main() {
b, _ := bson.Marshal(myStruct{Min: &primitive.MinKey{}})
fmt.Printf("%+q\n", b)
}
// Prints "\n\x00\x00\x00\xffmin\x00\x00"
Notice the "\xff" before "min", which is the encoding for the "min key" type.
If you define a new type like myMinKey
based on primitive.MinKey
and encode a similar struct, the field is encoded as a "document" type instead of a "min key" type.
type myMinKey primitive.MinKey
type myStruct struct {
Min *myMinKey
}
func main() {
b, _ := bson.Marshal(myStruct{Min: &myMinKey{}})
fmt.Printf("%+q\n", b)
}
// Prints "\x0f\x00\x00\x00\x03min\x00\x05\x00\x00\x00\x00\x00"
Notice the "\x03" before "min", which is the BSON encoding for a "document" type. Also, the "\x05\x00\x00\x00" after "min" is the encoding for an empty document value, not the zero-length value of a "min key" type.
tldr: While the "min key" and "max key" BSON types use a zero-length value like the "null" type, a user cannot define a custom UnmarshalBSON
function for the primitive.MinKey
/primitive.MaxKey
types while maintaining the special marshaling/unmarshaling behavior, so the change here shouldn't impact those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, interesting. TIL that a custom type definition of primitive.MinKey
or primitive.MaxKey
changes the marshal behavior. Thanks for the explanation!
got := reflect.New(tc.sType).Interface() | ||
err := UnmarshalWithRegistry(reg, tc.data, got) | ||
err := UnmarshalWithRegistry(DefaultRegistry, tc.data, got) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this uses the DefaultRegistry
, can TestUnmarshalWithRegistry
be removed in favor of TestUnmarshal
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd actually prefer to leave that there for coverage. Currently it's redundant, but there's nothing that says Unmarshal
needs to continue calling UnmarshalWithRegistry
with DefaultRegistry
. If that ever changed, we'd potentially lose coverage of UnmarshalWithRegistry
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Leaving TestUnmarshalWithRegistry
seems preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM mod Kevin's comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
… field value is empty. (#833)
GODRIVER-2252
When unmarshaling BSON, If the target Go value is a pointer, has a custom
UnmarshalBSON
function, and the BSON field value is empty, set the Go value to the zero value of the pointer (nil
) and don't callUnmarshalBSON
.UnmarshalBSON
has no way to change the pointer value from within the function (only the value at the pointer address), so it can't set the pointer tonil
itself. Since the most common Go value for an empty BSON field value isnil
, we set the Go value tonil
and don't callUnmarshalBSON
. This behavior matches the behavior of the Go"encoding/json"
unmarshaler when the target Go value is a pointer, has a customUnmarshalJSON
function, and the JSON field value isnull
.See this StackOverflow post for a discussion about the same behavior of the Go
"encoding/json"
unmarshaler.