Skip to content

GODRIVER-2252 Don't call UnmarshalBSON for pointer values if the BSON field value is empty. #833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions bson/bsoncodec/default_value_decoders.go
Original file line number Diff line number Diff line change
Expand Up @@ -1504,6 +1504,18 @@ func (dvd DefaultValueDecoders) UnmarshalerDecodeValue(dc DecodeContext, vr bson
return err
}

// If the target Go value is a pointer and the BSON field value is empty, set the value to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No suggested change, but noting for consideration.

The original example marshaled the value to BSON Null. This seems like the desired behavior when the Go value is a pointer type:

  • nil marhals to BSON Null.
  • BSON Null unmarshals to nil.

From https://bsonspec.org/spec.html there three values that would unmarshal with an empty value:

  • BSON Null
  • Min Key
  • Max Key

I do not know if there is any use case for unmarshaling a Min Key or Max Key.

If Min Key or Max Key were unmarshaled to a Go value with a pointer type, setting the target to nil seems reasonable.

Copy link
Collaborator Author

@matthewdale matthewdale Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great find! I did some testing and found that if you define a custom type with base type primitive.MinKey or primitive.MaxKey, you lose the special BSON marshaler behavior for the primitive.MinKey and primitive.MaxKey types. That means it's not possible to define a custom UnmarshalBSON function for the primitive.MinKey or primitive.MaxKey types while maintaining the special marshaling behavior (you can't define additional functions for the primitive.MinKey or primitive.MaxKey types outside of the primitive package).

For example, if you encode a struct containing a *primitive.MinKey, you get a BSON document with field type "\xff" (the "min key" field type):

type myStruct struct {
	Min *primitive.MinKey
}
func main() {
	b, _ := bson.Marshal(myStruct{Min: &primitive.MinKey{}})
	fmt.Printf("%+q\n", b)
}

// Prints "\n\x00\x00\x00\xffmin\x00\x00"

Notice the "\xff" before "min", which is the encoding for the "min key" type.

If you define a new type like myMinKey based on primitive.MinKey and encode a similar struct, the field is encoded as a "document" type instead of a "min key" type.

type myMinKey primitive.MinKey

type myStruct struct {
	Min *myMinKey
}

func main() {	
	b, _ := bson.Marshal(myStruct{Min: &myMinKey{}})
	fmt.Printf("%+q\n", b)
}

// Prints "\x0f\x00\x00\x00\x03min\x00\x05\x00\x00\x00\x00\x00"

Notice the "\x03" before "min", which is the BSON encoding for a "document" type. Also, the "\x05\x00\x00\x00" after "min" is the encoding for an empty document value, not the zero-length value of a "min key" type.

tldr: While the "min key" and "max key" BSON types use a zero-length value like the "null" type, a user cannot define a custom UnmarshalBSON function for the primitive.MinKey/primitive.MaxKey types while maintaining the special marshaling/unmarshaling behavior, so the change here shouldn't impact those types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, interesting. TIL that a custom type definition of primitive.MinKey or primitive.MaxKey changes the marshal behavior. Thanks for the explanation!

// zero value of the pointer (nil) and don't call UnmarshalBSON. UnmarshalBSON has no way to
// change the pointer value from within the function (only the value at the pointer address),
// so it can't set the pointer to "nil" itself. Since the most common Go value for an empty BSON
// field value is "nil", we set "nil" here and don't call UnmarshalBSON. This behavior matches
// the behavior of the Go "encoding/json" unmarshaler when the target Go value is a pointer and
// the JSON field value is "null".
if val.Kind() == reflect.Ptr && len(src) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check on L1505 checks whether the target value can be changed. Should this be moved below that block?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good idea. Will update.

val.Set(reflect.Zero(val.Type()))
return nil
}

fn := val.Convert(tUnmarshaler).MethodByName("UnmarshalBSON")
errVal := fn.Call([]reflect.Value{reflect.ValueOf(src)})[0]
if !errVal.IsNil() {
Expand Down
31 changes: 8 additions & 23 deletions bson/decoder_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ import (
"testing"

"github.com/google/go-cmp/cmp"
"github.com/stretchr/testify/assert"
"go.mongodb.org/mongo-driver/bson/bsoncodec"
"go.mongodb.org/mongo-driver/bson/bsonrw"
"go.mongodb.org/mongo-driver/bson/bsonrw/bsonrwtest"
Expand All @@ -21,7 +22,7 @@ import (
)

func TestBasicDecode(t *testing.T) {
for _, tc := range unmarshalingTestCases {
for _, tc := range unmarshalingTestCases() {
t.Run(tc.name, func(t *testing.T) {
got := reflect.New(tc.sType).Elem()
vr := bsonrw.NewBSONDocumentReader(tc.data)
Expand All @@ -30,34 +31,22 @@ func TestBasicDecode(t *testing.T) {
noerr(t, err)
err = decoder.DecodeValue(bsoncodec.DecodeContext{Registry: reg}, vr, got)
noerr(t, err)

if !reflect.DeepEqual(got.Addr().Interface(), tc.want) {
t.Errorf("Results do not match. got %+v; want %+v", got, tc.want)
}
assert.Equal(t, tc.want, got.Addr().Interface(), "Results do not match.")
})
}
}

func TestDecoderv2(t *testing.T) {
t.Run("Decode", func(t *testing.T) {
for _, tc := range unmarshalingTestCases {
for _, tc := range unmarshalingTestCases() {
t.Run(tc.name, func(t *testing.T) {
got := reflect.New(tc.sType).Interface()
vr := bsonrw.NewBSONDocumentReader(tc.data)
var reg *bsoncodec.Registry
if tc.reg != nil {
reg = tc.reg
} else {
reg = DefaultRegistry
}
dec, err := NewDecoderWithContext(bsoncodec.DecodeContext{Registry: reg}, vr)
dec, err := NewDecoderWithContext(bsoncodec.DecodeContext{Registry: DefaultRegistry}, vr)
noerr(t, err)
err = dec.Decode(got)
noerr(t, err)

if !reflect.DeepEqual(got, tc.want) {
t.Errorf("Results do not match. got %+v; want %+v", got, tc.want)
}
assert.Equal(t, tc.want, got, "Results do not match.")
})
}
t.Run("lookup error", func(t *testing.T) {
Expand All @@ -70,9 +59,7 @@ func TestDecoderv2(t *testing.T) {
noerr(t, err)
want := bsoncodec.ErrNoDecoder{Type: reflect.TypeOf(cdeih)}
got := dec.Decode(&cdeih)
if !cmp.Equal(got, want, cmp.Comparer(compareErrors)) {
t.Errorf("Received unexpected error. got %v; want %v", got, want)
}
assert.Equal(t, want, got, "Received unexpected error.")
})
t.Run("Unmarshaler", func(t *testing.T) {
testCases := []struct {
Expand Down Expand Up @@ -191,9 +178,7 @@ func TestDecoderv2(t *testing.T) {
err = dec.Decode(&got)
noerr(t, err)
want := foo{Item: "canvas", Qty: 4, Bonus: 2}
if !reflect.DeepEqual(got, want) {
t.Errorf("Results do not match. got %+v; want %+v", got, want)
}
assert.Equal(t, want, got, "Results do not match.")
})
t.Run("Reset", func(t *testing.T) {
vr1, vr2 := bsonrw.NewBSONDocumentReader([]byte{}), bsonrw.NewBSONDocumentReader([]byte{})
Expand Down
48 changes: 11 additions & 37 deletions bson/unmarshal_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,64 +10,42 @@ import (
"reflect"
"testing"

"github.com/google/go-cmp/cmp"
"github.com/stretchr/testify/assert"
"go.mongodb.org/mongo-driver/bson/bsoncodec"
"go.mongodb.org/mongo-driver/bson/bsonrw"
"go.mongodb.org/mongo-driver/internal/testutil/assert"
"go.mongodb.org/mongo-driver/x/bsonx/bsoncore"
)

func TestUnmarshal(t *testing.T) {
for _, tc := range unmarshalingTestCases {
for _, tc := range unmarshalingTestCases() {
t.Run(tc.name, func(t *testing.T) {
if tc.reg != nil {
t.Skip() // test requires custom registry
}
got := reflect.New(tc.sType).Interface()
err := Unmarshal(tc.data, got)
noerr(t, err)
if !cmp.Equal(got, tc.want) {
t.Errorf("Did not unmarshal as expected. got %v; want %v", got, tc.want)
}
assert.Equal(t, tc.want, got, "Did not unmarshal as expected.")
})
}
}

func TestUnmarshalWithRegistry(t *testing.T) {
for _, tc := range unmarshalingTestCases {
for _, tc := range unmarshalingTestCases() {
t.Run(tc.name, func(t *testing.T) {
var reg *bsoncodec.Registry
if tc.reg != nil {
reg = tc.reg
} else {
reg = DefaultRegistry
}
got := reflect.New(tc.sType).Interface()
err := UnmarshalWithRegistry(reg, tc.data, got)
err := UnmarshalWithRegistry(DefaultRegistry, tc.data, got)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this uses the DefaultRegistry, can TestUnmarshalWithRegistry be removed in favor of TestUnmarshal?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd actually prefer to leave that there for coverage. Currently it's redundant, but there's nothing that says Unmarshal needs to continue calling UnmarshalWithRegistry with DefaultRegistry. If that ever changed, we'd potentially lose coverage of UnmarshalWithRegistry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Leaving TestUnmarshalWithRegistry seems preferable.

noerr(t, err)
if !cmp.Equal(got, tc.want) {
t.Errorf("Did not unmarshal as expected. got %v; want %v", got, tc.want)
}
assert.Equal(t, tc.want, got, "Did not unmarshal as expected.")
})
}
}

func TestUnmarshalWithContext(t *testing.T) {
for _, tc := range unmarshalingTestCases {
for _, tc := range unmarshalingTestCases() {
t.Run(tc.name, func(t *testing.T) {
var reg *bsoncodec.Registry
if tc.reg != nil {
reg = tc.reg
} else {
reg = DefaultRegistry
}
dc := bsoncodec.DecodeContext{Registry: reg}
dc := bsoncodec.DecodeContext{Registry: DefaultRegistry}
got := reflect.New(tc.sType).Interface()
err := UnmarshalWithContext(dc, tc.data, got)
noerr(t, err)
if !cmp.Equal(got, tc.want) {
t.Errorf("Did not unmarshal as expected. got %v; want %v", got, tc.want)
}
assert.Equal(t, tc.want, got, "Did not unmarshal as expected.")
})
}
}
Expand All @@ -80,9 +58,7 @@ func TestUnmarshalExtJSONWithRegistry(t *testing.T) {
err := UnmarshalExtJSONWithRegistry(DefaultRegistry, data, true, &got)
noerr(t, err)
want := teststruct{1}
if !cmp.Equal(got, want) {
t.Errorf("Did not unmarshal as expected. got %v; want %v", got, want)
}
assert.Equal(t, want, got, "Did not unmarshal as expected.")
})

t.Run("UnmarshalExtJSONInvalidInput", func(t *testing.T) {
Expand Down Expand Up @@ -165,9 +141,7 @@ func TestUnmarshalExtJSONWithContext(t *testing.T) {
dc := bsoncodec.DecodeContext{Registry: DefaultRegistry}
err := UnmarshalExtJSONWithContext(dc, tc.data, true, got)
noerr(t, err)
if !cmp.Equal(got, tc.want) {
t.Errorf("Did not unmarshal as expected. got %+v; want %+v", got, tc.want)
}
assert.Equal(t, tc.want, got, "Did not unmarshal as expected.")
})
}
}
Expand Down
Loading