Skip to content

MONGOID-5158 - BSON::ObjectId#as_json and #to_json should serialize as a String without "$oid" #5054

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

johnnyshields
Copy link
Contributor

@johnnyshields johnnyshields commented Aug 21, 2021

The Change

Currently, BSON::ObjectId#as_json will return { "$oid" => "03d623ca58edf9046e207565" }

This PR does what 99% of users prefer: return a plain String "03d623ca58edf9046e207565"

I've included a "Breaking change" notice in the docs for this.

Why

While $oid may be the correct way to represent an ObjectID in strict "BSON", it is not a practical way to represent an ObjectID in "JSON" as used in real world apps. The standard use case for JSON (.as_json method) is to serialize objects for HTTP APIs. The primary goal of Mongoid is for us as developers not to have to worry about BSON internal representations; instead we think in terms of the Ruby app domain. (Mongoid / Mongo Ruby Driver handles the BSON nuts-and-bolts for us.)

Other Ideas

We could also add BSON::ObjectId#as_bson and Object#as_bson (which delegates to .as_json in all cases except ObjectId.) I don't think it's particularly useful, but I'd be happy to implement it if the Mongo team would like. (We'd probably want to call mongoize or evolve as well...)

More Background

The current behavior drives Rails users absolutely crazy as you can see in the following issues, and nearly everyone ends up monkey-patching this method after hours of debugging frustration. Here are some a few examples out of hundreds:

Incidentally, Mongoid 3.1 and earlier used to serialize this as a String; when it changed to $oid it broke my and many other users apps--this unexpected behavior causes lots of bugs. When I looked up the commit git blame I found this:

image

@johnnyshields johnnyshields changed the title BSON::ObjectId .as_json / .to_json should serialize as a String without "$oid" MONGOID-???? BSON::ObjectId .as_json / .to_json should serialize as a String without "$oid" Aug 21, 2021
@johnnyshields johnnyshields changed the title MONGOID-???? BSON::ObjectId .as_json / .to_json should serialize as a String without "$oid" MONGOID-???? - BSON::ObjectId .as_json / .to_json should serialize as a String without "$oid" Aug 21, 2021
@johnnyshields johnnyshields changed the title MONGOID-???? - BSON::ObjectId .as_json / .to_json should serialize as a String without "$oid" MONGOID-???? - BSON::ObjectId#as_json and #to_json should serialize as a String without "$oid" Aug 21, 2021
@p-mongo
Copy link
Contributor

p-mongo commented Aug 22, 2021

Hi Johnny,

This is a non-trivial change to make and it probably belongs in bson-ruby rather than Mongoid. I've done some investigation to establish the history behind the current behavior:

The as_json method was originally added in dffffa9, and at that point it returned the hex representation of the object id. This was in 2010.

In 2013, as_json was added to bson-ruby in mongodb/bson-ruby@1e96301, and this implementation returned the oid extended json syntax.

Subsequently in 3c3496a (2 months after the bson-ruby commit), Mongoid's as_json was changed to also return the oid and the message indicates this was done to not alter the driver's behavior (bson-ruby I suppose can be considered part of the driver).

Since this, and as of today, I believe Mongoid's definition of as_json on ObjectId is unnecessary since it's the same as the one bson-ruby provides and lacking its own definition, Mongoid would inherit the bson-ruby's one. I believe that Mongoid's definition of as_json for ObjectId should be removed completely, since it is no longer Mongoid's domain to provide this definition (it is bson-ruby's).

Similarly, changing this method would need to be done in bson-ruby, not in Mongoid. Otherwise applications that use the driver would end up having their behavior changed when Mongoid is loaded, which I would say is undesirable.

The behavior of as_json itself is, independently, a complicated question. Rails uses this method to provide, in my understanding, a one-way serialization of Ruby objects to JSON. bson-ruby (in 2013) reused this method to serialize various objects to MongoDB extended json, which in particular has an expectation of round-tripping. The original extended json serialization implementation wasn't complete though, it was mostly finished in https://jira.mongodb.org/browse/RUBY-2066 and as part of that ticket the method that is used for extended json serialization became as_extended_json rather than as_json to avoid the various problems of conflating one-way serialization to JSON that is conveniently readable by frontends/other programs and round-trippable serialization to extended JSON. This was done in mongodb/bson-ruby@9751b24.

So...

Given that bson-ruby now uses as_extended_json for serialization to extended json, I think the next step would be to investigate what issues exist with the existing as_json implementations when they are used to produce something that ends up in an application's HTTP responses. I do agree that the oid serialization of ObjectId is problematic in many environments, and I support resolving this issue in some manner. A related question is whether the hex string output as proposed in this PR would be correctly handled on the input side (by both applications and Mongoid, with things like nested attributes).

I support in principle making the change proposed in this PR in bson-ruby, but at this point I believe it requires more research to determine impact and any related changes that should also be made.

@johnnyshields
Copy link
Contributor Author

johnnyshields commented Aug 22, 2021

I believe it requires more research to determine impact and any related changes that should also be made.

Since #as_json is only used externally (in app code)--it is not called internally within BSON / Ruby Driver / Mongoid--I think there will be basically no impact of this change. As long as users are notified in the release notes, it should be fine.

In the release notes I've included a patch to revert to the $oid behavior for the < 0.1% of users who somehow depend on it. Nearly all apps are either overriding this in the serializer and/or monkey patching it (I monkey patch it in my app.)

@p-mongo
Copy link
Contributor

p-mongo commented Aug 23, 2021

I created https://jira.mongodb.org/browse/MONGOID-5162 for removing the method from Mongoid.

@p-mongo
Copy link
Contributor

p-mongo commented Aug 23, 2021

Since #as_json is only used externally (in app code)--it is not called internally within BSON / Ruby Driver / Mongoid--I think there will be basically no impact of this change.

Can you elaborate how you arrived at this conclusion? My expectation is this method is used in the vast majority of Ruby and Mongoid applications, and it is not at all clear to me that altering its behavior will produce "basically no impact".

@johnnyshields
Copy link
Contributor Author

johnnyshields commented Aug 23, 2021

Can you elaborate how you arrived at this conclusion

Yes, the BSON::ObjectId#as_json method is overridden in 99.99% of applications that use the as_json function for API serialization. I cannot imagine any API would want to return id as { "$oid": "..." } externally.

I will ask you the opposite: please name one use-case for the { "$oid": "..." } format? Even supposing you were string-building MongoDB queries (which is a huge security no-no) the string-only format would still work fine!

You can see clearly in Stack Overflow and in Mongoid repo you're getting 1000s of complaints for the existing behavior. If you change it (with a proper change notice) I cannot imagine you will receive a single complaint.

image

@johnnyshields
Copy link
Contributor Author

johnnyshields commented Aug 24, 2021

@p-mongo If there is serious concern that this may break applications, may I suggest the following:

  1. In Mongoid 7.4, add a new temporary global config :serialize_bson_object_id_as_string. Default it to false.
  2. If BSON::ObjectId#as_json is called in Mongoid 7.4 with the above config false, log a deprecation warning "This will be changed to output a String in Mongoid 8.0. Please set serialize_bson_object_id_as_string to true to get the new behavior.
  3. In Mongoid 8.0, remove the config and change to the string behavior permanently.
  4. We can discuss what to do in the BSON library. In order to achieve the above we may need Mongoid to override BSON until Mongoid 8.0.

Would this be acceptable to you? If so, I'm glad to raise the PRs for both the deprecation in 7.4 and the removal in 8.0.

@p-mongo
Copy link
Contributor

p-mongo commented Aug 24, 2021

As I mentioned earlier, I believe that Mongoid shouldn't have a ObjectId#as_json implementation at all. To that end I would appreciate a PR for https://jira.mongodb.org/browse/MONGOID-5162 removing it.

@johnnyshields
Copy link
Contributor Author

OK sure... but then what? What's the plan to switch #as_json so that from the end-user perspective it returns a string? Shall I also raise a PR to do that in the BSON gem?

@p-mongo
Copy link
Contributor

p-mongo commented Aug 25, 2021

As the next step I suggest defining what #as_json is supposed to do, for all classes in bson-ruby that this method is presently defined, which are:

lib/bson/binary.rb:    def as_json(*args)
lib/bson/code.rb:    def as_json(*args)
lib/bson/code_with_scope.rb:    def as_json(*args)
lib/bson/db_pointer.rb:    def as_json(*args)
lib/bson/decimal128.rb:    def as_json(*args)
lib/bson/max_key.rb:    def as_json(*args)
lib/bson/min_key.rb:    def as_json(*args)
lib/bson/object_id.rb:    def as_json(*args)
lib/bson/regexp.rb:    def as_json(*args)
lib/bson/regexp.rb:      def as_json(*args)
lib/bson/timestamp.rb:    def as_json(*args)

Notably absent from that list are Time, Date and DateTime (which define #as_extended_json but not #as_json).

The Decimal128 implementation is perhaps the one which could be improved, presently it produces extended json which people may not want. (https://github.com/mongodb/bson-ruby/blob/master/lib/bson/decimal128.rb#L67)

A related question is whether all of the other as_json methods should be retained with their current implementations, IF the purpose of as_json is not to in fact produce extended json but do something else.

Once this is decided the next step would be to implement the change and come up with the transition plan. Perhaps this could be the change that gets bson-ruby 5.0 out into the world.

@johnnyshields
Copy link
Contributor Author

Replaced by #5057.

I'll also be raising a bson-ruby PR soon. Thanks

@johnnyshields
Copy link
Contributor Author

@p-mongo BSON library discussion here: mongodb/bson-ruby#238

@p-mongo p-mongo changed the title MONGOID-???? - BSON::ObjectId#as_json and #to_json should serialize as a String without "$oid" MONGOID-5158 - BSON::ObjectId#as_json and #to_json should serialize as a String without "$oid" Aug 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants