-
Notifications
You must be signed in to change notification settings - Fork 149
RUST-1998 Remove lossy utf8 as a decoder option #550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3df66fd to
328ac6d
Compare
| } | ||
| } | ||
|
|
||
| #[deprecated(since = "2.3.0", note = "Use try_to_rfc3339_string instead.")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is completely unrelated but I happened to notice it and figured we should remove deprecated methods while we're doing a major version release. I did a grep and didn't find any others.
src/raw/document.rs
Outdated
| } | ||
|
|
||
| /// Copy this into a [`Document`], returning an error if invalid BSON is encountered. | ||
| pub fn to_document(&self) -> Result<Document> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this from RawDocumentBuf since that was calling the TryInto impl on RawDocument anyway and this way both types can use it (RawDocumentBuf derefs to this one).
src/raw/error.rs
Outdated
| Self::MalformedValue { message: r_message }, | ||
| ) => l_message == r_message, | ||
| (Self::Utf8EncodingError, Self::Utf8EncodingError) => true, | ||
| (Self::DeError(_), Self::DeError(_)) => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crate::de:Error doesn't impl PartialEq because it wraps std::io::Error, which also doesn't, hence the manual impl. I figure this whole situation can be revisted for RUST-1406.
This is all needed so I can use crate::de::reader_to_vec in RawDocumentBuf::from_reader; the former uses the crate::de error hierarchy, the latter the crate::raw one.
| } | ||
|
|
||
| pub(crate) fn value_utf8_lossy(&self) -> Result<Option<Utf8LossyBson<'a>>> { | ||
| pub fn value_utf8_lossy(&self) -> Result<RawBson> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted to expose this method rather than the lower-level value_utf8_lossy_inner because in my estimation the confluence between needing to deal with invalid utf8 and needing to save the memory cost of copying a single element is very small, and outside of that this is a strictly more useful interface.
isabelatkinson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, it'll be equivalent performance-wise, since using the wrapper type just enables behavior the same way |
96b7844 to
6930185
Compare
dc3585d to
8679191
Compare
RUST-1998
This removes lossy utf8 decoding as a public option for the bson serde decoder, and as a consequence many of the API entry points. Dealing with lossy utf8 is now done in one of three ways depending on the specific need:
Utf8LossyDeserializationhelper type, if the user is going viaserdeand knows that particular fields might have bad data;RawElement::value_utf8_lossymethod, which allows fine-grained iteration over potentially-invalid raw data, orRawDocument::to_document_utf8_lossymethod, which is a convenience wrapper over the above to bulk-convert an entireRawDocumentat once.