Skip to content

Conversation

kjg
Copy link
Contributor

@kjg kjg commented May 16, 2016

instead of an empty string so that it is clear that there was some sort of
encoding issue as opposed silently dropping data

(as discussed in #978)

@jeremy
Copy link
Collaborator

jeremy commented May 16, 2016

Nice @kjg. Thanks for revisiting!

@kjg kjg force-pushed the unicode_replacement_char branch from 7284935 to 0a7f1ae Compare May 17, 2016 14:19
@kjg
Copy link
Contributor Author

kjg commented Aug 15, 2016

@jeremy I'd love to get #1000 merged in so I can remove the workarounds from my code base. This change is somewhat of a pre-req for that one. Can you please let me know what needs to be done to move this change forward? Thanks!

@kjg
Copy link
Contributor Author

kjg commented Sep 16, 2016

@jeremy Can we include this in v2.7?

@jeremy jeremy added this to the 2.7.0 milestone Sep 16, 2016
@kjg kjg force-pushed the unicode_replacement_char branch from 0a7f1ae to 4ac2141 Compare September 21, 2016 16:32
instead of an empty string so that it is clear that there was some sort of
encoding issue as opposed silently dropping data

This is only easily accomplished on ruby 1.9 and above, so ruby 1.8.x will
keep the empty string behavior.
@kjg kjg force-pushed the unicode_replacement_char branch from 4ac2141 to d958497 Compare January 23, 2017 20:03
@kjg
Copy link
Contributor Author

kjg commented Jan 23, 2017

@jeremy is there anything I can do to help get this merged in? Does this just need to wait for a 2.7 branch to be cut off of master? If you're 👍 on this, please let me know what else I can do to help! Thanks!

@jeremy
Copy link
Collaborator

jeremy commented Jan 27, 2017

Merged @ 8e15ff5 ❤️

decoded.valid_encoding? ? decoded : decoded.encode(Encoding::UTF_16LE, :invalid => :replace, :replace => "").encode(Encoding::UTF_8)
rescue Encoding::UndefinedConversionError, ArgumentError, Encoding::ConverterNotFoundError
warn "Encoding conversion failed #{$!}"
str.dup.force_encoding(Encoding::UTF_8)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this rescue clause also caught exceptions in e.g. Ruby19.decode_base64 and charset_encoder.encode up above.

@jeremy jeremy closed this Jan 27, 2017
@kjg kjg deleted the unicode_replacement_char branch January 27, 2017 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants