-
-
Notifications
You must be signed in to change notification settings - Fork 102
Fix invalid byte sequence on EncodedString#split #172
Fix invalid byte sequence on EncodedString#split #172
Conversation
a07480f
to
d9645c8
Compare
So this makes me a little uncomfortable given you are assuming that the bad bytes can be scrubbed by converting to UTF-8 and back, if this is an incompatible encoding with UTF-8 then won't this destroy the original encoding? Also can you remove the whitespace changes from the PR |
So, the whitespace changes look sorta terrible in the unified diff, but if you split it or look at it alone, you'll see I'm actually fixing inconsistent whitespace (and some spelling, but that's unrelated to your comment). I'm going to push a better fix on top of this momentarily. |
Weird, that passed in https://travis-ci.org/rspec/rspec-support/builds/49911552 |
If this looks good I"ll rebase down to one commit. |
52b1f5d
to
8260a59
Compare
Ok, I think this is it. |
I'm not sure what to do about the JRuby failure, if it's expected behavior, or a bug in JRuby. It is introduced in this PR. |
I mean, it's not totally a failure, just a different behavior. The |
LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
rvm ruby-1.9.2-p330,ruby-1.9.3-p551,ruby-2.0.0-p598,ruby-2.1.5,ruby-2.2.0,jruby-1.7.18,rbx-2.2.2 do \
ruby -e 'p [RUBY_VERSION, RUBY_ENGINE, Encoding.default_external, __ENCODING__] << "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8")'
for version in 1.9 2.0; do \
export JRUBY_OPTS="-Xcompat.version=${version}" ; \
LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
rvm jruby-1.7.18 do \
ruby -e 'p [RUBY_VERSION, RUBY_ENGINE, Encoding.default_external, __ENCODING__] << "\x80".force_encoding("US-ASCII").chars.map{|char| char.valid_encoding? ? char : "?" }.join.encode("UTF-8")';
done
|
8260a59
to
125634b
Compare
@JonRowe @myronmarston Are y'all ok with this? |
So the test does nothing on JRuby? I think it should be pending if thats the case, that way we can have one spec, use the |
# https://twitter.com/nalsh/status/553413844685438976 | ||
# | ||
# For example, given: | ||
# "\x80".force_encoding("Emacs-Mule").encode(:invalid => :replace).bytes.to_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the rest of this file this line needs one more " " added at the start (so indented by two spaces not 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
Sorry nitpicking over the whitespace again because it should either be all fixed or none :P |
@JonRowe Not at all. re: the JRuby spec, you think I should remove the 'JRUBY' commit and add |
Yes. |
Map string char with invalid encoding to '?' Format identical string expectation to read easier Refs: - rspec/rspec-core#1760 - via rspec#134 Set to pending for JRuby, opened issue jruby/jruby#2580 that JRuby is the only Ruby that returns "\x80" in place of "?"
125634b
to
4bd437a
Compare
Ok, I think I fixed the whitespace and it should be good. At some point though, I start expecting to find I missed something and I'm past that point :) |
Fix invalid byte sequence on EncodedString#split
Fix invalid byte sequence on EncodedString#split
Picked to |
great! back to rspec-core now |
…oded_string_split_argument_error Fix invalid byte sequence on EncodedString#split --- This commit was imported from rspec/rspec-support@846bdbe.
[skip ci] --- This commit was imported from rspec/rspec-support@b989e3a.
From: