Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: fix StringBytes::Write() #1042

Merged
merged 8 commits into from
Mar 5, 2015
Merged

src: fix StringBytes::Write() #1042

merged 8 commits into from
Mar 5, 2015

Conversation

bnoordhuis
Copy link
Member

This is the fix for #1024. I still need to turn the last four commits into a coherent narrative.

I'm going to go over src/string_bytes.cc one more time because I suspect there are still more bugs lurking.

R=@trevnorris

#include "string_bytes.h"

namespace node {

Utf8Value::Utf8Value(v8::Isolate* isolate, v8::Handle<v8::Value> value)
: length_(0), str_(nullptr) {
: length_(0), str_(str_st_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint didn't catch this before?

@trevnorris
Copy link
Contributor

Excellent job. LGTM.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Large external two-byte strings reported their character length instead
of their byte length, throwing off the garbage collector heuristic by
a factor of two.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Introduced in joyent/node v0.10 as a backwards compatibility measure.
It's an ugly hack and allowing invalid UTF-8 is not a good idea in the
first place, remove it.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
* Remove kStorageSize constant.

* Remove superfluous local variable and reinterpret_cast.

* Reorder data members so the length field and data pointer
  (if not the data itself) fit in a single cache line.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Rename `val_` to `string`.  The underscore suffix is normally reserved
for data members, not locals, and it's not a great name in the first
place.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Make the algorithm that creates the big input strings a little easier
to comprehend.  No functional changes, the string lengths are unchanged.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Make StringBytes::GetExternalParts() return the byte length for two-byte
strings, not the character length.  Its callers operate on bytes, not
characters.

This also fixes StringBytes::Size() reporting only half of the actual
number of bytes for external two-byte strings.

PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
StringBytes::Write() did a plain memcpy() when is_extern is true but
that's wrong when the source is a two-byte string and the destination
a one-byte or UTF-8 string.

The impact is limited to strings > 1,031,913 bytes because those are
normally the only strings that are externalized, although the use of
the 'externalize strings' extension (--expose_externalize_string) can
also trigger it.

This commit also cleans up the bytes versus characters confusion in
StringBytes::Write() because that was closely intertwined with the
UCS-2 encoding regression.  One wasn't fixable without the other.

Fixes: nodejs#1024
Fixes: nodejs/node-v0.x-archive#8683
PR-URL: nodejs#1042
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@bnoordhuis bnoordhuis closed this Mar 5, 2015
@bnoordhuis bnoordhuis deleted the issue1024 branch March 5, 2015 19:45
@bnoordhuis bnoordhuis merged commit 1640ded into nodejs:v1.x Mar 5, 2015
@rvagg rvagg mentioned this pull request Mar 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants