Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update String#trim #316

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions es5-shim.js
Original file line number Diff line number Diff line change
Expand Up @@ -1578,14 +1578,14 @@ defineProperties(StringPrototype, {

// ES5 15.5.4.20
// whitespace from: http://es5.github.io/#x15.5.4.20
var ws = '\x09\x0A\x0B\x0C\x0D\x20\xA0\u1680\u180E\u2000\u2001\u2002\u2003' +
'\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\u2028' +
'\u2029\uFEFF';
var zeroWidth = '\u200b';
var wsRegexChars = '[' + ws + ']';
var trimBeginRegexp = new RegExp('^' + wsRegexChars + wsRegexChars + '*');
var trimEndRegexp = new RegExp(wsRegexChars + wsRegexChars + '*$');
var hasTrimWhitespaceBug = StringPrototype.trim && (ws.trim() || !zeroWidth.trim());
var ws = '\t\n\v\f\r \xA0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005' +
'\u2006\u2007\u2008\u2009\u200A\u200B\u2028\u2029\u202F\u205F\u3000\uFEFF';
var nxtLine = '\x85';
var wsRegexChars = '[' + ws + '][' + ws + ']*';
var trimBeginRegexp = new RegExp('^' + wsRegexChars);
var trimEndRegexp = new RegExp(wsRegexChars + '$');
var hasTrimWhitespaceBug = typeof StringPrototype.trim === 'function' &&
(ws.trim() !== '' || nxtLine.trim() === '');
defineProperties(StringPrototype, {
// http://blog.stevenlevithan.com/archives/faster-trim-javascript
// http://perfectionkills.com/whitespace-deviations/
Expand Down
10 changes: 6 additions & 4 deletions tests/spec/s-string.js
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,18 @@ describe('String', function () {
'use strict';

describe('trim', function () {
var test = '\x09\x0A\x0B\x0C\x0D\x20\xA0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\u2028\u2029\uFEFFHello, World!\x09\x0A\x0B\x0C\x0D\x20\xA0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\u2028\u2029\uFEFF';
var wsp = '\t\n\v\f\r \xA0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005' +
'\u2006\u2007\u2008\u2009\u200A\u200B\u2028\u2029\u202F\u205F\u3000\uFEFF';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's leave this on one line - arbitrary line length limits are silly. it will show the diff better also.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this sentiment; the reason I had broken it up was that I thought the project had line-length limits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it used to, it doesn't anymore :-D

var test = [wsp, 'Hello, World!', wsp].join('');

it('trims all ES5 whitespace', function () {
expect(test.trim()).toEqual('Hello, World!');
expect(test.trim().length).toBe(13);
});

it('does not trim the zero-width space', function () {
expect('\u200b'.trim()).toBe('\u200b');
expect('\u200b'.trim().length).toBe(1);
it('does not trim the next-line character', function () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link me to which part of the spec talks about this character?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This character is in the Unicode category Cc (Other, Control). See http://www.fileformat.info/info/unicode/char/0085/index.htm. The spec doesn't say anything about this category or this character.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://es5.github.io/#x7.2

ECMAScript implementations must recognize all of the white space characters defined in Unicode 3.0. Later editions of the Unicode Standard may define other white space characters. ECMAScript implementations may recognize white space characters from later editions of the Unicode Standard.

I once read through several editions of the Unicode Standard and found that U+0085 (NEL) has been in Cc ever since 1.1.5 at least, and in particular in 3.0 and all later versions, so although its purpose was a single-character replacement for \r\n, it's non-trimmable; also, as for actively trimming U+200B (ZWSP), that was in Zs before Unicode 4.0.1 (upon and after which it was in Cf), so ES5 must recognize it as trimmable (and ES6 must not): paulmillr/es6-shim#354 (comment)

(When I made that comment, I hadn't realized that even if an ES5 or ES6 implementation targets a later version of Unicode than the earliest permissible one, it must still recognize as whitespace all Zs characters from the earliest permissible version of Unicode, respectively 3.0 and 5.1.0.)

expect('\x85'.trim()).toBe('\x85');
expect('\x85'.trim().length).toBe(1);
});
});

Expand Down