Description
I've discovered a serious issue in the latest Socket.IO (0.7.7). When certain Unicode 6.0 characters are sent through the socket, Socket.IO's internal parser fails and the event never arrives at the server. I've tested it on the websocket transport in Chrome, but it seems to fail on other transports as well.
Here is a piece of example code that recreates it:
var socket = io.connect('http://localhost/endpoint');
socket.emit('any-event', {messages: [{"text":"@nov Nice presentation \ud83d\udc4d"}]}, function (err) {
  // this callback never fires, and the data above never arrives
});
The character in question here is the Unicode "thumbs up sign" (U+1F44D, http://www.fileformat.info/info/unicode/char/1f44d/index.htm). However, I've seen this fail on other Unicode characters too! This is a serious issue, guys.
More details on the situation:
On the client-side, the last 3 character codes of the JSON string are as follows when being sent on the websocket (https://github.com/LearnBoost/socket.io-client/blob/master/dist/socket.io.js#L2322 ):
32 // the space after the word "presentation"
55357
56397
However, on the server-side (https://github.com/LearnBoost/socket.io/blob/master/lib/transports/websocket.js#L132) this arrives as follows:
32 // the space after the word "presentation"
65533
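Please note that 65533 is the same as '\ufffd' (U+FFFD, the Unicode replacement character), which is also the special character that's used for parsing. In other words, the two-unit surrogate pair sent by the client (55357, 56397) has been collapsed into a single replacement character by the time it reaches the server.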
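For anyone who wants to check the numbers above, here is a minimal sketch of how the character codes can be inspected. The helper name lastCodes and the "received" string are my own assumptions for illustration: the received string is simulated from the values reported above, not captured from a real server.

```javascript
// Hypothetical helper (not part of Socket.IO): returns the last n
// UTF-16 character codes of a string.
function lastCodes(str, n) {
  var codes = [];
  for (var i = Math.max(0, str.length - n); i < str.length; i++) {
    codes.push(str.charCodeAt(i));
  }
  return codes;
}

// What the client sends (taken from the example above).
var sent = "@nov Nice presentation \ud83d\udc4d";
var sentCodes = lastCodes(sent, 3); // 32, 55357, 56397

// The two trailing codes are a surrogate pair; combine them to get
// the actual code point.
var high = sentCodes[1];
var low = sentCodes[2];
var codePoint = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
var codePointHex = codePoint.toString(16); // "1f44d"

// Simulated server-side string, built from the values reported above
// (assumption: the pair is replaced by a single U+FFFD).
var received = "@nov Nice presentation \ufffd";
var receivedCodes = lastCodes(received, 2); // 32, 65533

console.log(sentCodes, codePointHex, receivedCodes);
```

This only shows that 55357 and 56397 together encode U+1F44D and that the server-side value is one code unit shorter; it does not pinpoint where in the transport the replacement happens.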