properly handle unicode characters #252

Closed
@rabbah

Description

The decoding problem is localized to reading records from the database (on the client side): the read path normalizes to the HTTP default charset, ISO-8859-1, instead of UTF-8.
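To illustrate the symptom, here is a minimal sketch, assuming Node.js, of what happens when UTF-8 bytes are decoded as ISO-8859-1 (which Node calls `latin1`). The variable names are illustrative only and do not come from the client code.

```javascript
// Sketch: the same UTF-8 bytes decoded with two different charsets.
const original = "Le périphérique est connecté";
const utf8Bytes = Buffer.from(original, "utf8");

// Correct: decode the UTF-8 bytes as UTF-8.
const decodedUtf8 = utf8Bytes.toString("utf8");

// Incorrect: decode the same bytes as ISO-8859-1 ("latin1" in Node).
// Each byte of a multi-byte UTF-8 sequence becomes its own character.
const decodedLatin1 = utf8Bytes.toString("latin1");

console.log(decodedUtf8);   // Le périphérique est connecté
console.log(decodedLatin1); // Le pÃ©riphÃ©rique est connectÃ©
```

The garbled second line is the kind of output to expect if a client reads a correctly stored record using the wrong charset.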

As a sanity check, I confirmed this with the following sample code:

function main() {
  var frenchMessage = "Le périphérique est connecté";
  var chineseMessage = "装置被连接";
  return ({"fr": frenchMessage, "zh-Hans": chineseMessage});
}

The record is stored correctly in the database. If I instead pass the string in base64-encoded form, decode it inside the action, and log the result, the activation result and logs are also stored correctly in the database:

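function main() {
  // Base64 encoding of the UTF-8 JSON for the same French and Chinese messages.
  var msg = 'eyJmciI6IkxlIHDDqXJpcGjDqXJpcXVlIGVzdCBjb25uZWN0w6kiLCJ6aC1IYW5zIjoi6KOF572u6KKr6L+e5o6lIn0=';
  // Decode the base64 payload back into a UTF-8 string.
  var dec = Buffer.from(msg, 'base64').toString('utf-8');
  console.log(dec);
  var res = JSON.parse(dec);
  return res;
}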

This results in the following activation record:

  "response": {
    "statusCode": 0,
    "result": {
      "fr": "Le périphérique est connecté",
      "zh-Hans": "装置被连接"
    }
  },
  "logs": [
    "2016-04-23T13:36:16.030199912Z stdout: {\"fr\":\"Le périphérique est connecté\",\"zh-Hans\":\"装置被连接\"}"
  ],
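For anyone reproducing this, a base64 payload like the one in the action above can be generated with a short sketch along these lines, assuming Node.js. The names `payload`, `encoded`, and `roundTrip` are illustrative.

```javascript
// Sketch: build the base64 test payload from the JSON object.
const payload = {
  "fr": "Le périphérique est connecté",
  "zh-Hans": "装置被连接"
};

// Serialize to JSON, encode the string as UTF-8 bytes, then base64-encode.
const encoded = Buffer.from(JSON.stringify(payload), "utf8").toString("base64");
console.log(encoded);

// Round trip: decoding the base64 as UTF-8 recovers the original object.
const roundTrip = JSON.parse(Buffer.from(encoded, "base64").toString("utf8"));
console.log(roundTrip);
```

Because base64 output is plain ASCII, the payload survives any charset confusion in transit, which is why it helps isolate where the decoding goes wrong.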

@psuter FYI, this is true with both the Cloudant SDK and the Spray client.
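One possible direction for the client side, sketched in JavaScript purely for illustration: honor an explicit `charset` in the `Content-Type` header, and otherwise treat JSON bodies as UTF-8 rather than falling back to ISO-8859-1. The helper `decodeBody` is hypothetical and is not part of the Cloudant SDK or Spray. Treating any charset other than ISO-8859-1 as UTF-8 is a simplifying assumption of this sketch.

```javascript
// Hypothetical helper: decode a response body using a charset chosen
// from the Content-Type header.
function decodeBody(bytes, contentType) {
  const header = contentType || "";

  // Use an explicit charset parameter if the header carries one.
  const match = /charset\s*=\s*"?([^";]+)"?/i.exec(header);
  let charset = match ? match[1].trim().toLowerCase() : null;

  if (!charset) {
    // Assumption: JSON without a charset is UTF-8; anything else keeps
    // the historical HTTP default of ISO-8859-1.
    charset = /^\s*application\/json/i.test(header) ? "utf-8" : "iso-8859-1";
  }

  // Simplification: only two encodings are distinguished here.
  const encoding = charset === "iso-8859-1" ? "latin1" : "utf8";
  return bytes.toString(encoding);
}

// Usage: a JSON body with no charset parameter decodes as UTF-8.
const body = Buffer.from('{"zh-Hans":"装置被连接"}', "utf8");
console.log(decodeBody(body, "application/json"));
```

The same idea would apply in whichever language the client is written in; the point is only that the charset decision is made explicitly instead of inherited from the HTTP default.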
