iso8859-1 vs windows-1252 #25851

Closed
@hashseed

This is somewhat related to #13722, but not quite.

Wikipedia contains the gist:

It is very common to mislabel Windows-1252 text with the charset label ISO-8859-1. [...] Most modern web browsers and e-mail clients treat the media type charset ISO-8859-1 as Windows-1252 to accommodate such mislabeling. This is now standard behavior in the HTML5 specification, which requires that documents advertised as ISO-8859-1 actually be parsed with the Windows-1252 encoding.

Chromium's ICU interprets "iso8859-1" to mean Windows-1252. Node.js does not. The WHATWG Encoding Standard indicates that Chromium's behavior is correct.

The consequence of all of this is that when I build Node.js with Chromium's ICU, test/parallel/test-icu-transcode.js fails due to the character "€", which Windows-1252 includes, but ISO-8859-1 does not.
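
To illustrate why "€" is the character that trips the test, here is a minimal sketch of how the two interpretations decode byte 0x80. The helper names `decodeIso88591Byte` and `decodeWindows1252Byte` and the partial lookup table are hypothetical, written for this example only. They are not part of Node.js or ICU.

```javascript
'use strict';

// Partial, illustrative table. Windows-1252 assigns printable characters to
// bytes in 0x80-0x9F, where ISO-8859-1 has C1 control codes. Only a few
// entries are listed here; a real table covers the whole range.
const WINDOWS_1252_HIGH = new Map([
  [0x80, '\u20AC'], // EURO SIGN
  [0x85, '\u2026'], // HORIZONTAL ELLIPSIS
  [0x93, '\u201C'], // LEFT DOUBLE QUOTATION MARK
  [0x94, '\u201D'], // RIGHT DOUBLE QUOTATION MARK
  [0x99, '\u2122'], // TRADE MARK SIGN
]);

// Strict ISO-8859-1: every byte maps to the code point with the same value.
function decodeIso88591Byte(byte) {
  return String.fromCharCode(byte);
}

// Windows-1252 (sketch): use the table where it has an entry, otherwise fall
// back to the ISO-8859-1 mapping. The fallback is only correct outside
// 0x80-0x9F because the table above is incomplete.
function decodeWindows1252Byte(byte) {
  return WINDOWS_1252_HIGH.get(byte) ?? String.fromCharCode(byte);
}

const iso = decodeIso88591Byte(0x80);
const win = decodeWindows1252Byte(0x80);
console.log(iso.codePointAt(0).toString(16)); // 80 (a C1 control code)
console.log(win.codePointAt(0).toString(16)); // 20ac (the Euro sign)

// Node.js's own 'latin1' Buffer encoding uses the strict byte-to-code-point
// mapping, so byte 0x80 does not come back as the Euro sign.
console.log(Buffer.from([0x80]).toString('latin1') === '\u0080'); // true
```

The two encodings agree everywhere except 0x80-0x9F, so a test string only exposes the difference if it contains a character from that range, such as "€".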

I propose:

  1. Change the test to pass for both interpretations.
  2. Conform to Chromium's behavior.
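
A rough sketch of what option 1 could look like: a check that accepts either interpretation's output for "€". The helper `isAcceptedEuroByte` is hypothetical, and the byte `0x3f` ("?") assumes that the strict ISO-8859-1 path substitutes "?" for unrepresentable characters, as the Node.js documentation describes for `buffer.transcode()` with an ASCII target. The actual assertions in test/parallel/test-icu-transcode.js may be structured differently.

```javascript
'use strict';

// Bytes considered acceptable for "€" transcoded to the "latin1" label:
//   0x80 when the label is treated as Windows-1252,
//   0x3f ('?') when treated as strict ISO-8859-1 (assumed substitution byte).
const ACCEPTED_EURO_BYTES = [0x80, 0x3f];

// Hypothetical helper: true if `buf` is a single accepted byte.
function isAcceptedEuroByte(buf) {
  return buf.length === 1 && ACCEPTED_EURO_BYTES.includes(buf[0]);
}

// In a real test, the buffer would come from something like
//   buffer.transcode(Buffer.from('€', 'utf8'), 'utf8', 'latin1')
// Hardcoded candidates are used here so the sketch does not depend on
// which ICU the running Node.js binary was built with.
console.log(isAcceptedEuroByte(Buffer.from([0x80]))); // true
console.log(isAcceptedEuroByte(Buffer.from([0x3f]))); // true
console.log(isAcceptedEuroByte(Buffer.from([0xac]))); // false
```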

Labels: buffer, i18n-api, string_decoder
