Fix Unicode byte order mark documentation #1911

Merged 1 commit on Sep 26, 2024

@aaronfranke commented Sep 11, 2024

Hello, I came across this article and noticed several problems with it. https://learn.microsoft.com/en-us/windows/win32/intl/using-byte-order-marks

  • The statement "Unicode plain text is a sequence of 16-bit code values" is incorrect. Unicode text can be stored in several encodings, including UTF-8, UTF-16, and UTF-32. Unicode itself assigns each character a numeric code point, and many code points do not fit into 16 bits.

  • The statement "Microsoft uses UTF-16, little endian byte order." is incorrect. Some legacy Microsoft products such as Visual Studio use Windows-1252 by default. Some legacy Microsoft products use the name "Unicode" to refer to UCS-2, which is similar to UTF-16 but is a fixed-width 16-bit encoding restricted to the Basic Multilingual Plane. Modern Microsoft products such as Visual Studio Code and .NET use UTF-8 by default, and over 98% of websites use UTF-8, so the article should note that UTF-8 is recommended for new applications.

  • The statement "which informs an application receiving the file that the file is byte-ordered" is nonsense. All bytes in a file are in some order; there is no such thing as a file with unordered bytes. The byte order mark is useful for UTF-16 and UTF-32 to indicate whether the byte order is little-endian or big-endian, not whether the file is byte-ordered in general.
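
To make the points above concrete, here is a minimal Python sketch. It is an illustration only, not part of the proposed doc change. The helper name `sniff_bom` is hypothetical. It uses only the standard `codecs` constants, and it assumes the input really is Unicode text: a UTF-16 LE file whose first character is U+0000 would be misread as UTF-32 LE.

```python
import codecs

# Known BOM signatures. UTF-32 entries come first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# U+1F600 lies outside the Basic Multilingual Plane, so a single 16-bit
# code unit cannot hold it; UTF-16 needs a surrogate pair (two code units).
ch = "\U0001F600"
utf8 = ch.encode("utf-8")          # 4 bytes
utf16_le = ch.encode("utf-16-le")  # 4 bytes: surrogates D83D DE00, low byte first
utf16_be = ch.encode("utf-16-be")  # same code units, high byte first
utf32_le = ch.encode("utf-32-le")  # 4 bytes: the code point itself

if __name__ == "__main__":
    print(hex(ord(ch)), utf8.hex(), utf16_le.hex(), utf16_be.hex(), utf32_le.hex())
    print(sniff_bom(codecs.BOM_UTF16_BE + "A".encode("utf-16-be")))
    print(sniff_bom(b"plain ASCII, no BOM"))
```

The BOM here only distinguishes encodings and byte orders from one another. A file without a BOM is still ordered bytes; the reader simply has to learn the encoding some other way.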

This PR attempts to fix these problems. If further tweaks are required to the text, let me know and I can update the PR.

@aaronfranke: Thanks for your contribution! The author(s) have been notified to review your proposed change.

@Karl-Bridge-Microsoft merged commit fe5d2b6 into MicrosoftDocs:docs on Sep 26, 2024
1 check passed
@aaronfranke deleted the fix-bom-doc branch on September 26, 2024 22:50