Description
Treating Unicode strings as arrays often leads to bugs where code processes text in some languages correctly but not others. In JavaScript, it's surprising that "🤦🏼‍♂️".length == 7, and the advice to programmers is often: you usually don't want to look at .length, because it isn't reliably what end users think of as characters, it isn't reliably the number of code points, and it isn't reliably related to the display width of the string.
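To make the mismatch concrete, here is a minimal TypeScript sketch that counts the same emoji two ways. The string is written with escapes so the zero-width joiner and variation selector are visible; the helper name countCodePoints is made up for this illustration and is not part of any standard library.

```typescript
// The facepalm emoji as UTF-16 escapes:
// U+1F926 (surrogate pair), U+1F3FC (surrogate pair),
// U+200D zero-width joiner, U+2642 male sign, U+FE0F variation selector.
const facepalm: string = "\uD83E\uDD26\uD83C\uDFFC\u200D\u2642\uFE0F";

// Hypothetical helper: counts code points by treating each valid
// surrogate pair as one unit instead of two.
function countCodePoints(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      // Skip the low surrogate that completes the pair.
      if (next >= 0xdc00 && next <= 0xdfff) i++;
    }
    count++;
  }
  return count;
}

const codeUnits: number = facepalm.length;            // 7 UTF-16 code units
const codePoints: number = countCodePoints(facepalm); // 5 code points
```

An end user would call this one character, so neither number matches what they see.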
Similarly, function names borrowed from JS that use the term "Char", such as fromCharCode, are confusing to programmers coming from non-JS languages, since code units aren't always characters.
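A short sketch of why the "Char" in the name misleads, using JavaScript/TypeScript semantics (AssemblyScript's behavior may differ in details): fromCharCode takes a 16-bit code unit, so a code point above 0xFFFF is silently truncated, and producing U+1F926 takes two arguments.

```typescript
// Only the low 16 bits of the argument are kept: 0x1F926 becomes 0xF926.
const truncated: string = String.fromCharCode(0x1f926);

// U+1F926 needs both halves of its surrogate pair.
const pair: string = String.fromCharCode(0xd83e, 0xdd26);

const truncatedUnit: number = truncated.charCodeAt(0); // 0xF926, not 0x1F926
const pairLength: number = pair.length;                // 2 code units, 1 code point
```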
So, what if AssemblyScript moved the functions that work in terms of the underlying code-unit concept, such as charCodeAt, into a String.JS namespace, similar to the String.UTF16 namespace? They'd all remain available and easily accessible, but they'd be visually distinguished from the other string functions, making it clear where code-unit assumptions are being made. It would also leave more conceptual room in the base String namespace for new features in the future.
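As a rough illustration of how call sites might read, here is a TypeScript sketch. It uses a standalone object named StringJS as a stand-in, since the proposal doesn't pin down an exact API shape; the static-function style, the object name, and the member name codeUnitLength are all assumptions made for this example, not existing AssemblyScript APIs.

```typescript
// Hypothetical stand-in for the proposed String.JS namespace.
// Each member just forwards to the existing code-unit-based operation.
const StringJS = {
  // Returns the UTF-16 code unit at the given index.
  charCodeAt(s: string, index: number): number {
    return s.charCodeAt(index);
  },
  // Builds a one-code-unit string from a UTF-16 code unit.
  fromCharCode(unit: number): string {
    return String.fromCharCode(unit);
  },
  // Number of UTF-16 code units (what .length reports today).
  codeUnitLength(s: string): number {
    return s.length;
  },
};

// At the call site, the code-unit assumption is now visible in the name.
const firstUnit: number = StringJS.charCodeAt("héllo", 1); // 0xE9
const units: number = StringJS.codeUnitLength("héllo");    // 5
```

Whether these would be static functions or something else is an open design question; the point is only that the code-unit assumption shows up where the function is called.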
Another effect of the name String.JS could be to signal to programmers that these functions won't necessarily always be optimal or natural in non-JS embeddings of Wasm, which may give AssemblyScript as a language more implementation flexibility in non-JS environments.
All that said, I don't know where AssemblyScript stands on standard library API stability at this time. If breaking changes are out of scope, perhaps some of the above goals could at least be advanced through documentation.