WebAssembly · lygstate · Dec 30, 2025
diff --git a/proposals/stringref/Overview.md b/proposals/stringref/Overview.md
@@ -202,11 +202,11 @@ value reaches any instruction in this proposal.  The one exception is
 ### Creating strings
 
 ```
-(string.new_utf8 $memory ptr:address bytes:i32)
+(string.decode_from_utf8 $memory ptr:address bytes:i32)
   -> str:stringref
-(string.new_lossy_utf8 $memory ptr:address bytes:i32)
+(string.decode_from_lossy_utf8 $memory ptr:address bytes:i32)
   -> str:stringref
-(string.new_wtf8 $memory ptr:address bytes:i32)
+(string.decode_from_wtf8 $memory ptr:address bytes:i32)
   -> str:stringref
 ```
 Create a new string from the *`bytes`* bytes in memory at *`ptr`*.
@@ -215,22 +215,22 @@ Out-of-bounds access will trap.  The maximum value for *`bytes`* is
 
 These three instructions decode the bytes in three different ways:
 
- * `string.new_utf8` decodes using a strict UTF-8 decoder.  If the
+ * `string.decode_from_utf8` decodes using a strict UTF-8 decoder.  If the
     bytes are not valid UTF-8, trap.
 
- * `string.new_lossy_utf8` decodes using a sloppy UTF-8 decoder: all
+ * `string.decode_from_lossy_utf8` decodes using a sloppy UTF-8 decoder: all
    maximal subparts of an invalid subsequence are decoded as if they
    were `U+FFFD` (the replacement character) instead.  This instruction
    will never trap due to a decoding error.  See the section entitled
    "U+FFFD Substitution of Maximal Subparts" in the Unicode standard,
    version 14.0.0, page 126.
 
- * `string.new_wtf8` decodes using a strict WTF-8 decoder, which is like
+ * `string.decode_from_wtf8` decodes using a strict WTF-8 decoder, which is like
    UTF-8 but also allows isolated surrogates.  If the bytes are not
    valid WTF-8, trap.
 
 ```
-(string.new_wtf16 $memory ptr:address codeunits:i32)
+(string.decode_from_wtf16 $memory ptr:address codeunits:i32)
   -> str:stringref
 ```
 Create a new string from the *`codeunits`* code units encoded in memory at
@@ -240,14 +240,14 @@ is 2<sup>30</sup>–1; passing a higher value traps.  Each code unit is
 read from memory as if with `i32.load16`, and is therefore decoded
 using little-endian byte order.
 
-#### `string.new` size limits
+#### `string.decode_from_*` size limits
 
 Creating a string is a form of dynamic allocation and can fail.  The
 same implementation running on different machines can have different
 behaviors.  The specification can only say that byte/code-unit sizes
 above a certain limit *must* fail; but for sizes within the limits, the
 allocations *may* fail.  If an allocation fails, the implementation must
-trap.  Fallible `string.new` is a possible future extension.
+trap.  Fallible `string.decode_from_*` is a possible future extension.
 
 ### String literals
 
@@ -281,7 +281,7 @@ string literal section as a future extension.
 
 The maximum size for the WTF-8 encoding of an individual string literal
 is 2<sup>31</sup>–1 bytes.  Embeddings may impose their own limits which
-are more restricted.  But similarly to `string.new_wtf8`, instantiating
+are more restricted.  But similarly to `string.decode_from_wtf8`, instantiating
 a module with string literals may fail due to lack of memory resources,
 even if the string size is formally within the limits.  However
 `string.const` itself never traps when passed a valid literal offset.
@@ -331,7 +331,7 @@ is 2<sup>30</sup>-1.  If an encoding would require more code units than
 the limit, the result is -1.
 
 ```
-(string.encode_utf8 $memory str:stringref ptr:address)
+(string.encode_to_utf8 $memory str:stringref ptr:address)
   -> codeunits:i32
 ```
 Encode the contents of the string *`str`* as UTF-8 to memory at *ptr*.
@@ -340,11 +340,11 @@ written, which will be the same as returned by the corresponding
 `string.measure_utf8`.
 
 The maximum number of bytes that can be encoded at once by
-`string.encode` is 2<sup>31</sup>-1.  If an encoding would require more
+`string.encode_to_utf8` is 2<sup>31</sup>-1.  If an encoding would require more
 bytes, it is as if the codepoints can't be encoded (a trap).
 
 ```
-(string.encode_lossy_utf8 $memory str:stringref ptr:address)
+(string.encode_to_lossy_utf8 $memory str:stringref ptr:address)
   -> codeunits:i32
 ```
 Encode the contents of the string *`str`* as UTF-8 to memory at *`ptr`*.
@@ -353,23 +353,23 @@ character) instead.  Return the number of code units written, which will
 be the same as returned by the corresponding `string.measure_wtf8`.
 
 The maximum number of bytes that can be encoded at once by
-`string.encode` is 2<sup>31</sup>-1.  If an encoding would require more
+`string.encode_to_lossy_utf8` is 2<sup>31</sup>-1.  If an encoding would require more
 bytes, it is as if the codepoints can't be encoded (a trap).
 
 ```
-(string.encode_wtf8 $memory str:stringref ptr:address)
+(string.encode_to_wtf8 $memory str:stringref ptr:address)
   -> codeunits:i32
 ```
 Encode the contents of the string *`str`* as WTF-8 to memory at *`ptr`*.
 Return the number of code units written, which will be the same as
 returned by the corresponding `string.measure_wtf8`.
 
 The maximum number of bytes that can be encoded at once by
-`string.encode` is 2<sup>31</sup>-1.  If an encoding would require more
+`string.encode_to_wtf8` is 2<sup>31</sup>-1.  If an encoding would require more
 bytes, it is as if the codepoints can't be encoded (a trap).
 
 ```
-(string.encode_wtf16 $memory str:stringref ptr:address)
+(string.encode_to_wtf16 $memory str:stringref ptr:address)
   -> codeunits:i32
 ```
 Encode the contents of the string *`str`* as WTF-16 to memory at
@@ -380,7 +380,7 @@ Each code unit is written to memory as if stored by `i32.store16`, so
 WTF-16 code units are in little-endian byte order.
 
 The maximum number of bytes that can be encoded at once by
-`string.encode` is 2<sup>31</sup>-1.  If an encoding would require more
+`string.encode_to_wtf16` is 2<sup>31</sup>-1.  If an encoding would require more
 bytes, it is as if the codepoints can't be encoded (a trap).
 
 ### Concatenation
@@ -603,26 +603,26 @@ The instructions below shall be available in WebAssembly implementations
 that support both GC and stringrefs.
 
 ```
-(string.new_utf8_array codeunits:$t start:i32 end:i32)
+(string.decode_from_utf8_array codeunits:$t start:i32 end:i32)
   if expand($t) => array i8
   -> str:stringref
-(string.new_lossy_utf8_array codeunits:$t start:i32 end:i32)
+(string.decode_from_lossy_utf8_array codeunits:$t start:i32 end:i32)
   if expand($t) => array i8
   -> str:stringref
-(string.new_wtf8_array codeunits:$t start:i32 end:i32)
+(string.decode_from_wtf8_array codeunits:$t start:i32 end:i32)
   if expand($t) => array i8
   -> str:stringref
 ```
 Create a new string from a subsequence of the *`codeunits`* bytes in a
 GC-managed array, starting from offset *`start`* and continuing to but
 not including *`end`*.  If *`end`* is less than *`start`* or is greater
 than the array length, trap.  The bytes are decoded in the same way as
-`string.new_utf8`, `string.new_lossy_utf8`, and `string.new_wtf8`,
+`string.decode_from_utf8`, `string.decode_from_lossy_utf8`, and `string.decode_from_wtf8`,
 respectively.  The maximum value for *`end`*–*`start`* is
 2<sup>31</sup>–1; passing a higher value traps.
 
 ```
-(string.new_wtf16_array codeunits:$t start:i32 end:i32)
+(string.decode_from_wtf16_array codeunits:$t start:i32 end:i32)
   if expand($t) => array i16
   -> str:stringref
 ```
@@ -634,16 +634,16 @@ for *`end`*–*`start`* is 2<sup>30</sup>–1; passing a higher value
 traps.
 
 ```
-(string.encode_utf8_array str:stringref array:$t start:i32)
+(string.encode_to_utf8_array str:stringref array:$t start:i32)
   if expand($t) => array (mut i8)
   -> codeunits:i32
-(string.encode_lossy_utf8_array str:stringref array:$t start:i32)
+(string.encode_to_lossy_utf8_array str:stringref array:$t start:i32)
   if expand($t) => array (mut i8)
   -> codeunits:i32
-(string.encode_wtf8_array str:stringref array:$t start:i32)
+(string.encode_to_wtf8_array str:stringref array:$t start:i32)
   if expand($t) => array (mut i8)
   -> codeunits:i32
-(string.encode_wtf16_array str:stringref array:$t start:i32)
+(string.encode_to_wtf16_array str:stringref array:$t start:i32)
   if expand($t) => array (mut i16)
   -> codeunits:i32
 ```
@@ -655,8 +655,8 @@ same as the result of a the corresponding `string.measure_wtf8` or
 code units in the array, trap.  Note that no `NUL` terminator is ever
 written.
 
-For `string.encode_utf8_array`, trap if an isolated surrogate is seen.
-For `string.encode_lossy_utf8_array`, replace isolated surrogates with
+For `string.encode_to_utf8_array`, trap if an isolated surrogate is seen.
+For `string.encode_to_lossy_utf8_array`, replace isolated surrogates with
 `U+FFFD`.
 
 ## Binary encoding
@@ -669,21 +669,21 @@ reftype ::= ...
          |  0x61 ⇒ stringview_iter   ; SLEB128(-0x1f)
 
 instr ::= ...
-       |  0xfb 0x80:u32 $mem:u32       ⇒ string.new_utf8 $mem
-       |  0xfb 0x81:u32 $mem:u32       ⇒ string.new_wtf16 $mem
+       |  0xfb 0x80:u32 $mem:u32       ⇒ string.decode_from_utf8 $mem
+       |  0xfb 0x81:u32 $mem:u32       ⇒ string.decode_from_wtf16 $mem
        |  0xfb 0x82:u32 $idx:u32       ⇒ string.const $idx
        |  0xfb 0x83:u32                ⇒ string.measure_utf8
        |  0xfb 0x84:u32                ⇒ string.measure_wtf8
        |  0xfb 0x85:u32                ⇒ string.measure_wtf16
-       |  0xfb 0x86:u32 $mem:u32       ⇒ string.encode_utf8 $mem
-       |  0xfb 0x87:u32 $mem:u32       ⇒ string.encode_wtf16 $mem
+       |  0xfb 0x86:u32 $mem:u32       ⇒ string.encode_to_utf8 $mem
+       |  0xfb 0x87:u32 $mem:u32       ⇒ string.encode_to_wtf16 $mem
        |  0xfb 0x88:u32                ⇒ string.concat
        |  0xfb 0x89:u32                ⇒ string.eq
        |  0xfb 0x8a:u32                ⇒ string.is_usv_sequence
-       |  0xfb 0x8b:u32 $mem:u32       ⇒ string.new_lossy_utf8 $mem
-       |  0xfb 0x8c:u32 $mem:u32       ⇒ string.new_wtf8 $mem
-       |  0xfb 0x8d:u32 $mem:u32       ⇒ string.encode_lossy_utf8 $mem
-       |  0xfb 0x8e:u32 $mem:u32       ⇒ string.encode_wtf8 $mem
+       |  0xfb 0x8b:u32 $mem:u32       ⇒ string.decode_from_lossy_utf8 $mem
+       |  0xfb 0x8c:u32 $mem:u32       ⇒ string.decode_from_wtf8 $mem
+       |  0xfb 0x8d:u32 $mem:u32       ⇒ string.encode_to_lossy_utf8 $mem
+       |  0xfb 0x8e:u32 $mem:u32       ⇒ string.encode_to_wtf8 $mem
        |  0xfb 0x90:u32                ⇒ string.as_wtf8
        |  0xfb 0x91:u32                ⇒ stringview_wtf8.advance
        |  0xfb 0x92:u32 $mem:u32       ⇒ stringview_wtf8.encode_utf8 $mem
@@ -700,14 +700,14 @@ instr ::= ...
        |  0xfb 0xa2:u32                ⇒ stringview_iter.advance
        |  0xfb 0xa3:u32                ⇒ stringview_iter.rewind
        |  0xfb 0xa4:u32                ⇒ stringview_iter.slice
-       |  0xfb 0xb0:u32           [gc] ⇒ string.new_utf8_array
-       |  0xfb 0xb1:u32           [gc] ⇒ string.new_wtf16_array
-       |  0xfb 0xb2:u32           [gc] ⇒ string.encode_utf8_array
-       |  0xfb 0xb3:u32           [gc] ⇒ string.encode_wtf16_array
-       |  0xfb 0xb4:u32           [gc] ⇒ string.new_lossy_utf8_array
-       |  0xfb 0xb5:u32           [gc] ⇒ string.new_wtf8_array
-       |  0xfb 0xb6:u32           [gc] ⇒ string.encode_lossy_utf8_array
-       |  0xfb 0xb7:u32           [gc] ⇒ string.encode_wtf8_array
+       |  0xfb 0xb0:u32           [gc] ⇒ string.decode_from_utf8_array
+       |  0xfb 0xb1:u32           [gc] ⇒ string.decode_from_wtf16_array
+       |  0xfb 0xb2:u32           [gc] ⇒ string.encode_to_utf8_array
+       |  0xfb 0xb3:u32           [gc] ⇒ string.encode_to_wtf16_array
+       |  0xfb 0xb4:u32           [gc] ⇒ string.decode_from_lossy_utf8_array
+       |  0xfb 0xb5:u32           [gc] ⇒ string.decode_from_wtf8_array
+       |  0xfb 0xb6:u32           [gc] ⇒ string.encode_to_lossy_utf8_array
+       |  0xfb 0xb7:u32           [gc] ⇒ string.encode_to_wtf8_array
 
 ;; New section.  If present, must be present only once, and right before
 ;; the globals section (or where the globals section would be).  Each
@@ -733,11 +733,11 @@ operand allows you to elide the memory, in which case it defaults to 0.
   local.get $ptr
   local.get $ptr
   call $strlen
-  string.new_utf8)
+  string.decode_from_utf8)
 ```
 
 If the bytes being decoded aren't actually valid UTF-8, this function
-will trap.  Use `string.new_lossy_utf8` in contexts where replacing
+will trap.  Use `string.decode_from_lossy_utf8` in contexts where replacing
 invalid data with `U+FFFD` is a better strategy than trapping.
 
 ### Make string from an array of WTF-8 code units in memory
@@ -746,20 +746,20 @@ invalid data with `U+FFFD` is a better strategy than trapping.
 (func $string-from-wtf8n (param $ptr i32) (param $len i32) (result stringref)
   local.get $ptr
   local.get $len
-  string.new_wtf8)
+  string.decode_from_wtf8)
 ```
 
-Note that `string.new_wtf8` (and `string.new_wtf8_array`) are always
+Note that `string.decode_from_wtf8` (and `string.decode_from_wtf8_array`) are always
 strict decoders: if the bytes are not valid WTF-8, the instruction
 traps.
 
-### Make string from UTF-16 in memory
+### Make string from WTF-16 in memory
 
 ```wasm
-(func $string-from-utf16 (param $ptr i32) (param $units i32) (result stringref)
+(func $string-from-wtf16n (param $ptr i32) (param $units i32) (result stringref)
   local.get $ptr
   local.get $units
-  string.new_wtf16)
+  string.decode_from_wtf16)
 ```
 
 This proposal doesn't distinguish between UTF-16 and WTF-16 at all;
@@ -971,7 +971,7 @@ open to considering adding more instructions.
 
   local.get $str
   local.get $ptr
-  string.encode_utf8        ;; push bytes written, same as $len
+  string.encode_to_utf8     ;; push bytes written, same as $len
 
   local.get $ptr
   i32.add
@@ -986,8 +986,8 @@ Using `string.measure_utf8` ensures that the encoded string is a valid
 unicode scalar value sequence.  How to handle invalid UTF-8 is up to the
 user; instead of `unreachable` we could throw an exception.
 
-Note that in this case, the subsequent `string.encode_utf8` could just
-as well have been `string.encode_lossy_utf8` or `string.encode_wtf8`, as
+Note that in this case, the subsequent `string.encode_to_utf8` could just
+as well have been `string.encode_to_lossy_utf8` or `string.encode_to_wtf8`, as
 these instructions are all the same for strings that do not contain
 isolated surrogates, and we checked that there were none.
 
@@ -1012,7 +1012,7 @@ will encode isolated surrogates as WTF-8.
     local.get $cursor
     global.get $buf
     i32.const 1024
-    string.encode_wtf8               ;; push bytes written
+    string.encode_to_wtf8            ;; push bytes written
     local.tee $bytes
     (if i32.eqz (then return))       ;; if no bytes encoded, done
     local.get $bytes
@@ -1445,7 +1445,7 @@ faster than `externref`+imports:
     predictable performance than e.g. an encoder implemented in JS (for
     web embeddings).
  4. Reading string contents, either via
-    `string.encode_wtf8`-then-process-inline or via `stringview_wtf16`,
+    `string.encode_to_wtf8`-then-process-inline or via `stringview_wtf16`,
     is likely faster than calling out to JavaScript to read code units
     one at a time.  WebAssembly-to-JavaScript calls are cheap but not
     free.
@@ -1506,8 +1506,8 @@ concrete adapter function specialized to the data representations used
 by the caller and the callee.  The instruction set in this proposal can
 be used to implement the adapter function for passing a `stringref` as a
 string; assuming that the adapter function is generated in such a way
-that it has access to the target memory, `string.encode_wtf8` can
-implement the copy and validation at the same time.  `string.new_wtf8`
+that it has access to the target memory, `string.encode_to_wtf8` can
+implement the copy and validation at the same time.  `string.decode_from_wtf8`
 would be the implementation of getting a `stringref` from an
 interface-typed string value, again assuming UTF-8 encoding for these
 values.