@@ -69,8 +69,10 @@ bufnr([{buf} [, {create}]]) Number Number of the buffer {buf}
69
69
bufwinid({buf} ) Number window ID of buffer {buf}
70
70
bufwinnr({buf} ) Number window number of buffer {buf}
71
71
byte2line({byte} ) Number line number at byte count {byte}
72
- byteidx({expr} , {nr} ) Number byte index of {nr} th char in {expr}
73
- byteidxcomp({expr} , {nr} ) Number byte index of {nr} th char in {expr}
72
+ byteidx({expr} , {nr} [, {utf16} ])
73
+ Number byte index of {nr} th char in {expr}
74
+ byteidxcomp({expr} , {nr} [, {utf16} ])
75
+ Number byte index of {nr} th char in {expr}
74
76
call({func} , {arglist} [, {dict} ])
75
77
any call {func} with arguments {arglist}
76
78
ceil({expr} ) Float round {expr} up
@@ -80,7 +82,7 @@ chansend({id}, {data}) Number Writes {data} to channel
80
82
char2nr({expr} [, {utf8} ]) Number ASCII/UTF-8 value of first char in {expr}
81
83
charclass({string} ) Number character class of {string}
82
84
charcol({expr} [, {winid} ]) Number column number of cursor or mark
83
- charidx({string} , {idx} [, {countcc} ])
85
+ charidx({string} , {idx} [, {countcc} [, {utf16} ] ])
84
86
Number char index of byte {idx} in {string}
85
87
chdir({dir} ) String change current working directory
86
88
cindent({lnum} ) Number C indent for line {lnum}
@@ -501,6 +503,8 @@ strptime({format}, {timestring})
501
503
strridx({haystack} , {needle} [, {start} ])
502
504
Number last index of {needle} in {haystack}
503
505
strtrans({expr} ) String translate string to make it printable
506
+ strutf16len({string} [, {countcc} ])
507
+ Number number of UTF-16 code units in {string}
504
508
strwidth({expr} ) Number display cell length of the String {expr}
505
509
submatch({nr} [, {list} ]) String or List
506
510
specific match in ":s" or substitute()
@@ -545,6 +549,8 @@ undofile({name}) String undo file name for {name}
545
549
undotree() List undo file tree
546
550
uniq({list} [, {func} [, {dict} ]])
547
551
List remove adjacent duplicates from a list
552
+ utf16idx({string} , {idx} [, {countcc} [, {charidx} ]])
553
+ Number UTF-16 index of byte {idx} in {string}
548
554
values({dict} ) List values in {dict}
549
555
virtcol({expr} [, {list} ]) Number or List
550
556
screen column of cursor or mark
@@ -982,7 +988,7 @@ byte2line({byte}) *byte2line()*
982
988
Can also be used as a | method | : >
983
989
GetOffset()->byte2line()
984
990
985
- byteidx({expr} , {nr} ) *byteidx()*
991
+ byteidx({expr} , {nr} [, {utf16} ]) *byteidx()*
986
992
Return byte index of the {nr} th character in the String
987
993
{expr} . Use zero for the first character, it then returns
988
994
zero.
@@ -992,6 +998,13 @@ byteidx({expr}, {nr}) *byteidx()*
992
998
length is added to the preceding base character. See
993
999
| byteidxcomp() | below for counting composing characters
994
1000
separately.
1001
+ When {utf16} is present and TRUE, {nr} is used as the UTF-16
1002
+ index in the String {expr} instead of as the character index.
1003
+ The UTF-16 index is the index in the string when it is encoded
1004
+ with 16-bit words. If the specified UTF-16 index is in the
1005
+ middle of a character (e.g. in a 4-byte character), then the
1006
+ byte index of the first byte in the character is returned.
1007
+ Refer to | string-offset-encoding | for more information.
995
1008
Example : >
996
1009
echo matchstr(str, ".", byteidx(str, 3))
997
1010
< will display the fourth character. Another way to do the
@@ -1003,11 +1016,17 @@ byteidx({expr}, {nr}) *byteidx()*
1003
1016
If there are fewer than {nr} characters, -1 is returned.
1004
1017
If there are exactly {nr} characters the length of the string
1005
1018
in bytes is returned.
1006
-
1019
+ See | charidx() | and | utf16idx() | for getting the character and
1020
+ UTF-16 index respectively from the byte index.
1021
+ Examples: >
1022
+ echo byteidx('a😊😊', 2) returns 5
1023
+ echo byteidx('a😊😊', 2, 1) returns 1
1024
+ echo byteidx('a😊😊', 3, 1) returns 5
1025
+ <
1007
1026
Can also be used as a | method | : >
1008
1027
GetName()->byteidx(idx)
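The {utf16} lookup described above can be checked outside Vim. The following is a minimal Python sketch of that mapping, not Vim's implementation. The helper name `byte_index_from_utf16` is hypothetical. It treats every code point as its own character, so composing characters are not folded into their base. The end-of-string and out-of-range results are assumptions that mirror the character-index rule.

```python
def byte_index_from_utf16(text: str, utf16_index: int) -> int:
    """Model of byteidx(text, utf16_index, 1) for strings without composing chars."""
    byte_pos = 0   # UTF-8 byte offset of the current character
    unit_pos = 0   # UTF-16 code unit offset of the current character
    for ch in text:
        # Code points above U+FFFF need a surrogate pair (two 16-bit units).
        units = 2 if ord(ch) > 0xFFFF else 1
        if unit_pos <= utf16_index < unit_pos + units:
            # An index in the middle of a character maps to its first byte.
            return byte_pos
        unit_pos += units
        byte_pos += len(ch.encode("utf-8"))
    # Assumption: an index equal to the total length gives the byte length,
    # anything else is out of range.
    return byte_pos if utf16_index == unit_pos else -1


s = "a\U0001F60A\U0001F60A"  # 'a' followed by two U+1F60A emoji
print(byte_index_from_utf16(s, 2))  # 1, matches byteidx('a😊😊', 2, 1)
print(byte_index_from_utf16(s, 3))  # 5, matches byteidx('a😊😊', 3, 1)
```

The two printed values agree with the documented examples: UTF-16 index 2 falls inside the first emoji, so the byte index of that emoji's first byte is returned.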
1009
1028
1010
- byteidxcomp({expr} , {nr} ) *byteidxcomp()*
1029
+ byteidxcomp({expr} , {nr} [, {utf16} ]) *byteidxcomp()*
1011
1030
Like byteidx(), except that a composing character is counted
1012
1031
as a separate character. Example: >
1013
1032
let s = 'e' .. nr2char(0x301)
@@ -1131,27 +1150,36 @@ charcol({expr} [, {winid}]) *charcol()*
1131
1150
GetPos()->col()
1132
1151
<
1133
1152
*charidx()*
1134
- charidx({string} , {idx} [, {countcc} ])
1153
+ charidx({string} , {idx} [, {countcc} [, {utf16} ] ])
1135
1154
Return the character index of the byte at {idx} in {string} .
1136
1155
The index of the first character is zero.
1137
1156
If there are no multibyte characters the returned value is
1138
1157
equal to {idx} .
1158
+
1139
1159
When {countcc} is omitted or | FALSE | , then composing characters
1140
- are not counted separately, their byte length is
1141
- added to the preceding base character.
1160
+ are not counted separately, their byte length is added to the
1161
+ preceding base character.
1142
1162
When {countcc} is | TRUE | , then composing characters are
1143
1163
counted as separate characters.
1164
+
1165
+ When {utf16} is present and TRUE, {idx} is used as the UTF-16
1166
+ index in the String {string} instead of as the byte index.
1167
+
1144
1168
Returns -1 if the arguments are invalid or if {idx} is greater
1145
1169
than the index of the last byte in {string} . An error is
1146
1170
given if the first argument is not a string, the second
1147
1171
argument is not a number or when the third argument is present
1148
1172
and is not zero or one.
1173
+
1149
1174
See | byteidx() | and | byteidxcomp() | for getting the byte index
1150
- from the character index.
1175
+ from the character index and | utf16idx() | for getting the
1176
+ UTF-16 index from the character index.
1177
+ Refer to | string-offset-encoding | for more information.
1151
1178
Examples: >
1152
1179
echo charidx('áb́ć', 3) returns 1
1153
1180
echo charidx('áb́ć', 6, 1) returns 4
1154
1181
echo charidx('áb́ć', 16) returns -1
1182
+ echo charidx('a😊😊', 4, 0, 1) returns 2
1155
1183
<
1156
1184
Can also be used as a | method | : >
1157
1185
GetName()->charidx(idx)
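The {utf16} variant of charidx() maps a UTF-16 code unit index to a character index. The following is a rough Python model of that mapping, not Vim's implementation. The helper name `char_index_from_utf16` is hypothetical. Each code point counts as one character, which corresponds to {countcc} being TRUE. Returning -1 for any out-of-range index is an assumption.

```python
def char_index_from_utf16(text: str, utf16_index: int) -> int:
    """Model of charidx(text, utf16_index, 1, 1)."""
    unit_pos = 0  # UTF-16 code unit offset of the current character
    for char_index, ch in enumerate(text):
        # Code points above U+FFFF occupy two UTF-16 code units.
        units = 2 if ord(ch) > 0xFFFF else 1
        if unit_pos <= utf16_index < unit_pos + units:
            return char_index
        unit_pos += units
    return -1  # assumption: index past the last code unit


s = "a\U0001F60A\U0001F60A"  # 'a' followed by two U+1F60A emoji
print(char_index_from_utf16(s, 4))  # 2, matches charidx('a😊😊', 4, 0, 1)
```

UTF-16 index 4 is the second half of the second emoji's surrogate pair, so the model reports character index 2, as in the documented example.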
@@ -8332,6 +8360,28 @@ strtrans({string}) *strtrans()*
8332
8360
Can also be used as a | method | : >
8333
8361
GetString()->strtrans()
8334
8362
8363
+ strutf16len({string} [, {countcc} ]) *strutf16len()*
8364
+ The result is a Number, which is the number of UTF-16 code
8365
+ units in String {string} (after converting it to UTF-16).
8366
+
8367
+ When {countcc} is TRUE, composing characters are counted
8368
+ separately.
8369
+ When {countcc} is omitted or FALSE, composing characters are
8370
+ ignored.
8371
+
8372
+ Returns zero on error.
8373
+
8374
+ Also see | strlen() | and | strcharlen() | .
8375
+ Examples: >
8376
+ echo strutf16len('a') returns 1
8377
+ echo strutf16len('©') returns 1
8378
+ echo strutf16len('😊') returns 2
8379
+ echo strutf16len('ą́') returns 1
8380
+ echo strutf16len('ą́', v:true) returns 3
8381
+ <
8382
+ Can also be used as a |method|: >
8383
+ GetText()->strutf16len()
8384
+ <
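The counting rule for strutf16len() can be approximated outside Vim. The following is a Python sketch under stated assumptions, not Vim's implementation. The helper name `utf16_len` is hypothetical. `unicodedata.combining()` stands in for Vim's own notion of a composing character, which may differ for some code points. The documented 'ą́' is assumed to be a base letter followed by two combining marks.

```python
import unicodedata


def utf16_len(text: str, count_composing: bool = False) -> int:
    """Count UTF-16 code units, optionally skipping combining characters."""
    total = 0
    for ch in text:
        # Approximation: a nonzero combining class marks a composing character.
        if not count_composing and unicodedata.combining(ch):
            continue
        # Code points above U+FFFF need a surrogate pair (two code units).
        total += 2 if ord(ch) > 0xFFFF else 1
    return total


composed = "a\u0328\u0301"  # assumed form of the documented 'ą́'
print(utf16_len("a"))             # 1
print(utf16_len("\U0001F60A"))    # 2
print(utf16_len(composed))        # 1
print(utf16_len(composed, True))  # 3
```

With {countcc} TRUE the result equals the length of the UTF-16 encoding in 16-bit units. With it FALSE the combining marks contribute nothing.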
8335
8385
strwidth({string} ) *strwidth()*
8336
8386
The result is a Number, which is the number of display cells
8337
8387
String {string} occupies. A Tab character is counted as one
@@ -9063,6 +9113,34 @@ uniq({list} [, {func} [, {dict}]]) *uniq()* *E882*
9063
9113
9064
9114
Can also be used as a | method | : >
9065
9115
mylist->uniq()
9116
+ <
9117
+ *utf16idx()*
9118
+ utf16idx({string} , {idx} [, {countcc} [, {charidx} ]])
9119
+ Same as | charidx() | but returns the UTF-16 index of the byte
9120
+ at {idx} in {string} (after converting it to UTF-16).
9121
+
9122
+ When {charidx} is present and TRUE, {idx} is used as the
9123
+ character index in the String {string} instead of as the byte
9124
+ index.
9125
+ An {idx} in the middle of a UTF-8 sequence is rounded upwards
9126
+ to the end of that sequence.
9127
+
9128
+ See | byteidx() | and | byteidxcomp() | for getting the byte index
9129
+ from the UTF-16 index and | charidx() | for getting the
9130
+ character index from the UTF-16 index.
9131
+ Refer to | string-offset-encoding | for more information.
9132
+ Examples: >
9133
+ echo utf16idx('a😊😊', 3) returns 2
9134
+ echo utf16idx('a😊😊', 7) returns 4
9135
+ echo utf16idx('a😊😊', 1, 0, 1) returns 2
9136
+ echo utf16idx('a😊😊', 2, 0, 1) returns 4
9137
+ echo utf16idx('aą́c', 6) returns 2
9138
+ echo utf16idx('aą́c', 6, 1) returns 4
9139
+ echo utf16idx('a😊😊', 9) returns -1
9140
+ <
9141
+ Can also be used as a | method | : >
9142
+ GetName()->utf16idx(idx)
9143
+
9066
9144
9067
9145
values({dict} ) *values()*
9068
9146
Return a | List | with all the values of {dict} . The | List | is