Commit d93ddbe
automata: add internal HalfMatch APIs for NFA engines
Welp, okay, turns out we do need to know at least the end offset of a
match even when the NFA has no capture states. This is necessary for
correctly handling the case where a regex can match the empty string but
the caller has asked that matches not split a codepoint. If we don't
know the end offset of a match, then we can't correctly determine
whether a match exists or not and are forced to return no match even
when a match exists. We can get away with this, I think, for `find`-style
APIs where the caller has specifically requested match offsets while
simultaneously configuring the NFA to not track offsets, but with
`is_match`-style APIs, we really should be able to handle it correctly.
We should eventually just expose the `HalfMatch` APIs on `PikeVM` and
`BoundedBacktracker`, but for now we keep them private.
1 parent e003cae commit d93ddbe
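To make the empty-match case in the message concrete, here is a minimal sketch of why an `is_match`-style routine needs the end offset. Everything in it is an assumption for illustration: the `HalfMatch` struct is a simplified stand-in rather than the crate's actual type, and `search_empty` and `is_match_utf8` are hypothetical helpers, not the private APIs added by this commit. The toy engine models a regex that matches the empty string at every offset in the span.

```rust
/// Simplified stand-in for a half match: only the end offset is known.
/// (Hypothetical; not the real regex-automata type.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct HalfMatch {
    end: usize,
}

/// Hypothetical toy engine for a regex that matches the empty string.
/// It reports an empty match at `start` whenever `start..end` is a valid span.
fn search_empty(haystack: &[u8], start: usize, end: usize) -> Option<HalfMatch> {
    if start <= end && end <= haystack.len() {
        Some(HalfMatch { end: start })
    } else {
        None
    }
}

/// `is_match`-style search that refuses matches splitting a codepoint.
/// The end offset is what lets us tell a valid empty match from an
/// invalid one; a bare yes/no answer from the engine would not suffice.
fn is_match_utf8(haystack: &str, start: usize, end: usize) -> bool {
    let mut at = start;
    while let Some(hm) = search_empty(haystack.as_bytes(), at, end) {
        if haystack.is_char_boundary(hm.end) {
            return true;
        }
        // The empty match fell inside a codepoint: skip it and retry
        // one byte later.
        at = hm.end + 1;
    }
    false
}
```

With the 3-byte haystack `"☃"`, a search over the span `1..2` only ever sees offsets inside the codepoint and so reports no match, while spans `0..3` and `1..3` both contain a valid boundary and report a match. Without the end offset, the wrapper could not distinguish these cases.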
2 files changed, with 49 additions and 70 deletions