fix(core): indexing problem in Regexish work #883
This is somewhat critical, so I will be merging this relatively quickly.
db0b836 to c2c89f1
Thanks for pinging me! I think I've been seeing evidence of this today. I wonder why I didn't see it while I was working on it. Was something missing from my tests? From looking at your changes, it was right in the place with `assert!(matches!(`. I still don't really understand those bits of code well, by the way, if somebody can explain them better.
@@ -424,13 +424,13 @@ mod tests {
     #[test]
     fn lexes_youtube_as_hostname() {
         let source: Vec<_> = "YouTube.com".chars().collect();
-        assert!(matches!(
+        assert_eq!(
@hippietrail, the real issue was not the `assert!(matches!`. It was the fact that you didn't check the output index. This is a common problem with asking LLMs to generate test code: they make stuff that's similar to what's already there rather than improving coverage in a way that matters.
The bigger issue comes from the fact that I didn't review your code as carefully as I should have. That's on me.
It's when we zoom out a tiny bit further that I had trouble grokking the syntax:
-        assert!(matches!(
+        assert_eq!(
             lex_token(&source),
             Some(FoundToken {
                 token: TokenKind::Regexish,
-                ..
+                next_index: 3
             })
-        ));
+        );
There's some kind of struct destructuring along with optional destructuring or something, and that `..`, which I mimicked from other code without really understanding it, where it would've apparently been `next_index: source.len()`.
I understand the logic where the size of the token needs to be added while doing the lexing. I've done lexing in a few languages over the years.
Oh I understand. `matches!` is a pattern-matching macro. It follows Rust's `match` syntax.
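To make the difference concrete, here is a minimal sketch of how `..` behaves inside a `matches!` pattern compared with a full `assert_eq!`. The `TokenKind` and `FoundToken` definitions below are hypothetical stand-ins built only from the names in the test above (the real types in the crate likely carry more fields and variants), and `kind_matches` is a helper invented for this illustration.

```rust
// Hypothetical stand-ins for the types named in the test; the real
// definitions in the crate may differ.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Regexish,
}

#[derive(Debug, PartialEq)]
struct FoundToken {
    token: TokenKind,
    next_index: usize,
}

// Hypothetical helper: true when the result is a Regexish token.
// The `..` in the struct pattern means "ignore every field not listed",
// so `next_index` is never inspected here.
fn kind_matches(found: &Option<FoundToken>) -> bool {
    matches!(
        found,
        Some(FoundToken {
            token: TokenKind::Regexish,
            ..
        })
    )
}

fn main() {
    // A result whose `next_index` is deliberately wrong for a 3-character source.
    let found = Some(FoundToken {
        token: TokenKind::Regexish,
        next_index: 4,
    });

    // The pattern-based check passes anyway, because of `..`.
    assert!(kind_matches(&found));

    // A whole-value comparison does look at `next_index`, so the wrong
    // index is detected.
    assert_ne!(
        found,
        Some(FoundToken {
            token: TokenKind::Regexish,
            next_index: 3,
        })
    );
}
```

So the pattern form is fine when only the token kind matters, while `assert_eq!` (which needs `PartialEq` and `Debug` on the types) is the stricter choice when the index matters too.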
Good resources:
Issues
This PR solves a bug introduced by #669. Pinging @hippietrail for visibility.
Description
The Regexish PR neglected to include indexing in its unit test suite. As a result, it could emit tokens with indices that lay outside the document, which caused character Span fetches to fail.
This PR fixes the off-by-one error and improves test coverage to match.
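As a rough illustration of the kind of test that catches this class of bug, the sketch below uses a toy lexer, not the actual Harper code. `lex_regexish`, the bracket-matching rule, and the type definitions are all assumptions made for the example; only the names `FoundToken`, `TokenKind::Regexish`, and `next_index` come from the discussion above.

```rust
// Hypothetical stand-ins; the real types in the crate may differ.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Regexish,
}

#[derive(Debug, PartialEq)]
struct FoundToken {
    token: TokenKind,
    next_index: usize,
}

/// Toy lexer (an assumption for illustration): recognizes a bracketed run
/// such as `[a]` at the start of `source`. `next_index` is one past the
/// closing bracket, so it can never exceed `source.len()`.
fn lex_regexish(source: &[char]) -> Option<FoundToken> {
    if source.first() != Some(&'[') {
        return None;
    }
    let close = source.iter().position(|&c| c == ']')?;
    Some(FoundToken {
        token: TokenKind::Regexish,
        next_index: close + 1,
    })
}

fn main() {
    let source: Vec<char> = "[a]".chars().collect();
    let found = lex_regexish(&source).expect("should lex a Regexish token");

    // Check the index as well as the kind.
    assert_eq!(
        found,
        FoundToken {
            token: TokenKind::Regexish,
            next_index: source.len(),
        }
    );

    // The token's characters can be fetched without leaving the document.
    assert!(source.get(..found.next_index).is_some());

    // An index that is one too large falls outside the document.
    assert!(source.get(..found.next_index + 1).is_none());
}
```

Using `slice::get` rather than direct indexing keeps the out-of-bounds case observable as `None` instead of a panic, which makes it easy to assert on.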
Checklist