-
Notifications
You must be signed in to change notification settings - Fork 233
feat: allow ExprLinter to work on sentences as well as chunks
#2165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: allow ExprLinter to work on sentences as well as chunks
#2165
Conversation
| "A test" | ||
| } | ||
|
|
||
| fn get_unit(&self) -> Unit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the linter could return different values for different invocations, I'm not sure this is the best strategy.
Ideally, and I'm not sure if you could reuse this work for this, we would have two distinct traits, say ChunkExprLinter and SentenceExprLinter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did at first think I'd split it into two linters but I also thought making the change smaller and keeping the logic in one place (DRY) might help, such as the _with_context enhancement - which should combine with this smoothly.
What are your thoughts on parameterizing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do it this way, we'd have to refactor a significant portion of the lint_group's caching system to account for sentences. My main concern is that the get_unit function can return different values on each invocation.
I agree with your sentiment around keeping things DRY. Maybe using an associated type would be a happy medium?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
associated type
I'll be investigating that today then!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do it this way, we'd have to refactor a significant portion of the
lint_group's caching system to account for sentences. My main concern is that theget_unitfunction can return different values on each invocation.I agree with your sentiment around keeping things DRY. Maybe using an associated type would be a happy medium?
I spent a few hours on this and feel like there's some advanced Rust stuff I've seen in videos but that is not yet within my reach that might help here.
The point is that each linter will only work on chunks or sentences and never both. Doing so would be an error but the way I coded it up in the PR has no way to enforce that.
I wanted to not have to change every ExprLinter. Maybe some way of making ExprLinter and ExprSentenceLinter as newtypes of a parameterized "abstract class" for want of a better term, that could somehow encapsulate iterate_chunks vs iterate_sentences into some kind of type or trait, but I'm not at a skill/knowledge level yet in Rust where I can even discern if that's possible, let alone implement it.
Any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I should've been able to find a way to implement it sentence iteration without having to change every ExprLinter but I gave up and went with a way that modifies them all.
Is it going to avoid the problems with the caching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like your solution solves the caching problem. I think it's good that we have to declare it in every linter, since that makes it easy to tell which is which at a glance.
…pr-linter-by-sentence
…trail/harper into expr-linter-by-sentence
elijah-potter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff. This PR is almost perfect. I like the test cases, the implementation, with just a few nits about naming. Once this is merged, I'd love to see a follow-up with an actual sentence-based ExprLinter so we can have one running in production as soon as possible.
| /// | ||
| /// This function is not significantly different from [`Self::add`], but allows us to take | ||
| /// advantage of some properties of [`ExprLinter`]s for cache optimization. | ||
| pub fn add_expr_linter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is now restricted to only Chunk-based ExprLinters, I think the naming of this function and the associated expr_linter field should assume more appropriate names.
Something else to consider: should we have a specialized cache for sentence linters as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think I did rename that to add_chunk_expr_linter in a previous attempt. Smaller diffs make PR reviewing easier though so changing it now is probably good.
As for the caching that's a part of the code I haven't looked at at all. I don't know if the associated types propagate the same way that parameterized types do but I think so right? And I don't know if the caching code is inherently affected by that type?
| "A test" | ||
| } | ||
|
|
||
| fn get_unit(&self) -> Unit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like your solution solves the caching problem. I think it's good that we have to declare it in every linter, since that makes it easy to tell which is which at a glance.
The quintessential linters that would be affected by this are surely the Oxford comma linters, which are now just Linter's rather than ExprLinter's but fail if a list follows a clause? that ends with a comma. But that's probably a big job. I had some other problem in mind when I first thought of sentence iterators but I forget what that was... |
Issues
No issue but this is the result of my thoughts on #2164
Description
Currently
ExprLinterworks on "chunks", where a chunk can be thought of as a clause, or a section of text between commas.This means if you want to work on whole sentences, you either have to instead use
Linterand not be able to use patterns/expressions, or when a sentence contains a comma, you'll only see parts of a sentence at once.In the case of the potential missing-question-mark linter, that could result in an
ExprLinterthinking it's looking at the beginning or ending of a sentence (an interrogative or ending with something other than a?)This PR adds a way to optionally choose to run an
ExprLinterover sentences instead of chunks. The default is still chunks and no existing code needs to be changed to retain the default behaviour.How Has This Been Tested?
I added unit tests to use the same toy
ExprLinteron either chunks or sentences with a pattern containing a comma that fails for chunks and succeeds for sentences.Please go ahead and critique my implementation!
Checklist