-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add codespell pre-commit hook #2855
Conversation
Size Change: 0 B Total Size: 877 kB ℹ️ View Unchanged
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I have a number of suggestions so I think blocking merge to have these discussions, is justified. Also, I'm quite shocked at how many spelling mistakes we've committed!
catalog/tests/dags/providers/provider_api_scripts/test_wikimedia_commons.py
Outdated
Show resolved
Hide resolved
...ety/detecting_sensitive_textual_content/20230309-implementation_plan_sensitive_terms_list.md
Show resolved
Hide resolved
catalog/tests/dags/common/test_resources/image_table_sample.tsv
Outdated
Show resolved
Hide resolved
Thank you for the careful review, Dhruv. I'll make updates to this tomorrow! |
58f169d
to
c59cdba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice clean up, both in terms of combining the configuration into a single toml
file, and fixing all of the spelling errors. Thank you for detailed documentation, too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like all my suggestions have been addressed, thank you! LGTM, LGTM!
* Add minimal codespell config * Flesh out codespell configuration and consolidate python tool configs * Correct codespell errors * Undo bad spelling switch and add ignore for the line * Move codespell docs to docs site * Remove unnecessary custom dictionary entries * Undo errant corrections
Fixes
Fixes #378 by @dhruvkb
Description
This PR does the following:
Codespell seems fine enough and works pretty well.
However, I would like to also explore using cspell. It has far greater flexibility in terms of the supported dictionaries and its configuration is significantly less obscure than Codespells. I'm going to leave this PR as a draft until I've tried cspell. If I decide to go with cspell, I'll open a new PR and keep the configuration consolidation I did for other Python-based tools in this PR. That seems useful regardless of which spell checker we choose.I tried out cspell and while it probably works great if you have a small project with little prose, it complains about every unknown word. At a glance I could not find anything in cspell's documentation about how to ignore unknown words, I suspect because its strategy for matching words is very different. Codespell is going to be much easier to maintain in the long run. It isn't perfect and will probably miss some things that cspell would catch, but it isn't worth the tremendous task of configuration and maintenance to use cspell. Keep in mind that "unknown words" includes most Vue related terms, all maintainer GitHub usernames (referenced frequently in project documentation), and so on and so on. In any case, Codespell is a good start. If we want to switch tools in the future we can do so. For now this gets us at least catching basic misspellings.Testing Instructions
Checkout the branch and run
just lint codespell
. Confirm there are no errors. Add a misspelling to a relevant file (one that isn't ignored) and confirm that codespell catches it when runningjust lint codespell
. Also, review the list of ignored files and confirm that it's comprehensive and does not accidentally include files that should be checked.Look through the spelling changes and confirm that there are no problems with the fixes I've applied. There is a mix of automatically applied fixes that codespell generated as well as fixes I had to do manually (for when there were multiple suggestions for a misspelled word).
Finally, read through the configuration documentation and confirm that it makes sense, is clear, and can be used effectively to maintain the configuration moving forward.
Checklist
Update index.md
).main
) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin