-
Notifications
You must be signed in to change notification settings - Fork 6
Add Vale for content linting to the repo #541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request adds Vale content linting to the repository by introducing Vale configuration, custom style rules, and a new CI job to run Vale. Key changes include:
- Adding a new Vale configuration file (.vale.ini) and custom styles under .github/styles.
- Updating license files and issue templates to improve consistency and adhere to new styling choices.
- Adding a new GitHub Actions workflow job to run Vale and enforce content linting.
Reviewed Changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
tests/data/test_package_generation/LICENSE.md | Updated copyright text formatting. |
tests/data/test_package_generation/.github/ISSUE_TEMPLATE/feature_request.yml | Updated placeholder punctuation using a Unicode ellipsis. |
tests/data/test_package_generation/.github/ISSUE_TEMPLATE/bug_report.yml | Changed ellipsis formatting in code snippets for consistency. |
docs/pages/virtual.md | Modified venv description to remove redundant wording. |
docs/pages/libraries/parallel-async.md | Adjusted punctuation in the dask description. |
docs/pages/docs.md | Simplified language in documentation description. |
docs/pages/ci.md | Revised the Coveralls service description for clarity. |
docs/LICENSE | Updated license formatting. |
README.md | Inserted Vale annotation comments to bypass unwanted style checks. |
LICENSE.md | Updated license formatting. |
.vale.ini | New Vale configuration file added. |
.github/workflows/linting.yml | Added a new CI job to run Vale. |
.github/styles/custom/spaces.yml | New custom style rule enforcing single spacing. |
.github/styles/custom/quotes.yml | New custom style rule enforcing straight quotes. |
.github/styles/custom/oxford-comma.yml | New custom style rule encouraging the use of the Oxford comma. |
.github/styles/custom/no-contractions.yml | New custom style rule to discourage contractions. |
.github/styles/custom/endash.yml | New custom style rule to avoid en dashes and suggest em dashes in parentheses. |
.github/styles/custom/dashes.yml | New custom style rule to remove spaces around dashes. |
.github/styles/custom/british-spellings.yml | New rule set for enforcing British spellings. |
.github/ISSUE_TEMPLATE/feature_request.yml (in .github folder) | Updated placeholder punctuation using a Unicode ellipsis. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I really like the language simplifications. I think we want this 🫐 .
This was just a quick pass before seeing your talk. Curious about uncluttering the custom config. Could we make our own ARC vale
language style erm... theme? (Not sure what's the correct word)
LICENSE.md
Outdated
@@ -1,6 +1,6 @@ | |||
# MIT License | |||
|
|||
Copyright (c) 2023-2025 UCL Centre for Advanced Research Computing | |||
Copyright © 2023-2025 UCL Centre for Advanced Research Computing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. It's from vale
. Probs not this one, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the practice involving licences? Once made can never change? Why do we use the (c)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the GPL specifically says "thou shalt not change this file".
MIT doesn't have any such clause. But to avoid potential problems, I would not reword license files.
(c)
→ ©
is probably innocent enough. I would leave as is, but ¯\_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait I realise you mean why the (c)
at all: because it's not in the MIT template. That's a good point. Did it come from GitHub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know where it came from, it is a bit weird though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know enough about licences, but it does seem weird that I can't reword my own licence. Guess it makes sense for GPL. Like what if you made a typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. Realised we need to ignore the license files in both the repo and the template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having a prose linter but I would say we probably want some discussion of what rules to include (possibly this should be needs-2-reviewers
?). I personally don't think the ellipsis replacement should be included as I would say the marginal gain in readability of the Unicode character in cases where it renders nicely (when not using a fixed-width font) are outweighed by the cases where it decreases readability (when showing in a fixed-width font for example) or might change the meaning of Python code snippets embedded in non-Python files. I also think the -ize/-ization/-ized/-izable to -ise/-isation/-ised/-isable replacements as currently implemented might end up with spurious matches, and I'd personally vote for not standardizing (😛) on -ise endings to begin with (but agree with at least having a consistent convention). I also find the removal of all very instances a bit odd.
@@ -21,7 +21,7 @@ body: | |||
value: |- | |||
import cookiecutter_test | |||
|
|||
... | |||
… |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say we want to disable the ellipsis replacement because of places like this - this is Python code in a YAML file and I suspect we'll have other cases like this where we have for example Python code in a Markdown file that we don't want it touch. While it doesn't really matter here as this is meant more to indicate user to fill in rest of reproducing code, while ...
is valid Python syntax …
raises a syntax error ('Invalid character').
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a fair enough reason. @K-Meech raised the point of screen readers for prose though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file shouldn't have changed as not a Markdown file. This is me copy-and-pasting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paddyroddy - I'm not sure what you mean by 'screen readers for prose' here? As far as I'm aware screen readers will handle both ellipsis styles similarly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I thought that was something said the other day. I have no idea what they do with ellipsis.
{{cookiecutter.project_slug}}/.github/ISSUE_TEMPLATE/bug_report.yml
Outdated
Show resolved
Hide resolved
🗳️ This guy wants: readability and accessibility suggestions inline in my PRs that have a lot of prose. I'm actually mostly interested in the accessibility of what I write. ... Should we run the I'm not super bothered by consistency of contractions or em-dashes (though I understand the "-ize" sentiment, I would support consistent British spelling and fewer hedging and weasel words). |
This was very much for demo purposes. Obviously, we'd need to decide on what rules we want. |
@samcunliffe @matt-graham I have reverted the changes, excluded licence files, made CI not error. To see the output you need to click on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR integrates Vale for linting markdown content by adding a configuration file and CI workflow.
- Introduces
.vale.ini
to define Vale styles, packages, and rule overrides. - Adds a new
lint-prose
GitHub Actions job to run Vale checks on the repository.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
.vale.ini |
Defines Vale settings, style paths, enabled packages, and rule overrides for .md files. |
.github/workflows/linting.yml |
Adds a lint-prose job using the errata-ai/vale-action to run Vale in CI. |
.vale.ini
Outdated
write-good.TooWordy = NO | ||
|
||
# Ignore lines starting with "> [!NOTE]" | ||
BlockIgnores = (?s)> \[!NOTE\].*?(\n|$) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This replaces the inline comment before @matt-graham
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eeeeeh what about TIP
IMPORTANT
WARNING
CAUTION
? Do they get flagged.
I could imagine > [!TIP]
is quite useful.
BlockIgnores = (?s)> \[!NOTE\].*?(\n|$) | |
BlockIgnores = (?s)> \[!(CAUTION|IMPORTANT|NOTE|TIP|WARNING)\].*?(\n|$) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah they would be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so it was the > ![NOTE]
alert syntax specifically that was triggering not the content of the alert? I guess we could also include the other alert types here (!TIP
, !IMPORTANT
, !WARNING
, !CAUTION
). I don't entirely understand the regex here - it looks like Value uses regexp2
which is based on Go regexp
, but after reading the docs I'm still entirely sure what (?s)
bit is doing here (appears to be setting a flag to let .
match \n
but not sure why this is needed and whether it applies just to the group?). I think something like
^ *> \[!(NOTE|IMPORTANT|WARNING|CAUTION|TIP)\] *$
would avoid unintentionally capturing other characters with .
while still allowing for white space before or after alert syntax (could also use \w
in place of
but not sure we actually want to allow for other white space types).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just spotted that you've already covered all altert types in last commit 73fd5c1 - I still don't entirely understand the regex but it works so marking as resolved! (EDIT: I hadn't refreshed before commenting so hadn't seem Sam's comment either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I assume you meant \s
in place of
? Hadn't occurred to me that \s
was any type of whitespace. This is just in this flavour of regex? I've always taken it to mean strictly
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah bad news, yours doesn't work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the example linked to in the docs there it looks like it's excluding the whole block {< file }
delimited block. Are we trying to similarly ignore the context of callouts or just the syntax which starts an alert? From https://regex101.com/r/rvRsMy/1 it looks like we just match the initial bit not the content. If so I think the regex I suggested will do so as well but is tighter / less chance for false matches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went with (?s)> \[!(CAUTION|IMPORTANT|NOTE|TIP|WARNING)\](\n|$)
and tested on regex101
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we want to just ignore the top line itself, not the contents within it. Your regex doesn't work, feel free to try it yourself locally (run vale sync; vale .
)
Co-authored-by: Sam Cunliffe <samcunliffe@users.noreply.github.com>
Co-authored-by: Matt Graham <matthew.m.graham@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than potentially excluding code of conduct looks good to me, thanks @paddyroddy!
In this PR I add vale similarly to paddyroddy/talks#80 paddyroddy/talks#82. See this blog post for an explanation of what it is https://www.datadoghq.com/blog/engineering/how-we-use-vale-to-improve-our-documentation-editing-process. I used this extensively during my PhD https://github.com/paddyroddy/phd-thesis/blob/main/.vale.ini. I'm going to talk about this in the Collaboration Hour, so this PR is in prep for that.
The talk: https://paddyroddy.github.io/talks/linting-prose-with-vale.