-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coding style guide #95
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jlamypoirier, the guide is good, but I find the tone a bit rigid and authoritarian. Please make this feel more collaborative and pair more prescriptive rules with explanations of why they matter. The occasional "please" would help as well.
@@ -2,6 +2,94 @@ | |||
title: Style Guide |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you run markdownlint's auto-formatter over this file, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I installed the markdownlint extension in vscode and am since then able to run auto formatting of md files with it. I googled a bit, and it seem that running markdownlint's auto-formatter outside of vscode it's necessary to install a javascript tool:
npm install -g markdownlint-cli2 markdownlint-cli2-config-prettier
Then you can run
markdownlint-cli2 "docs/**/*.md" --fix
to fix everything in a folder. It uses the .markdownlint.yaml
that we've got in the docs
folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to make things simpler? I would strongly prefer something handled with pre-commit. Otherwise it's not enforceable.
If we keep this it needs to be documented (this style guide?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you've just got to install this npm package in your environment, which requires nodejs. this can be done in CI/docker as well.
there is no python version of markdownlint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use <!-- markdownlint-disable-next-line -->
or the pair of <!-- markdownlint-disable -->
and <!-- markdownlint-enable -->
to make exemptions to linting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm if it was just about this comment I would be fine with it, but I tried running the command and in addition to making a whole lot of changes (had another look, seems like these are mostly outside the docs directory) it returned a whole lot of errors. I assumed this was because of wrong config, but maybe that's not the case. I'll fix the list indents, but we can't really add the pre-commit or enforce markdownlint until this is fixed.
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: docs/recipes/continue-training-llama-8b.md docs/developers/contributing.md docs/about-us.md README.md
Linting: 4 file(s)
Summary: 6 error(s)
README.md:1:1 MD033/no-inline-html Inline HTML [Element: div]
README.md:1 MD041/first-line-heading/first-line-h1 First line in a file should be a top-level heading [Context: "<div align="center" style="mar..."]
README.md:3:1 MD033/no-inline-html Inline HTML [Element: img]
README.md:9 MD036/no-emphasis-as-heading Emphasis used instead of a heading [Context: "Accelerating your LLM training..."]
README.md:110 MD024/no-duplicate-heading Multiple headings with the same content [Context: "Prerequisites"]
README.md:116 MD024/no-duplicate-heading Multiple headings with the same content [Context: "Steps"]
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: docs/quick-start.md .github/PULL_REQUEST_TEMPLATE.md .github/ISSUE_TEMPLATE/bug_report.md CODE_OF_CONDUCT.md
Linting: 4 file(s)
Summary: 37 error(s)
.github/ISSUE_TEMPLATE/bug_report.md:10 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "🐞 Describe the Bug"]
.github/ISSUE_TEMPLATE/bug_report.md:14 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "🔄 Steps to Reproduce"]
.github/ISSUE_TEMPLATE/bug_report.md:37 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "🎯 Expected Behavior"]
.github/ISSUE_TEMPLATE/bug_report.md:41 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "📜 Environment Information"]
.github/ISSUE_TEMPLATE/bug_report.md:108 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "📝 Additional Context"]
docs/quick-start.md:31 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:46 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:106 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:110 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:184 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:188 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:196 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:205 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:231 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:252 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:276 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:284 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:292 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:333 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:471 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:475 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:481 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:535 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:615 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:619 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:630 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:641 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:682 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:834 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:840 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:846 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:854 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:879 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:902 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:933 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:947 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/quick-start.md:963:1 MD053/link-image-reference-definitions Link and image reference definitions should be needed [Unused link or image reference definition: "^peak-tflops"] [Context: "[^peak-tflops]:"]
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: docs/help.md docs/index.md .github/ISSUE_TEMPLATE/feature_request.md docs/developers/style-guide.md
Linting: 4 file(s)
Summary: 7 error(s)
.github/ISSUE_TEMPLATE/feature_request.md:10 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "🧐 Problem Description"]
.github/ISSUE_TEMPLATE/feature_request.md:15 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "💡 Proposed Solution"]
.github/ISSUE_TEMPLATE/feature_request.md:20 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "🔄 Alternatives Considered"]
.github/ISSUE_TEMPLATE/feature_request.md:25 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "📈 Potential Benefits"]
.github/ISSUE_TEMPLATE/feature_request.md:30 MD025/single-title/single-h1 Multiple top-level headings in the same document [Context: "📝 Additional Context"]
docs/developers/style-guide.md:28 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/developers/style-guide.md:66 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: docs/recipes/upcycle-llama-3b-to-moe.md fast_llm/models/custom/readme.md docs/join-us.md CONTRIBUTING.md
Linting: 4 file(s)
Summary: 0 error(s)
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: docs/README.md docs/success-stories/starcoder-2.md docs/reference/configuration.md docs/developers/dev-practices.md
Linting: 4 file(s)
Summary: 0 error(s)
markdownlint-cli2 v0.16.0 (markdownlint v0.36.1)
Finding: SECURITY.md docs/recipes/train-llama-8b.md docs/license.md docs/recipes/data-preparation.md
Linting: 4 file(s)
Summary: 5 error(s)
docs/recipes/data-preparation.md:44 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/recipes/data-preparation.md:99 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/recipes/data-preparation.md:107 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/recipes/data-preparation.md:117 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
docs/recipes/data-preparation.md:156 MD046/code-block-style Code block style [Expected: fenced; Actual: indented]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to turn these into a warning and do the commit anyway in case of failure? I guess errors like this need to be addressed for PRs, but I don't want this to block every single commit...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure, this would be great though!
docs/developers/style-guide.md
Outdated
|
||
* Mark private and protected variables with an underscore `_` prefix. | ||
As is customary in python, we make no distinction between the two and avoid the double-underscore `__` notation. | ||
* Keep public interfaces (methods and variables without underscore prefix) as lean as possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would help if it was explained/exemplified what should remain public vs. private
docs/developers/style-guide.md
Outdated
and type validation for configurations: | ||
|
||
* Always use type hints for the public interface of a classes and modules. | ||
Type hints for method outputs may be omitted if they can be easily inferred. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be more precise here? What's "easily inferred" depends on the person who writes/reads the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really know what to put there. So far I've mostly ignored output type hints except for abstract classes and when pycharm gives me trouble, but I think we need a better rule
docs/developers/style-guide.md
Outdated
|
||
We use the following conventions for imports (other than those enforced by isort): | ||
|
||
* Import standard library and third party modules by module (ex. `import package.module`, not `from package.module import method`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why prefer import package.module
universally? Fast-LLM is the only library I know that does that. I think this needs to be well motivated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a personal preference, but Fast-LLM doesn't use that many external libraries and nearly all of them keep module identifiers short so it's not a big difference. Main benefits are:
- Consistency: we're not doing things different for every file depending on who wrote it.
- We know what package an identifier comes from with just a quick look at the code, no digging at all.
- It's especially useful for commonly used names that could come from multiple packages, or that are also present in Fast-LLM (one example: Fast-LLM's config has many names in common with dataclasses, omegaconf and pydantic)
- We could allow exceptions, but this would be subjective and hard to manage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spelling it out. Could you add points 1-3 to the doc, please?
docs/developers/style-guide.md
Outdated
We use the following conventions for imports (other than those enforced by isort): | ||
|
||
* Import standard library and third party modules by module (ex. `import package.module`, not `from package.module import method`). | ||
This keeps identifier's origin explicit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the expense of very long identifiers in the codebase, which could be alleviated with as
imports, but that is not allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hasn't really been a problem so far. Almost every identifier shows as package.identifier
, so isn't really long.
docs/developers/style-guide.md
Outdated
* Import standard library and third party modules by module (ex. `import package.module`, not `from package.module import method`). | ||
This keeps identifier's origin explicit. | ||
* Avoid renaming with `as`, except for some (arbitrarily chosen) common ones: `numpy as np`, `triton.language as tl`. | ||
* Import first-party modules through specific identifiers (ex. `from fast_llm.module import method`, not `import fast_llm.module`). This facilitates the tracking of where those objects are used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pycharm isn't clever enough to figure out symbol usage otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pycharm and other IDEs are, but not humans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you're saying that you use full-text search over the codebase for the method name to figure out where they are referenced? I don't quite follow this otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is we can see what exactly is used in a file just by looking at the top of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for Fast-LLM names the identifiers would get really long, so importing by name is needed either way.
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
I agree, but not really sure how to improve this. Do you have concrete suggestions? |
docs/developers/style-guide.md
Outdated
``` | ||
|
||
!!! note "More on automated formatting" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<!-- markdownlint-disable-next-line --> |
or just remove the empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we run auto-format on the svg? I'm seeing a change without change :D
Sure, for example we can replace rather directive language with inclusive phrasing:
Add occasional use of “please” where it softens prescriptive rules:
Furthermore, we can add some context for the rules and offer a rationale that shows they're grounded in common goals:
Then we can frame rules as recommendations rather than imperatives:
We can make things sound more collaborative by using language that gives a sense of shared ownership:
Lastly, we could use mkdocs' "admonitions" to add "why this matters" boxes:
|
I tried pre-commit on the whole repo to try out the markdownlint one. Apparently the svg was missing a newline at the end. |
Thanks for the suggestions. I'll try implementing most of them (or equivalent), but some need more discussion.
I'd like to keep the list explicit for consistency. For example,
This makes things ambiguous though, it implies we don't enforce the rule when we should. Recommendations should be for things that are actually flexible.
Good idea, this will also keep the bullet points concise. |
.github/ISSUE_TEMPLATE/bug_report.md
Outdated
- Recent changes to the configuration or code. | ||
- Whether the issue occurs consistently or intermittently. | ||
- Any troubleshooting steps you have already tried. | ||
- Recent changes to the configuration or code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why pre-commit did that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the linter shouldn't even run here. Let's just apply it to docs/**/*.md
in this iteration, please.
I see your point about avoiding ambiguity, but I think framing everything as a strict rule is unnecessary and risks creating resistance rather than collaboration. People generally want to follow guidelines, especially when they're clearly written and explained. Recommendations, when paired with context and rationale, are just as effective because they trust contributors to use their judgment. Strict enforcement and rigid rules aren't the only way to ensure adherence. Creating a guide that feels approachable and collaborative will encourage people to follow it without feeling policed. Code reviews can gently remind contributors to stick to recommendations where appropriate. Let's trust in people's willingness to help and follow guidelines when they are clear and reasonable. |
.markdownlint.yaml
Outdated
@@ -20,13 +20,23 @@ MD010: | |||
# MD013/line-length : Line length : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md013.md | |||
MD013: false | |||
|
|||
# MD024/no-duplicate-heading Multiple headings with the same content (disabled because we do it). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot use the same .markdownlint.yaml
for both the /docs
and /
, because the indentation works differently for mkdocs and GitHub. Let's not make this change at this point and optimize linting first for /docs
and make sure that all bullet points and enumerations are rendered as intended. This is currently not the case.
README.md
Outdated
- 💻 Transparently developed on GitHub with public [roadmap][roadmap] and [issue tracking][issues]. | ||
- 🤝 Contributions and collaboration are always welcome! | ||
1. 🚀 **Fast-LLM is Blazingly Fast**: | ||
- ⚡️ Optimized kernel efficiency and reduced overheads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this is not even wrong :D
Co-authored-by: Torsten Scholak <torsten.scholak@googlemail.com>
@tscholak I tried restricting the hook and reverting some changes, it still runs on all committed files.
|
.pre-commit-config.yaml
Outdated
hooks: | ||
- id: markdownlint-cli2 | ||
name: markdownlint | ||
files: "docs/*.md" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's probably wrong, but it complained that the other one wasn't a valid regex, and now this one gives a warning that it's a regex and not a glob 🤷
.pre-commit-config.yaml
Outdated
hooks: | ||
- id: markdownlint-cli2 | ||
name: markdownlint | ||
files: "docs/*.md" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this doesn't work. I'm getting this warning and also no files are being found:
$ pre-commit run --all-files
[WARNING] The 'files' field in hook 'markdownlint-cli2' is a regex, not a glob -- matching '/*' probably isn't what you want here
...
markdownlint.........................................(no files to check)Skipped
Try this, it works for me:
files: "docs/*.md" | |
files: ^docs/.*\.md$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to also work with just docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jlamypoirier!
✨ Description
An attempt to write a coding style guide.
Also set the python version to 3.12 at missing places.
Feel free to suggest modifications and/or improve wording.
🔍 Type of change
Select all that apply: