William Pursell wrp

general best practices:

Have a single source of truth.
Avoid usage spew. ¹
A stack trace is not an error message; it is an embarrassment.
Write errors to stderr. Write output to stdout. Write logs somewhere else.
Instrument extensively, but don't use logs as metrics.
Version everything, but do not hardcode version numbers in a file; derive them from a VCS.
Never deploy a mutable ref (eg "latest"); always use a fixed version.
Design for the interface, not the implementation.
Use descriptive names for projects. Call your auth service "auth-service", not "jezebel".
Don't use the implementation as a basis for the name of a service. ²
Always include timezone in a timestamp representation. ISO-8601/rfc3339 ³
Never use relative timestamps (eg, "6 hours ago") ⁴
Keep your code narrow. ⁵
Make everything local. Web-based docs are useless on an airplane. Put your docs in git.
Configuration files are not documentation. ⁶
Comments are not documentation. ⁷
Test functionality, not platform. ⁸
Use roles extensively. ⁹
Plan for failure, automate, test extensively.

pithy slogans:

Explicit is better than implicit.
Fail loudly (but tersely), succeed quietly. ¹⁰
Simple, elegant, robust. Choose 3.

vcs best practices:

Vigorously enforce commit message formats from the first commit.
Keep all changes small and obvious. History matters. Churn matters. This impacts style. ¹¹
Use the correct deictic center on commit messages.

git specific best practices:

Put meta data for commits in a git trailer. ¹²
Put only project specific items in .gitignore. ¹³

General thoughts:

Do not bend your workflow to satsify your tools; fix your tools to fit your workflow. For example, if someone tells you "Even though the syntax does not require it, we leave double quotes here because some IDEs flags it as a syntax error", tell them to stop using broken IDEs.

notes:

Usage spew is the act of printing a "wall of text" in response to a simple error. The wall of text generally obscures the error message. A concrete example of this is the current behavior of 'git diff' when executed in a directory that is not a git repo. The user does not need to see 129 lines of text! The user only needs to see the first line ("warning: Not a git repository"), and the other 128 lines are just an irritating distraction. ↩
Humans should not care about the underlying implementation. If you name your message queue "kafka", then it will be extremely difficult to migrate to KubeMQ. This is the same rationale for avoiding .sh suffixes on your shell scripts and .py suffixes on your python scripts. If you have a process that frobs by calling frob.py, but you later realize that frobbing is better implemented in some other language, you suddenly have a problem. If you re-implement frob.py, you cannot easily change the name without breaking all existing tooling. The problem is completely avoided if you had originally named the program frob instead of frob.py. ↩
Never display or write "07:18". Always write "07:18:00-06:00" or "13:18:00Z" or "06:18T-7". It is less important how you choose to convey timezone, as long as it is done. And don't use names; use numbers. Quick, how much time is there between "07:18 PST" and "06:48 IST"? Approximately where in the world is EAT? Now, answer those questions for "07:18T-0700" and "06:48T+0530" and "UTC+3" Using timezone names requires external information, but that information is conveyed directly if you use the right format.

Never tell the user "We are using your timezone"; be explicit and tell them what you think their timezone is. When the user has their laptop's local time set to London but their configured address with your service is in New York and they are currently in Los Angeles while using a VPN based out of Singapore, then phrases like "your timezone" are confusing. GMT-7 is non-ambiguous (or, at least it is much less ambiguous), and your user's are smart enough to know what it means. If they are not, then you should educate them. ↩
Strings like "6 hours ago" have no meaning when they've been cut-n-pasted or are viewed in a screen shot from 5 days ago, or many other unforseen contexts. Unambiguous ISO-8601/rfc3339 conforming strings are better. (https://en.wikipedia.org/wiki/ISO_8601) (https://www.rfc-editor.org/rfc/rfc3339) ↩
Restricting line widths to a reasonable value (eg, 78) allows more views. There is a big push for allowing 120 or 160, or unlimited columns. There is some validity to the argument that we should not restrict column widths because doing so is essentially bending our workflow to satisfy the tooling. I agree with that. But lately I've been looking at a lot of code on a mobile device in portrait mode or on a large screen with 3 or 4 terminals side by side, where 60 columns is pushing it and having narrow code makes it much easier to read. As a practical concern, perhaps the goal should not be a maximum line width, but rather a maximum non-whitespace width. Eg, you can have code go to column N, but only if the first non-whitespace character on that line and in the few lines above and below is at line N - 80. A human reader should be able to have a view into your code that allows them to see adequate context on a width of 80. Personally, I would push for 60 so I don't need to turn my device sideways and go into landscape mode, but 80 has historical precedence. But excessive indendation is a sign of a lack of decomposition, so just keep your line widths below 80 (allowing for 8 space indentation). As I write this, I have 3 terminals currently open on a width of 30; don't just assume your reader has a lot of real estate to waste on excessive indentation or Java-esque naming schemes. ↩
Comments in a config file should be constrained to comments on your configuration. Many projects ship with highly documented config files (95% of the file is comments!) and it is common to see production config files with all of that commentary left inside. Don't do that. If you need to know what a configuration item does, look in the documentation. If you want to document why your site has that item set to a particular value, it is (maybe) appropriate to put a comment in the config file. A 1000 line config file in which 990 of the lines are commentary is much more difficult to read than a simple 10 line config, and you certainly don't need all that documentation repeated on every production replica. ↩
Names describe "what", code describes "how", comments describe "why". Comments can also be used to call out gotchas, or to summarize. There is a trend towards over-commenting that is disturbing, often driven by policies such as "every function must have a descriptive comment" or "every function parameter must have a comment". At first glance these policies may seem reasonable, but they are detrimental to code health. Accurate (but unnecessary) comments are at best distracting; too often code changes and the comments become both distracting and wrong. There should be one source of truth, and comments must not become a conflicting source. If the code needs comments to be understood, then the code should be refactored with better names. ↩
Never test for the platform or program that you are running to decide if you have access to certain functionality. eg, instead of trying to determine that the current shell is zsh and therefore you can use some zsh specific syntax, just test if the syntax is valid. Similiarly, instead of checking to see if you are running on Darwin and can therefore assume the presence of some particular libary function, test for the existence of the function. The tests are harder to write, but far more robust. ↩
Always identify a role as an owner of a resource rather than an individual. Instead of putting a user id in a metadata file showing that person as an owner of a repository, indicate a team or a role. Always. Individuals will be identified by their authorship of commits in the VCS, and that is really the only place a person should be identified. Using roles also helps keep post-mortems blameless. Infrastructure must exist to enable identifying a person from a given role, and it is worth the effort to build that infra early. ↩
When an error occurs, a process should write a simple, concise, accurate error message to stderr and terminate. Error messages should not offer suggestions and certainly not a usage statement. For example, instead of: "The connection to the server localhost:8080 was refused - did you specify the right host or port?" a better error message would be simply: "localhost:8080: connection refused" The rest is line noise. Perhaps include the name of the executable. Maybe include the uid or human readable name of the process owner. Do not include a timestamp by default (let the caller wrap the command if timestamps are desired, maybe add a flag to enable timestamps). ↩

Changes in code need to be contained. Given the following two diffs, consider how long it takes to understand fully what each does:

Compare this patch:

-        auto foo = this_function_has_a_long_name(a_long_named_arugment,
-                                                 another_long_argument,
-                                                 a_third_argument_with_a_longish_name,
-                                                 yet_another_absurdly_long_name,
-                                                 your_dog_is_named_barry,
-                                                 most_important_argument);
+        auto foo = new_name(a_long_named_arugment,
+                            another_long_argument,
+                            a_third_argument_with_a_longish_name,
+                            yet_another_absurdly_long_name,
+                            your_dog_is_named_harry,
+                            most_important_argument);

To this patch

-       auto foo = this_function_has_a_long_name(
+       auto foo = new_name(
		a_long_named_arugment,
		another_long_argument,
		a_third_argument_with_a_longish_name,
		yet_another_absurdly_long_name,
-               your_dog_is_named_barry,
+               your_dog_is_named_harry,
		most_important_argument

To the casual reader, the first patch obscures the fact that one of the variables being passed to the function has changed. Either of these patches is a terrible thing to commit, as there are two distinct changes happening here. One commit should change the name of the function, a second commit should change the variable that is being passed. In a code review, the first style makes it hard for a human to see that 2 distinct changes are happening. ↩

https://wrp.github.io/blog/2024/01/03/conventional-commits.html ↩
Patterns to protect against user specific names (like files generated by an IDE, or patterns like *~, or OS-specific files like .DSLOCAL) DO NOT belong in the project's .gitignore. Put them in $HOME/config/git/ignore ($XDG_CONFIG_HOME/git/ignore) ↩

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

William Pursell wrp

Achievements

Achievements

Block or report wrp

general best practices:

pithy slogans:

vcs best practices:

git specific best practices:

General thoughts:

notes:

Pinned Loading

Uh oh!

William Pursell wrp

Achievements

Achievements

general best practices:

pithy slogans:

vcs best practices:

git specific best practices:

General thoughts:

notes:

Footnotes

Pinned Loading

Uh oh!