Variable naming
Picking "good" variable names is perhaps the most difficult part of any self-disciplined programming style. It is even more challenging in Spartan programming, which prefers short, even overly terse, names to longer ones. The challenge is in being laconic without being cryptic. This article suggests a few guidelines that may help in this endeavor.
(No futile attempt is made here to meet the "challenge" of presenting naming guidelines in a non-controversial manner. To quote Wikipedia on this topic: "The choice of naming conventions can be an enormously controversial issue, with partisans of each holding theirs to be the best and others to be inferior. Colloquially, this is said to be a matter of dogma...")
Pick a consistent naming convention, and stick to it. Uniformity is among the few indisputable readability aids.
Consistency and uniformity in naming, combined with well-thought-out principles, make it possible to use short names more effectively. (For example, the two-letter abbreviations for the 50 states of the United States, such as FL for Florida and NY for New York, are deeply entrenched in Western culture.)
Spartan programming offers only a few such principles:
- Generic names policy
- Short return value name
- Short anonymous name
Names must fully describe the named entity, identifying it among all other entities in the same context. No mental effort should be involved in determining the relationship between the name and the named entity.
If you cannot think of a good descriptive name for a variable, do not make up some random name. Rethink. Your design may be wrong.
The name of a variable should be made short, as short as possible, but not shorter than that. Pick the shortest name that accurately describes the named entity within its context.
One-word names are preferable to multi-word names. Likewise, one-letter names are better than two-letter names and one-word names.
Brevity is not an end in itself. If the appropriate name of a variable consists of several words instead of one, do not fall back on an ad-hoc acronym, the first word only, or any other abbreviation. Treat this as a warning instead: ask yourself whether too much functionality is loaded into the variable, and whether the code could be rewritten in such a way that a shorter name makes more sense. In the rare cases where this cannot be done, use the appropriate long name, obtained by concatenating the appropriate words.
Similarly, the very terse names offered by the generic naming technique cannot be used with all variables. A terse name makes sense only if a variable is a generic member of its type. With suitable modular decomposition, a large fraction of variables, but not all of them, can be terse.
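The distinction can be sketched in Java. This is only an illustration under assumptions: the class `Terse`, its methods, and the particular short names `ss` and `s` are hypothetical choices made for this example, not names prescribed by this article.

```java
import java.util.List;

// Hypothetical example class; all names are illustrative assumptions.
final class Terse {
  // `ss` is just "some strings" and `s` is "some string": each is a generic
  // member of its type, so a terse name loses nothing in this small method.
  static String longest(List<String> ss) {
    String longest = "";
    for (String s : ss)
      if (s.length() > longest.length())
        longest = s;
    return longest;
  }

  // Both parameters are ints, but each plays a distinct role, so neither is
  // a generic member of its type. Terse generic names would not tell them
  // apart, and longer names are warranted.
  static int remaining(int balance, int withdrawal) {
    return balance - withdrawal;
  }
}
```

In `longest`, renaming `ss` to something like `listOfInputStrings` would add length without adding information; in `remaining`, shortening the parameters to two generic integer names would hide which value is subtracted from which.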
Spelling out names in full is easier than ever with modern development environments and their auto-completion and quick renaming features. Abbreviations can be misleading, and they may rely on the reader's cultural background in surprising ways.
It is usually OK to use abbreviations that have become part of the everyday language of programming, including such names as:
- i18n - internationalization
- L10n - localization
- buff - buffer
- msg - message
- len - length
- tmp - temporary variable
But even these may be arguable. Consider, for example, msg. The term is pretty obvious to an individual whose native tongue is Hebrew, Arabic, or any other language that is usually written without vowels. It may also be familiar to programmers who have used BITNET. Others, who have become used to reading the same three-letter combination as shorthand for "monosodium glutamate", "Master Sergeant", or "Madison Square Garden", may be (slightly) confused.
Likewise, buff may be confusing to programmers of computer games, where it is a generic term for beneficial effects (often spells).
The preaching in favor of long, verbose and "meaningful" identifiers is based on the presumption that identifiers may be accessed from many locations in the code, and that the full meaning of an identifier can only be obtained by the long and tiresome chore of examining all these locations. This preaching is also motivated by the belief that good, short names are scarce.
The secret of simultaneously satisfying both the brevity and the descriptiveness objectives is in making sure that variable names are relative to a small context. Remember that all names are relative to a context of interpretation, and large non-specific contexts require long, verbose names.
If the context is a list of United States presidents, then both "Bush" and "George Bush" are too short. Use "George W. Bush" or "George H. W. Bush" depending on who you mean.
If the context is a discussion of the American Revolutionary War, then the name "Washington" is short enough. The same name is insufficient, however, if the context is geography, where it could refer to the state, to the district, to any one of the 40 or so cities worldwide so named, to the 31 or so United States counties, to the numerous townships in that country, to the islands, to the lake, etc.
Accurate descriptive names within this context can be very long. Verbosity is unavoidable if uniformity is to be maintained. In the geography of the United States example, a uniform naming convention will probably include:
- The country (United States, England, etc.)
- The state or province in which the location is found.
- The name of the geographical location (Miami, Washington, etc.)
- The kind of the location (lake, town, district, etc.)
For example, the name WashingtonCityMaineUnitedStates describes a city named "Washington" in the state of Maine in the United States.
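The effect of context on name length can be sketched in Java. The class `Place`, its fields, and the method `uniformName` are hypothetical names assumed for this illustration. Inside the small context of the class, the short names `name`, `kind`, `state`, and `country` are fully descriptive; only a name that must stand alone in the wide "geography" context needs all four parts.

```java
// Hypothetical sketch; the class, field, and method names are assumptions.
final class Place {
  // Within this class, each short field name is unambiguous.
  final String name, kind, state, country;

  Place(String name, String kind, String state, String country) {
    this.name = name;
    this.kind = kind;
    this.state = state;
    this.country = country;
  }

  // Builds the long, uniform name needed when the context is all of geography.
  // Spaces are dropped so that "United States" becomes "UnitedStates".
  String uniformName() {
    return (name + kind + state + country).replace(" ", "");
  }
}
```

With these assumptions, `new Place("Washington", "City", "Maine", "United States").uniformName()` yields `WashingtonCityMaineUnitedStates`.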
All identifiers must be visually, semantically, and phonetically distinguishable from all other identifiers in the same scope.
Only use an acronym if you are sure that it is part of the reader's general vocabulary, or that it is established in the application domain. Do not invent your own acronyms.
There is no need to expand acronyms such as ASCII, CPU, HTTP, URL, ISDN, FTP, etc., which have become so established that many individuals would not be able to expand them. Examples of domain-specific acronyms include DFS, which is instantly recognizable to anyone dealing with graph algorithms; DBC, which anyone in need of a Design by Contract library should know; and LAN in communication.
Unpronounceable names place an unnecessary burden on the reader. Avoid these.
For the same reason, do not use names which could be mispronounced, nor (in the absence of a compelling reason) names that include digits.
Variable names should not encode their type (integer, real, boolean, etc.), nor their kind (local/parameter/global, constant/changeable, etc.) for several reasons:
- variables should be named in a sufficiently small context in which this information is obvious,
- inferring type and kind information is easy with modern development environments, and,
- encoding this information violates the Déjà vu principle; the extra work in maintaining this information and its encoding will not be welcomed by a maintainer.
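As a rough Java illustration of the difference, the two methods below do the same thing. The class `Encoded` and all identifiers in it are hypothetical, and the prefix scheme in the first method is just one assumed example of type and kind encoding.

```java
// Hypothetical example; all identifiers are illustrative assumptions.
final class Encoded {
  // Discouraged: the prefixes repeat what the declarations already say
  // ("p" for parameter, "str" for String, "ch" for char, "i" for int).
  static int countDiscouraged(String pStrText, char pChTarget) {
    int iCount = 0;
    for (int iIndex = 0; iIndex < pStrText.length(); ++iIndex)
      if (pStrText.charAt(iIndex) == pChTarget)
        ++iCount;
    return iCount;
  }

  // Preferred: in a method this small, type and kind are visible at a glance,
  // and changing a type later does not force any renaming.
  static int count(String text, char target) {
    int n = 0;
    for (int i = 0; i < text.length(); ++i)
      if (text.charAt(i) == target)
        ++n;
    return n;
  }
}
```

If `target` later had to become a `String`, the second version would need only its declaration and comparison changed, while the first would also carry a stale `pCh` prefix until someone renamed it.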
For similar reasons, it is not a good idea to use names such as ISomething to encode the fact that a type is defined as a Java interface rather than a class, an enum or an annotation; the decision to implement a type as (say) a class rather than an interface is likely to change.
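A minimal Java sketch of this point follows; the types `Shape` and `Square` are hypothetical and exist only for this illustration.

```java
// Hypothetical types, for illustration only.
// The name says what the type represents, not how it happens to be declared.
interface Shape {
  double area();
}

final class Square implements Shape {
  private final double side;

  Square(double side) {
    this.side = side;
  }

  @Override public double area() {
    return side * side;
  }
}
```

If `Shape` were later turned into an abstract class, its name would remain accurate and code that merely declares variables of type `Shape` would not need renaming. Had it been called `IShape`, the name would either become misleading or have to be changed everywhere it is used.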
The conventions of using ALL-CAPITALS for named constants, Capitalized names for types, and lowercase names for fields, methods, and variables are acceptable precisely because the categorization of an entity into one of these kinds is usually robust.
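The three conventions can be seen side by side in a short Java sketch. The class `Counter` and its members are hypothetical names chosen for this example.

```java
// Hypothetical example of the three capitalization conventions.
final class Counter {              // Capitalized: a type
  static final int LIMIT = 3;      // ALL CAPITALS: a named constant
  private int count = 0;           // lowercase: a field

  // lowercase: a method; increments until LIMIT is reached
  boolean increment() {
    if (count >= LIMIT)
      return false;
    ++count;
    return true;
  }

  int count() {
    return count;
  }
}
```

A constant rarely turns into a type, and a type rarely turns into a field, so these markers seldom need to be revised once chosen.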