We all know that naming is one of the hardest problems in programming, and probably most of us have written code like this when we just started programming:
I wrote this code more than 20 years ago in Delphi, and, honestly, I don’t really remember what the app was supposed to do. It has it all: single-character names (`i`, `j`), abbreviations (`...Cnt`, `buf`), acronyms (`E`, `sr`, `fp`). It has some comments though! (And I kept the original indentation for full immersion.)
I once worked with a very senior developer who used mostly very short names, and never wrote any comments or tests. Working with their code was like working with
Let’s look at these (and many other) naming antipatterns, and how to fix them.
Consider this method:
I can say a lot about this code but let’s focus on this line first:
The double negation, “if not no errors found…”, makes my brain itch, and I almost want to take a red marker and start crossing out `!`s and `no`s on my screen to be able to read the code.
In most cases we can significantly improve code readability by converting negative booleans to positive ones:
Positive names and positive conditions are usually easier to read than negative ones.
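As a minimal sketch of this conversion (the `validate` function and its rules are hypothetical, not the original example), the same check can be written with a positive flag:

```javascript
// Hypothetical validator; the field names and messages are made up.
function validate(values) {
  const errorMessages = [];
  // Positive flag: reads as "errors found", no mental negation needed
  let errorsFound = false;

  if (!values.name) {
    errorsFound = true;
    errorMessages.push('Name is required');
  }
  if (!values.email) {
    errorsFound = true;
    errorMessages.push('Email is required');
  }

  // Compare with the negative version: if (!noErrorsFound) { ... }
  if (errorsFound) {
    return { isValid: false, errorMessages };
  }
  return { isValid: true, errorMessages: [] };
}
```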
By this time we should already notice that we don’t need the `errorsFound` variable at all: its value can always be derived from the `errorMessages` array:
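A sketch of the idea, assuming a hypothetical `getErrorMessages` helper: instead of keeping a separate flag in sync with the array, we compute it from the array when we need it.

```javascript
// Hypothetical helper that only collects messages; the rules are made up.
function getErrorMessages(values) {
  const errorMessages = [];
  if (!values.name) {
    errorMessages.push('Name is required');
  }
  if (!values.email) {
    errorMessages.push('Email is required');
  }
  return errorMessages;
}

function validateForm(values) {
  const errorMessages = getErrorMessages(values);
  // No separate flag to maintain: derive it from the array
  const hasErrors = errorMessages.length > 0;
  return { hasErrors, errorMessages };
}
```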
I’d also split this method into two to isolate side effects and make the code more testable, then remove the condition around this.set()
Let’s look at another example:
Here, again, every time we read `noData` in the code, we need to mentally unnegate it to understand what’s really happening. And the negative `disabled` attribute makes things even worse. Let’s fix it:
Now it’s much easier to read. (And we’ll talk about names like `data` later.)
My rule of thumb: the shorter the scope of a variable, the shorter its name should be.
I’m okay with, and even prefer, very short variable names for one-liners. Consider these two examples:
Here, it’s clear what `x` is in each example, and a longer name would bloat the code without making it more readable, likely less. We already have the full name in the parent function: we’re mapping over the `TRANSITION` object keys, and parsing each key; or we’re mapping over a list of breakpoints, and converting them to strings. It also helps that here we only have a single variable, so any short name will be read as “whatever we’re mapping over”.
I usually use `x` in such cases. I think it’s more or less clear that it’s a placeholder and not an acronym of a particular word.
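A minimal sketch of the difference (the `breakpoints` data is made up): both versions produce the same result, but the longer name only repeats what the array name already says.

```javascript
const breakpoints = [8, 16, 24];

// Short placeholder: reads as "whatever we're mapping over"
const breakpointsInPixels = breakpoints.map(x => x + 'px');

// Longer name: more to read, no extra information
const breakpointsInPixelsVerbose = breakpoints.map(
  breakpoint => breakpoint + 'px'
);
```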
Some developers prefer `_`, and it’s a good choice for any programming language that’s not JavaScript, where `_` is often used for the Lodash utility library.
Another convention I’m okay with is using `a`/`b` names for sorting and comparison functions:
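A short sketch of this convention (the data is made up): `a` and `b` are the two items being compared, whatever they happen to be.

```javascript
const prices = [250, 40, 1000, 5];
// Copy before sorting: Array.prototype.sort() mutates the array
const ascendingPrices = [...prices].sort((a, b) => a - b);

const users = [
  { name: 'Walter', age: 52 },
  { name: 'Chuck', age: 50 },
  { name: 'Gustavo', age: 48 }
];
const usersByName = [...users].sort((a, b) => a.name.localeCompare(b.name));
```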
However, when the scope is longer, or when we have multiple variables, short names could be confusing:
Here, it’s totally impossible to understand what’s going on, and meaningless names are one of the main reasons for this.
Let’s try to refactor this code a bit:
Not only is the refactored code three times shorter, but it’s also much clearer: are there any (some) customers with at least one customer card in any (some) age group?
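A sketch of what such code could look like (the data shape, `AGE_GROUPS`, and `hasCustomerCards` are hypothetical, not the original example):

```javascript
// Hypothetical list of age groups
const AGE_GROUPS = ['child', 'adult', 'senior'];

// Are there any customers with at least one customer card in any age group?
function hasCustomerCards(customers) {
  return customers.some(customer =>
    AGE_GROUPS.some(
      ageGroup => (customer.cardsByAgeGroup[ageGroup] ?? 0) > 0
    )
  );
}
```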
I’ve seen someone using the `_` name for something that’s used across the whole module, possibly dozens or even hundreds of lines of code: an Express router (the example is from the Express docs but I changed the name):
I cannot imagine the logic behind this convention, and I’m sure it’s going to be confusing for many developers working with the code. It’ll be much worse when the code grows to do something useful.
Let’s bring back the original names:
Now I don’t have trouble understanding what’s going on here. (Using `req` for request and `res` for response is an Express convention: huge adoption makes it a good idea to keep using it.)
So, `x`, `a`, and `b` are pretty much all the single-character variable names I ever use.
On the other hand, long names in a short scope make code cumbersome:
Here long names make the code look more complex than it is:
I think the second version is easier to read.
One of the most common cases for short names is loops: `i`, `j`, and `k` are some of the most common variable names ever, and are usually used to store loop indices. They are moderately readable in short, non-nested loops, and only because programmers are so used to seeing them in the code. However, in nested loops, it gets difficult to understand which index belongs to which array:
I used to use longer names for index variables for a very long time:
Surely, `keyIdx` is way more readable than `i` but, luckily, most modern languages allow us to iterate over things without coding artisan loops, and without the need for an index variable:
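For example, in JavaScript we could use `for...of` and `Object.entries()` for nested iteration. This is a minimal sketch with made-up data, not the original listing:

```javascript
const pizzasByCity = {
  Berlin: ['Margherita', 'Funghi'],
  Valencia: ['Diavola']
};

const menuLines = [];
// No index variables: each loop variable is named after what it holds
for (const [city, pizzas] of Object.entries(pizzasByCity)) {
  for (const pizza of pizzas) {
    menuLines.push(`${city}: ${pizza}`);
  }
}
```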
(See Avoid loops chapter for more examples.)
We talked a bit about the scope in the previous section. The length of the variable’s scope affects readability too. The shorter the scope the easier it is to keep track of what’s happening with a variable.
The extreme cases would be:
- one-liners (example: `[8, 16].map(x => x + 'px')`).

Usually, the shorter the scope, the better. However, religious scope shortening has the same issues as splitting code into many teeny-tiny functions (see Divide and conquer, or merge and relax chapter): it’s easy to overdo it and make the code less readable, not more.
I found that reducing the lifespan of a variable works as well, and doesn’t produce lots of tiny functions. The idea here is to reduce the number of lines between the variable declaration and the line where it’s accessed for the last time. The scope might be a whole 200-line function but if the lifespan of a particular variable is three lines, then we only need to look at these three lines to understand how this variable is used.
Here, the lifespan of the `sorted` variable is only two lines. This kind of sequential processing is a common use case for the technique.
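A sketch of such sequential processing (the `getTopScores` function and its data are hypothetical): however long the surrounding function gets, the `sorted` variable only lives on two adjacent lines.

```javascript
function getTopScores(scores, limit) {
  // `sorted` is declared here...
  const sorted = [...scores].sort((a, b) => b - a);
  // ...and accessed for the last time on the very next line
  return sorted.slice(0, limit);
}
```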
(See a larger example in the “Avoid Pascal style variables” section in the Avoid reassigning variables chapter.)
The road to hell is paved with abbreviations. What do you think are OTC, RN, PSP, SDL? I also don’t know, and these are just from one project. That’s why I try to avoid abbreviations almost everywhere, not just in code.
There’s a list of dangerous abbreviations for doctors prescribing medicine. We should have the same for programmers.
I’d even go further and create a list of approved abbreviations. I could only find one example of such a list: from Apple, and I think it could be a great start.
Common abbreviations are okay, we don’t even think of most of them as abbreviations:
Abbreviation | Full term |
---|---|
alt | alternative |
app | application |
arg | argument |
err | error |
info | information |
init | initialize |
lat | latitude |
lon | longitude |
max | maximum |
min | minimum |
param | parameter |
prev | previous (especially when paired with next) |
As well as common acronyms:
And possibly a few very common ones used on a project but they still should be documented (new team members will be very thankful for that!), and shouldn’t be ambiguous.
I like to use a few prefixes for variable and function names:
- `is`, `are`, `has`, or `should` for booleans (examples: `isPhoneNumberValid`, `hasCancellableTickets`).
- `get` for (mostly) pure functions that return a value (example: `getPageTitle`).
- `set` for functions that store a value or React state (example: `setProducts`).
- `fetch` for functions that fetch data from the backend (example: `fetchMessages`).
- `to` for functions that convert the data to a certain type (examples: `toString`, `hexToRgb`, `urlToSlug`).
- `on` and `handle` for event handlers (examples: `onClick`, `handleSubmit`).

I think these conventions make code easier to read, and help distinguish functions that return values from ones with side effects.
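A few of these prefixes in action. The function names come from the list above, but the implementations are hypothetical sketches (for example, the phone number check is deliberately naive):

```javascript
// `is` prefix: returns a boolean
const isPhoneNumberValid = phone => /^\+?\d{7,15}$/.test(phone);

// `get` prefix: a pure function that returns a value
const getPageTitle = page => `${page.title} | ${page.siteName}`;

// `to` in the name: converts data from one representation to another
function hexToRgb(hex) {
  // Offsets of the red, green, and blue pairs in a string like '#ff8000'
  const [red, green, blue] = [1, 3, 5].map(x =>
    Number.parseInt(hex.slice(x, x + 2), 16)
  );
  return { red, green, blue };
}
```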
However, don’t combine `get` with other prefixes: I often see names like `getIsCompaniesFilterDisabled` or `getShouldShowPasswordHint`, which should be just `isCompaniesFilterDisabled` or `shouldShowPasswordHint`, or even better `isCompaniesFilterEnabled`. On the other hand, `setIsVisible` is perfectly fine when paired with `isVisible`:
I also make an exception for React components, where I prefer to skip the `is` prefix, similar to HTML properties like `<button disabled>`:
And I wouldn’t use `get` for class property accessors (even read-only):
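A minimal sketch with a hypothetical `User` class: the accessor is used like a regular property, so it’s named like one.

```javascript
class User {
  constructor(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  // Named `fullName`, not `getFullName`: it's accessed like a property
  get fullName() {
    return `${this.firstName} ${this.lastName}`;
  }
}

const user = new User('Chuck', 'Norris');
const greeting = `Hello, ${user.fullName}!`;
```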
In general, I don’t like to remember too many rules, and any convention can go too far. A good example, and fortunately almost forgotten, is Hungarian notation, where each name is prefixed with its type, or with its intention or kind. For example, `lAccountNum` (long integer), `arru8NumberList` (array of unsigned 8-bit integers), `usName` (unsafe string).
Hungarian notation made sense for older, weakly typed languages, like C, but with modern typed languages and IDEs that show types when we hover over a name, it clutters the code and makes reading each name harder. So, keep it simple.
One of the examples of Hungarian notation in the modern frontend is prefixing TypeScript interfaces with `I`:
Luckily, most TypeScript developers prefer to drop it these days:
I would generally avoid repeating information in the name that’s already accessible in its type, class name, or namespace.
(We talk a bit more about conventions in the Code style chapter.)
Imagine a function that allows us to build a new version of an object based on a previous version of the same object:
Here, we have a simple counter function that returns the next counter value. The `prev` prefix makes it clear that this value is out of date.
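A sketch of this convention outside React (`createCounter` is a hypothetical helper that mimics an updater-style API):

```javascript
// Hypothetical store: `update` receives a function that builds
// the next value from the previous one
function createCounter(initialCount = 0) {
  let count = initialCount;
  return {
    getCount: () => count,
    update(getNextCount) {
      count = getNextCount(count);
    }
  };
}

const counter = createCounter();
// `prevCount` is the value before the update, already out of date
counter.update(prevCount => prevCount + 1);
counter.update(prevCount => prevCount + 1);
```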
Similarly, when the value is not yet applied and the function either lets us modify it or prevent the update:
Here, we want to avoid unnecessary component rerenders when the `code` hasn’t changed. The `next` prefix makes it clear that this value is going to be applied to the component after the `shouldComponentUpdate` call.
Both of these conventions are widely used by React developers.
Incorrect names are worse than magic numbers (read about them in the Constants chapter). With magic numbers, we can make a correct guess but with incorrect names, we have no chance to understand the code.
Consider this example:
Even a comment doesn’t help to understand what this code does.
What’s actually happening here is `getTime()` returns milliseconds and `getTimezoneOffset()` returns minutes, so we need to convert minutes to milliseconds by multiplying minutes by the number of milliseconds in one minute. 60000 is exactly this number.
Let’s correct the name:
Now it’s much easier to understand the code.
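A minimal sketch with a name that says what the number is (the `shiftByTimezoneOffset` function is a hypothetical stand-in for the original code):

```javascript
const MILLISECONDS_IN_MINUTE = 60000;

// Hypothetical helper: shifts a date by the local timezone offset.
// getTime() returns milliseconds, getTimezoneOffset() returns minutes,
// so the offset has to be converted to milliseconds first.
function shiftByTimezoneOffset(datetime) {
  const offsetInMilliseconds =
    datetime.getTimezoneOffset() * MILLISECONDS_IN_MINUTE;
  return new Date(datetime.getTime() - offsetInMilliseconds);
}
```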
Types (like TypeScript) could help us see when names don’t represent the data correctly:
By looking at the types, it’s clear that both names should be plural (they keep arrays) and the second one only contains order IDs but not whole order objects:
We often change the logic but forget to update the names to reflect that. This makes understanding code much harder and could lead to bugs when we later change the code and make wrong assumptions based on incorrect names.
Unlike incorrect names, abstract and imprecise names are probably more unhelpful than dangerous.
Abstract names are too generic to give any useful information about the data they hold:
- `data`
- `list`
- `array`
- `object`

The problem with such names is that any variable contains data, and any array is a list of something. These names don’t say what kind of data it is, or what kind of things the list holds. Essentially, such names aren’t better than `x`/`y`/`z`, `foo`/`bar`/`baz`, `New Folder 39`, or `Untitled 47`.
Consider this example:
Besides using Immutable.js and Lodash’s `get` method, which already makes the code hard to read, the `obj` variable makes the code even harder to understand.
All this code does is reorganize the data about the user’s currency into a neat object:
Now it’s clearer what shape of data we’re building here, and even Immutable.js isn’t so intimidating. I kept the `data` name though because that’s how it’s coming from the backend, and it’s commonly used as a sort of root object for whatever the backend API is returning. As long as we don’t leak it to the app code, and only use it during the initial processing of the raw backend data, it’s okay.
Such names are also okay for generic utility functions, like array filtering or sorting:
Here, `arrays` and `array` are totally fine since that’s exactly what they represent: generic arrays. We don’t yet know what they are going to hold, and for the context of this function it doesn’t matter; it could be anything.
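A sketch of such a generic utility (the `findFirstNonEmptyArray` function is a hypothetical example, not necessarily the original one):

```javascript
// Works with arrays of anything, so generic names are the precise ones here
function findFirstNonEmptyArray(...arrays) {
  return (
    arrays.find(array => Array.isArray(array) && array.length > 0) ?? []
  );
}
```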
Imprecise names are names that don’t describe the object enough. One of the common cases is names with number suffixes. Usually, it happens for three reasons:
In all cases, the solution is to clarify each name.
For the first two cases, try to find something that differentiates the objects, and makes the names more precise.
Consider this example:
Here, we’re sending a sequence of network requests to test a REST API. However, the names `response`, `response2`, and `response3` make the code a bit hard to understand, especially when we use the data returned by one request to create the next one. We could make the names more precise:
Now it’s clear which request data we’re accessing at any time.
For the new version of a module, I’d try to rename the old one to something like `ModuleLegacy` instead of naming the new one `Module2` or `ModuleNew`, and keep using the original name for the new implementation. It’s not always possible but it makes using the old, deprecated module more awkward than the new, improved one. Names like `Module2` or `ModuleNew` are fine during development though, when the new module isn’t yet fully functional or well tested.
It’s a good idea to use well-known and widely adopted terms for programming and domain concepts instead of inventing something that might be cute or clever but likely will be misunderstood. This is especially problematic for non-native English speakers.
A “great” example of this is the React codebase where they used “scry” (which means something like peeping into the future through a crystal ball) instead of “find”.
Using different words for the same concept is confusing: a person reading the code may think that since the words are different, the things aren’t the same, and will try to understand the difference between the two. It will also make the code less greppable (meaning it would be harder to find all usages of the same thing, see the Make the code greppable chapter for more).
Idea: Having a project dictionary, or even a linter, might be a good idea to avoid using different words for the same things. I use a similar approach for writing this book: I use the Textlint terminology plugin to make sure I use the terms consistently and spell them correctly.
Often we create pairs of variables or functions that do the opposite operations or hold values that are on the opposite ends of the range. For example, `startServer`/`stopServer`, or `minWidth`/`maxWidth`. When we see one, we expect to see the other, and we expect it to have a certain name because it either sounds natural in English (if one happened to be a native speaker) or has been used by generations of programmers before us.
Some of these common pairs are:
Term | Opposite |
---|---|
add | remove |
begin | end |
create | destroy |
enable | disable |
first | last |
get | set |
increment | decrement |
insert | delete |
lock | unlock |
minimum | maximum |
next | previous |
old | new |
open | close |
read | write |
show | hide |
start | stop |
target | source |
Typos in names and comments are very common. They don’t cause bugs most of the time but could still reduce readability a bit, and code with many typos looks sloppy.
Recently, I found this name in our codebase: `depratureDateTime`, and I immediately noticed it because I have a spellchecker enabled in my WebStorm editor:
Spellchecker helps me immensely, as I’m not a native English speaker. It also helps to make the code more greppable: when we search for a certain term, we likely won’t find misspelled occurrences of it.
Often we end up with awkward names for intermediate values, like function parameters or function return values:
Here, the `duration` variable is never used as a whole, only as a container for `minutes` and `seconds` values we use in the code. By using destructuring we could skip the intermediate variable:
Now we could access `minutes` and `seconds` directly.
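A minimal sketch of this technique, assuming a hypothetical `parseDuration` function that returns an object with `minutes` and `seconds`:

```javascript
// Hypothetical helper that splits a number of seconds into parts
function parseDuration(totalSeconds) {
  return {
    minutes: Math.floor(totalSeconds / 60),
    seconds: totalSeconds % 60
  };
}

// No intermediate `duration` variable: unpack the values right away
const { minutes, seconds } = parseDuration(125);
const durationLabel = `${minutes}:${String(seconds).padStart(2, '0')}`;
```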
Functions with optional parameters grouped in an object are another common example:
Here, the `options` object is never used as a whole (for example, to pass it to another function), only to access separate properties in it. We could use destructuring to simplify the code:
Here, we’ve removed the `options` object, which was used in almost every line of the function body. This made the function shorter and more readable.
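A sketch of the same idea with a hypothetical `formatPrice` function: the optional parameters are destructured right in the function signature, together with their default values.

```javascript
// Hypothetical formatter; the option names and defaults are made up
function formatPrice(amount, { currency = 'EUR', fractionDigits = 2 } = {}) {
  return `${amount.toFixed(fractionDigits)} ${currency}`;
}

const defaultPrice = formatPrice(5);
const customPrice = formatPrice(5, { currency: 'USD', fractionDigits: 0 });
```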
Often we add intermediate variables to store the result of some operation before passing it somewhere else or returning it from the function. In many cases, this variable is unnecessary.
Consider these two examples:
In both cases, the `result` and the `data` variables don’t add much to the code. The names aren’t adding new information, and the code is short enough to be inlined:
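A sketch of the idea (both functions are hypothetical): the first version keeps the intermediate variable, the second returns the expression directly.

```javascript
// With an intermediate variable that adds no information
function getTotalWithResult(prices) {
  const result = prices.reduce((sum, price) => sum + price, 0);
  return result;
}

// Inlined: the function name already says what the value is
function getTotal(prices) {
  return prices.reduce((sum, price) => sum + price, 0);
}
```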
Here’s another example:
Here, the alias `p` replaces a clear name `this.props` with an obscure one. Again, inlining makes the code more readable:
Destructuring could be another solution here, see the Use destructuring section above.
Sometimes, intermediate variables can serve as comments, explaining the data they hold, that otherwise might not be clear:
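A minimal sketch, with a hypothetical `canCheckout` function and data shape: each intermediate variable names a condition that would otherwise need a comment.

```javascript
function canCheckout(cart, user) {
  // The names explain what each condition means
  const hasItems = cart.items.length > 0;
  const isVerifiedUser = user.isEmailVerified && user.isActive;
  return hasItems && isVerifiedUser;
}
```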
Another good reason to use an intermediate variable is to split a long line of code into multiple lines:
We’ve talked about how to avoid number suffixes by making names more precise. Let’s talk about a few other cases where we may have clashing names, and what we can do to avoid them.
Most often I struggle with clashing names for two reasons:
- function return values (example: `const isCrocodile = isCrocodile()`);
- React components and TypeScript types (example: `const User = (props: { user: User }) => null`).

Let’s start with function return values. Consider this example:
Here, it’s clear which one is the function, and which one is the array with the value returned from the function. Now consider this:
Here, our naming choices are limited:
- `isCrocodile` is a natural choice but clashes with the function name;
- `crocodile` would mean that this variable holds one item of the `crocodiles` array.

So, what can we do about it? A few things:

- choose a domain-specific name (example: `shouldShowGreeting`);
- choose a more specific name (examples: `isFirstItemCrocodile` or `isGreenCrocodile`);
- shorten the name (example: `isCroc`).

All options are somewhat not ideal, though:
I usually use domain-specific names or inlining (for very simple calls used once or twice):
Here, the name describes how the value is used (domain-specific name) — to check whether we need to show a greeting, as opposed to the value
For example, we could decide to greet crocodiles only in the morning:
The name still makes sense, whereas something like `isCroc` would have to be changed.
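A sketch of this scenario (the `isMorning` helper, the `getGreeting` function, and the data shape are hypothetical): the condition has grown, but the domain-specific name still describes how the value is used.

```javascript
const isCrocodile = animal => animal.species === 'crocodile';
// Hypothetical helper: treats everything before noon as morning
const isMorning = date => date.getHours() < 12;

function getGreeting(animal, date) {
  // Named after how the value is used, not after how it's computed
  const shouldShowGreeting = isCrocodile(animal) && isMorning(date);
  return shouldShowGreeting ? 'Good morning, crocodile!' : '';
}
```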
Unfortunately, I don’t have a good solution for clashing React components and TypeScript types. This usually happens when we create a component to render an object of a certain type:
Though TypeScript allows us to use a type and a value with the same name in the same scope, it makes code confusing.
The only solution I see is renaming either the type or the component. I usually try to rename a component, though it requires some creativity to come up with a name that’s not confusing. For example, names like `UserComponent` or `UserView` would be confusing because other components don’t have these suffixes. But something like `UserProfile` may work in this case:
This only matters when either the type or the component is exported and reused in other places. Local names are more forgiving since they are only used in the same file and the definition is right here.
Start thinking about:
If you have any feedback, drop me a line at artem@sapegin.ru, @sapegin@mastodon.cloud, @iamsapegin, or open an issue.