¶Undoing indentation hell
Have you ever looked at a piece of code and seen something like this?
if (...) {
if (...) {
...
} else {
if (...) {
...
} else if (...) {
...
} else {
...
}
}
...
I'm going into style matters again and I know this is going to be contentious, but personally, I find that code is hard to read when it looks like a big tree view. You have to mentally match the braces or highlight them individually to have the editor show the matching, and that's hard to do when there are so many of them. It's even harder when the top-level statements don't even fit on one screen. The worst case I ever saw had something like 12 levels of indenting and was more than 17 pages long... and I use a small font size. Most of the time this happens because a routine snowballs until it grows out of control, and no one has the time/experience/constitution to refactor it. This happens to me as well, and eventually I get annoyed enough that I have to perform a nesting exorcism.
I attack such monstrosities by a series of small, incremental transformations. This increases the chances that the code will actually still work by the time I'm done with it. I should note that these are best done only if you own the code or already have to do major work on it for other reasons; doing transformations like this willy-nilly is a good way to completely scramble history in a source code control system and nuke any chances of your team members successfully merging your changes with theirs.
- Flip if/elses so the shortest clause is first.
When if-statements used to check for failure, the failure clause is typically a lot shorter. The usual guideline is to test for success rather than failure, but I find that this is undesirable when the success clause is so long that it pushes the failure clause off the screen. Therefore, I'll often change this:
if (test()) {
// do lots of work
} else {
handle_failure();
}
...to this:
if (!test()) {
handle_failure();
} else {
// do lots of work
}
This then plays into my next transformation, which is to....
- Convert if/else to if/return.
I find that the handle_failure() code above often implicitly or explicitly exits the local scope. Therefore, in a function, I can then transform it as follows:
if (!test()) {
handle_failure();
return false;
}
// do lots of work
This removes an indentation level from the main path. It does require a function scope, which means this is good motivation for hoisting code out to a child function call, another good way to reduce indentation levels. Often simply applying these two transformations over and over is sufficient to completely flatten a routine. This gives a style I like to call "the gauntlet," in which all tests must be passed before execution can proceed:
if (!test1()) {
handle_failure_1();
return false;
}
if (!test2()) {
handle_failure_2();
return false;
}
if (!test3()) {
handle_failure_3();
return false;
}
handle_success();
return true;
A further reduction is sometimes possible when multiple failure conditions lead to identical exit code. I find this happens most often with what I call CYA code, i.e. (s != NULL) and (*s != 0) checks for a (const char *) string argument.
- Convert if/else to if/continue.
If the nested construct is in a loop, returning directly from the failure cases isn't possible without hoisting out the code to a separate function, and that may be difficult if it references a lot of local variables. In that case, the continue statement comes in handy. For some reason, I find that a lot of programmers either don't know about or don't use this statement. It causes a jump to the end of the innermost enclosing loop. I find this useful in constructs that loop over STL containers and only process a subset of the elements:
for(...) {
const T& element = *iter;
if (test(element)) {
...
}
}
Using continue removes one level of indentation:
for(...) {
const T& element = *iter;
if (!test(element))
continue;
...
}
- Merge else{if} to else-if.
This construct:
if (A) {
clause_A();
} else {
if (B) {
clause_B();
}
}
...can be converted to this:
if (A) {
clause_A();
} else if (B) {
clause_B();
}
Again, this saves an indentation level. I'll sometimes avoid doing this, though, if it would destroy the symmetry of the code. For instance, an MPEG-1 decoder's prediction path may have nested horizontal and vertical tests, whose symmetry would be destroyed by the transformation:
if (horizontal_half_pel) {
if (vertical_half_pel)
predict_quarter_pel();
else
predict_horizontal_half_pel();
} else { // !horizontal_half_pel
if (vertical_half_pel)
predict_vertical_half_pel();
else
predict_full_pel();
}
- Convert cleanup pyramids to wrappers or null-based checks.
When working with Windows APIs, it's often the case that you need to make a series of API calls, each of which yields one resource that must be freed. This leads to a dreaded cleanup pyramid:
HDC hdc = GetDC(hwnd);
if (hdc) {
HDC hdc2 = GetCompatibleDC(hdc);
if (hdc2) {
HBITMAP hbm = CreateBitmap(...);
if (hbm) {
...
DeleteObject(hbm);
}
DeleteDC(hdc2);
}
DeleteDC(hdc);
}
Just flattening it leads up ugly cleanup code where each failure case has to clean up the resources for all the previous cases that succeeded. This is error-prone and tedious. The best way I can think of to deal with this is to use wrapper objects, especially for COM reference counted pointers, so that the destructors automatically handle the cleanup. That sometimes isn't practical, especially in the case where you're asked to work in a module and you have no central place you can put wrappers, and don't want to litter each module with new wrappers. (If each successive programmer brings in his own personal utility routines, the result is quite a mess.) In that case, I'll sometimes either use a goto or add a do/while(false) scope that I can break out of, so I can do all the cleanup at the end.
HDC hdc = NULL;
HDC hdc2 = NULL;
HBITMAP hbm = NULL;
hdc = GetDC(hwnd);
if (!hdc)
goto cleanup;
hdc2 = GetCompatibleDC(hdc);
if (!hdc2)
goto cleanup;
hbm = CreateBitmap(...);
if (!hbm)
goto cleanup;
...
cleanup:
if (hbm)
DeleteObject(hbm);
if (hdc2)
DeleteDC(hdc2);
if (hdc)
DeleteDC(hdc);
- Use a success/failure bool.
If the control flow is really wacky, sometimes I'll resort to using a single bool to control most of the flow, so that the top-level blocks go like this:
bool success = section_1();
if (success) {
success = section_2();
}
if (success) {
success = section_3();
}
It's more verbose and leads to more bool checks, but it does allow nested sections to be hoisted out and also allows for arbitrary code at the tail end without requiring too much rework. More importantly, it also allows for breakpoints and log/assert statements to be inserted at each location. That isn't possible when people use this style:
success = success && function();
...which I find a pain to debug.