Code comprehension: Chunks and Beacons

We talk about making source code readable, and it turns out that this isn’t just a matter of stylistic taste; there is actual research on the topic. As we read through code, we scan for chunks and beacons. The two are similar in that both are code fragments, but they are used differently.

“Chunks are described as code fragments in programs. Available literature shows Chunks to be used during the bottom-up approach of software comprehension. Chunks vary in size. Several Chunks can be combined into larger Chunks.”¹

A looping structure could be a chunk, contained inside a method which is also a chunk, contained inside a class which is also a chunk.
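As a rough illustration of that nesting, here is a minimal Python sketch. The `Inventory` class and its methods are hypothetical names invented for this example; the comments mark where a reader might draw each chunk boundary.

```python
class Inventory:  # chunk: the whole class
    """Tracks item quantities by name (hypothetical example)."""

    def __init__(self):
        self.quantities = {}

    def add(self, name, count):
        # Record `count` more units of the item called `name`.
        self.quantities[name] = self.quantities.get(name, 0) + count

    def total_items(self):  # chunk: one method
        total = 0
        for count in self.quantities.values():  # chunk: the loop
            total += count
        return total
```

Once the loop is understood as “sum the quantities”, it can be held in mind as a single unit while reading the rest of the method, and the method in turn as a single unit while reading the class.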

You’ve probably heard that our working memory can retain 7±2 (5-9) chunks of information at any one time². Our ability to chunk the source code allows us to retain more of the concepts in working memory at once.

Experts are able to chunk more effectively than novices based on patterns that they’ve become familiar with. Even novices can benefit from structural chunking, however.

“Beacons are code fragments that help developers comprehend programs. It has been shown that expert programmers pay more attention to Beacons than novices.”¹

Beacons are often used to confirm hypotheses about the code. We start with a general hypothesis about what the code does, drawn from documentation, conversation, or perhaps naming. We then look for beacons to confirm that hypothesis: do we find the things it predicts we should find? If looking at a specific part of the code confirmed something you were thinking, that part was a beacon.

Beacons make it easier to chunk the code as they draw our attention.

When reading through the code, an expert will recall the beacons far more easily than the non-beacons. Novices and intermediates will recall the non-beacons better³, presumably because they were scanning line by line.

When programmers know that specific design patterns are present in the code, those patterns become useful beacons.⁴ These don’t have to be named patterns that we would find in a catalog; a beacon could be something as small as a swap operation in a sorting algorithm.
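To make the swap example concrete, here is a small sketch of a bubble sort in Python. The choice of algorithm and the name `bubble_sort` are assumptions made for illustration; the point is only that the swap line is the fragment a reader is likely to latch onto when checking the hypothesis “this is a sort”.

```python
def bubble_sort(values):
    """Return a sorted copy of values in ascending order."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                # The swap: a likely beacon that this routine is a sort.
                items[i], items[i + 1] = items[i + 1], items[i]
    return items
```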

Names in the code are important beacons⁴. Good variable names allow an expert to scan the code more effectively.

How can we use this information to make our code more readable?

- Use better naming for variables, methods, classes, etc.
- Use commonly understood patterns in the code. These don’t have to be named patterns like Command or Strategy. They could be patterns such as “always keep these two pieces of information together”.
- Make careful use of comments.
  - Beginners use comments for chunking more than experts do, so if you’re optimizing the codebase for beginners, you may want more comments. If you’re optimizing for experts, you want fewer.
  - High-level comments expressing the intention of a section can help chunking.
  - Detailed comments explaining exactly what a line does make chunking harder, so avoid these.
- Prefer smaller methods, which are generally more chunkable, although methods that are too small can make chunking harder.
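Several of these suggestions can be sketched together in a few lines of Python. Everything here is hypothetical and invented for illustration: `Money` stands in for an “always keep these two pieces of information together” pattern, and `order_total` shows descriptive names plus intent-level comments rather than line-by-line narration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Keeps an amount and its currency together (hypothetical type)."""
    amount_cents: int
    currency: str


def order_total(line_prices, shipping):
    """Return the order total, including shipping, as Money.

    Assumes every price uses the same currency as `shipping`.
    """
    # Sum the line items.
    items_cents = sum(price.amount_cents for price in line_prices)

    # Add shipping to get the final total.
    return Money(items_cents + shipping.amount_cents, shipping.currency)
```

A comment such as “add amount_cents of each price to a running sum” would only restate the line below it, which is the kind of detailed comment the list above suggests avoiding.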

¹ Aschwanden, Christoph and Martha E. Crosby. “Code Scanning Patterns in Program Comprehension” (2005). Note that this is a draft and I’ve been unable to find a final published version of it.
² Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63(2), 81–97. doi:10.1037/h0043158
³ Wiedenbeck, S. (1986). Beacons in computer program comprehension. International Journal of Man-Machine Studies, 25(6), 697–709. doi:10.1016/s0020-7373(86)80083-9
⁴ Hermans, Felienne. The Programmer’s Brain: What every programmer needs to know about cognition.