Optimize for cognitive load

I recently read an interesting post by Martin Fowler on function length, in which he suggests that very small functions, each encompassing the implementation of a single intention, are ideal. I take a somewhat different view, one that also touches on larger concerns of software design and even, to a certain extent, architecture. It is a holistic view, in the sense that the same goal is desirable at multiple levels, from the individual function to the entire system.

Specifically, I argue that instead of optimizing for narrow, low-level ideals such as function length, implementation vs. intention, or dogmatic adherence to particular patterns and practices, we should optimize for cognitive load. Let me start by explaining the general concept of cognitive load and how I interpret it.

Cognitive Load Theory

Cognitive Load Theory (CLT) was developed by John Sweller, an Australian professor specializing in educational psychology. According to CLT, learners experience three types of cognitive load: intrinsic, extraneous, and germane. Intrinsic cognitive load is the difficulty the topic being learned presents by itself; little can be done about it if the topic is to be learned at all. Extraneous cognitive load is, as the name suggests, unnecessary load created by the manner in which information is presented. For example, extraneous cognitive load would be much higher if I verbally described a geometric shape such as a square rather than simply showing a picture of one. Finally, germane cognitive load is the load involved in processing information and constructing schemata. In cognitive science, a schema is essentially a grouping of learned information or a common pattern for processing such information.

The idea of applying CLT to UX has been growing in popularity in recent years. Numerous articles have been written about optimizing visual and interaction design to reduce extraneous cognitive load. Unfortunately, I haven't seen much discussion of this with regard to software design or architecture. It seems we software engineers tend to focus more on the technical and less on the human side. But we need to take both into account when working on real (i.e., not personal or toy) projects if we want to increase maintainability and ease of development.

Application of CLT

My interpretation of cognitive load as it applies to software design is rooted in how many steps you must go through to understand the code involved in executing an API call, a workflow, or a use case. This is, of course, a multifaceted problem with no clear general solution. In most situations, you don't need to know all the minutiae involved in a particular flow to understand it to a sufficient degree. For example, if you're working on a RESTful API, you rarely (hopefully never) need to debug down to the level of TCP connections or Ethernet frames. Often, you don't even need to know the exact process your framework of choice uses to translate an HTTP request into the appropriate function call in your code. And, depending on which aspect of the codebase you're trying to understand, you can often skip other important code in order to focus on what currently matters to you.

Functions

So how does all this affect software design? Let's start with function length and work up from there. From a CLT perspective, a very long function that encompasses a significant amount of data processing imposes high extraneous cognitive load when you analyze it: it likely deals with multiple pieces of state that you must keep in mind at all times while mentally processing the various permutations of conditionals and loops, what happens inside each of them, and how earlier decisions affect later conditionals and loops. That is a lot of information to keep track of, and it is inefficient for us to process.
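To make that concrete, here is a small, purely hypothetical sketch; the function, field names, and business rules below are all invented for illustration. The point is that almost every branch reads or writes state that some later branch depends on.

```python
# Hypothetical sketch: a single function that accumulates state across many branches.
# All names and business rules below are invented purely for illustration.
def process_order(order):
    total = 0.0
    discount = 0.0
    needs_review = False

    for item in order["items"]:
        price = item["unit_price"] * item["quantity"]
        if item.get("category") == "fragile":
            needs_review = True              # read again much later, for shipping
        if item["quantity"] > 100:
            discount += price * 0.05         # interacts with the loyalty rule below
        total += price

    if order.get("loyalty_tier") == "gold" and discount == 0:
        discount = total * 0.02              # only applies when no bulk discount fired

    shipping = 25.0 if needs_review else 10.0
    if total - discount > 500:
        shipping = 0.0                       # silently overrides the fragile surcharge

    return {"total": total - discount + shipping, "needs_review": needs_review}
```

Even at twenty-odd lines, reading this means holding `total`, `discount`, and `needs_review` in mind simultaneously, because each later branch depends on decisions made earlier.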

At the same time, very small functions also carry high extraneous cognitive load. The burden of tracking state is replaced with the burden of incessant context switching: you jump into one small function after another, back up the call stack, then forward into the next call, and so on. The result is the same problem of too much information to keep track of, and it is still inefficient.

The ideal function length is somewhere in between. I hesitate to give concrete numbers, since there are multiple conflicting models of human working memory, with different implications for how many items we can hold in mind while working on a problem. Instead, I suggest relying on your intuition to find the right balance: look at examples of very long functions and of sets of very short ones, and try to trace the flow through them. Seeing the issues at each extreme will help you find the balance.
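As a rough sketch of that middle ground, the hypothetical function from earlier could be split along its intentions (pricing, discounting, shipping) rather than into one-liners. Again, every name and rule here is invented for the example.

```python
# A middle-ground refactor of the invented example above: a few cohesive helpers,
# each short enough to read in one pass, without shattering the logic into one-liners.
def price_items(items):
    """Return (total, bulk_discount, needs_review) for a list of items."""
    total, discount, needs_review = 0.0, 0.0, False
    for item in items:
        price = item["unit_price"] * item["quantity"]
        total += price
        if item["quantity"] > 100:
            discount += price * 0.05
        if item.get("category") == "fragile":
            needs_review = True
    return total, discount, needs_review

def apply_loyalty_discount(order, total, discount):
    if order.get("loyalty_tier") == "gold" and discount == 0:
        return total * 0.02
    return discount

def shipping_cost(net_total, needs_review):
    if net_total > 500:
        return 0.0
    return 25.0 if needs_review else 10.0

def process_order(order):
    total, discount, needs_review = price_items(order["items"])
    discount = apply_loyalty_discount(order, total, discount)
    shipping = shipping_cost(total - discount, needs_review)
    return {"total": total - discount + shipping, "needs_review": needs_review}
```

Each helper maps to one intention, and the top-level function reads as a summary of the flow, so neither extreme's extraneous load dominates.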

Of course, other factors also play into your ability to analyze code. Descriptive function names, for example, are very important, as is a well-thought-out hierarchical separation into classes, files, and projects.

Design and architecture

Speaking of hierarchical separation, this can affect cognitive load in a different way: well-designed separation makes good use of germane cognitive load. If you work in a well-designed codebase for a significant amount of time, your mind builds schemata that quickly guide you to the correct project, directory, or file "without thinking". You are probably familiar with this phenomenon already: in such codebases you can quickly find the location of some piece of code even when you're not sure precisely where it is, because you're familiar with the overall design of the system.

Conversely, a poorly designed codebase hampers your ability to find code whose precise location you don't already remember. This can be attributed to the inability to form a cohesive schema for the codebase, since its code is haphazardly separated, without a clear hierarchy or other organizing principle.

Architectural and design patterns often help organize code in a way we can process more easily, but we must be careful not to apply too many of them, or to apply them improperly, lest we create confusion. Using well-known patterns enhances our ability to process and understand a codebase, because we have already developed (or can begin to develop) schemata for dealing with those patterns.
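As a small, hedged illustration of a well-known pattern acting as a schema, consider a simplified repository; the interface and class names here are invented for the example. A reader who already carries a schema for this pattern can predict where persistence logic lives without reading every file.

```python
# Hypothetical sketch of a familiar pattern (a simple repository). Names are invented,
# but a reader who knows the pattern can predict where persistence logic lives.
from abc import ABC, abstractmethod

class UserRepository(ABC):
    @abstractmethod
    def get(self, user_id: int) -> dict: ...

    @abstractmethod
    def save(self, user: dict) -> None: ...

class InMemoryUserRepository(UserRepository):
    def __init__(self):
        self._users = {}

    def get(self, user_id: int) -> dict:
        return self._users[user_id]

    def save(self, user: dict) -> None:
        self._users[user["id"]] = user
```

Someone familiar with the pattern doesn't re-derive the structure each time; their existing schema tells them what `get` and `save` will do and where a database-backed implementation would go.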

Bringing everything together

All this human-centric discussion doesn't negate technical needs. Certain choices must be made for technical reasons, and sometimes those choices will make part of a codebase harder to analyze. As always, a balance must be struck. Modern compilers and interpreters are extraordinarily adept at optimizing code for execution performance, so low-level optimizations are rarely needed these days; technical needs are most often expressed at higher levels. For example, when system extensibility is required, certain architectural and design decisions must be made to support that requirement. Unfortunately, those decisions may worsen readability, and there isn't always a clean way to balance the system's needs against the needs of the humans analyzing it.
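As a hedged sketch of that trade-off (the plugin registry below is invented for illustration), an extension point makes new behavior easy to bolt on, but it adds a layer of indirection that a reader must now trace through.

```python
# Invented example of an extensibility-driven design decision: a plugin registry.
# New exporters can be added without touching existing code, but a reader tracing
# an export call now has to pass through one extra layer of indirection.
import json

EXPORTERS = {}

def register_exporter(name):
    def decorator(func):
        EXPORTERS[name] = func
        return func
    return decorator

@register_exporter("csv")
def export_csv(rows):
    return "\n".join(",".join(str(value) for value in row) for row in rows)

@register_exporter("json")
def export_json(rows):
    return json.dumps(rows)

def export(rows, fmt):
    # The indirection lives here: which function runs is decided at registration time.
    return EXPORTERS[fmt](rows)
```

The registry keeps the system open to new formats, at the cost of one more hop when someone asks "what actually runs when I export CSV?"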

I urge you to keep the human factors discussed here in mind when performing any task, from writing functions to designing systems. While different goals may take precedence at different times, simply keeping these concerns in mind will allow you to create better software.

 