Often-discussed practices for improving engineering efficiency fail to create real value on the ground because they miss the human aspects that limit it.
We often hear about “context switching”, yet we hardly ever pay serious attention to it. Like a linter warning you about the cyclomatic complexity of your code, this post aims to warn you about the complexities hiding within your organisation that cause long-term harm.
What is cognitive load?
Cognitive load, a term coined by John Sweller, is described as "the total amount of mental effort being used in the working memory." Cognitive load can be of three different types:
Intrinsic Cognitive Load - load generated while doing your core job - how easy is it for you to write code?
Extraneous Cognitive Load - load related to managing dependencies of your core job - how do I set up my service? How do I test my service? Who do I need to speak to in order to get this working?
Germane Cognitive Load - load related to deep thinking - What should be the architecture of my system? What investments do we need to make for the future?
Like memory in software systems, the working memory of the brain is limited. To make teams and software systems effective in the long term, we need to address the issue of cognitive load. Failing to do so risks not just employee burnout but also poor decision making and a lack of deep thinking, eventually resulting in poorly designed and implemented software systems. The usual debate over whether a startup should start with a monolith or a microservices architecture can hence be answered by asking, ‘can your engineers handle the cognitive load?’. Uber’s recent story of microservices is an example of the cognitive load becoming unmanageable. [Edit: Uber has taken down this page, but you can read it via Google’s cache here]
In their book Team Topologies, Manuel Pais and Matthew Skelton suggest that
For effective delivery and operations of modern software systems, organisations should
Minimise intrinsic cognitive load (through training, good choice of technologies, hiring, pair programming, etc.)
Eliminate extraneous cognitive load (boring or superfluous tasks or commands that add little value to retain in the working memory and can often be automated away).
Doing so allows more space for Germane cognitive load (which is where the “value add” thinking lies).
Load Creep
However, most teams do not realise that they are suffering from high cognitive load because it creeps up on them over time. Teams or individuals with high cognitive load eventually reach a point where they cannot deliver effectively. Cognitive load can build up in many ways. The most visible themes are:
A single large team is responsible for multiple projects, with individuals rotating between projects to keep everyone hands-on with all the codebases the team is responsible for. This is primarily done to maintain redundancy
A single individual working on a complex project but also taking responsibility for critical side projects
A high number of ad hoc requests/bugs/alerts coming to a team, leading to constant context switching
A single project has evolved into a large codebase and everyone is expected to know the whole codebase end to end
A team starts with a small number of services but the service list grows over time. Teams are typically required to fix and maintain existing services while working on newer services
Individuals are picked from different teams to build a service and then the service is handed over to another team to operate with no additional bandwidth provided to that team
Increasing responsibilities of a team - e.g. a DevOps team starts with managing infra, then takes on build systems, then observability systems, then infra automation, then CI/CD, then security, etc.
Highly coupled code in a monolith of a large domain
Communication and collaboration paths aren't well defined for fast flow, which creates high dependency between teams and frequent blockers in execution
This typically results in the following:
Sprint planning becomes complicated with a mix and match of requests across the stack of responsibilities
Prioritisation becomes hard
Lack of bandwidth to pursue mastery of their craft since the team struggles with context switches
Engineers constantly feel overworked, while the business is unhappy with the pace of innovation
Startups often suffer from these problems because organisational growth is not matched by investments in human capital, training, automation, documentation, etc. The primary reasons for this are:
Need for high-quality engineering talent which takes longer to hire
Need for product-market fit supersedes the need for investments in DevOps practices
Lack of an anti-product roadmap
Limited capital for over the top expenses like training, hiring additional people etc.
All of these are valid concerns for a growing startup. However, acting on them at the right time is key to maintaining a stable and healthy engineering organisation.
Prevention
Work in your org can typically be classified as:
Simple (most work has a clear path of execution)
Complicated (solution requires analysis and a few iterations to get right)
Complex (solutions require a lot of experimentation and discovery)
Recognising the type of work that each team is engaged in should give you an indication of how much you can load up a team.
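As a rough illustration of this idea, the classification above can be turned into a quick back-of-the-envelope check. The sketch below is only that: the weights, the budget and the names (WorkType, WEIGHTS, TEAM_BUDGET, load_score, is_overloaded) are assumptions made up for this example, not figures from this post, and any real threshold would need to be calibrated with the team itself.

```python
from enum import Enum


class WorkType(Enum):
    SIMPLE = "simple"            # clear path of execution
    COMPLICATED = "complicated"  # needs analysis and a few iterations
    COMPLEX = "complex"          # needs experimentation and discovery


# Hypothetical relative weights per work type (an assumption for illustration).
WEIGHTS = {
    WorkType.SIMPLE: 1,
    WorkType.COMPLICATED: 2,
    WorkType.COMPLEX: 3,
}

# Hypothetical per-team budget (also an assumption): with these weights a team
# can hold three simple areas, or one complicated plus one simple, or a single
# complex area, before it is flagged.
TEAM_BUDGET = 3


def load_score(areas):
    """Sum the weights of every area of work a team owns.

    `areas` maps an area name to its WorkType.
    """
    return sum(WEIGHTS[work_type] for work_type in areas.values())


def is_overloaded(areas, budget=TEAM_BUDGET):
    """Return True when the team's score exceeds the budget."""
    return load_score(areas) > budget


if __name__ == "__main__":
    payments_team = {
        "billing": WorkType.COMPLICATED,
        "invoicing": WorkType.COMPLICATED,
    }
    print(load_score(payments_team), is_overloaded(payments_team))
```

A score like this is no substitute for asking the team how loaded they feel. Its only purpose is to make the conversation about what a team already owns explicit before more work is added.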
Preventing excessive cognitive load requires a change in organisational structure, prioritisation strategies and communication patterns, along with continuous investment in DevOps practices. Understanding how much complexity a team can manage should drive both the organisational structure and the boundaries of software sub-systems.
Do you and your org suffer from cognitive load?
Credits: The content of this post is inspired by the book Team Topologies. The book suggests excellent mechanisms to create the right organisational structure to solve for cognitive load and fast flow.
Reach me: Twitter
Pretty good. Concept and framework make sense for non software teams as well.