The bottleneck was never in the coding. This statement was true in the age of process, true in the age of people, and it remains true in the current conversation around AI coding assistants — the age of tools.
The promise behind most AI coding assistant adoption is straightforward: developers will write code faster, therefore teams will deliver faster, therefore organisations will get more value from their engineering investment. That chain of logic is not unreasonable. But it rests on an assumption that does not survive much scrutiny — that coding speed is where the constraint actually lives.
In most software teams I work with, it is not. Requirements need clarifying. Stakeholders need aligning. Architectural decisions need making. Code needs reviewing, testing, and deploying. Each of those stages carries its own queue of work, its own variation, its own capacity constraints.
It is fair to acknowledge that AI agents are increasingly capable across more of those stages — not just coding, but code review, test generation, and elements of deployment pipeline automation. That is a meaningful development. But even granting that, the stages that remain most resistant to automation are precisely the ones where the constraint most often lives: the clarity of what is being built, the alignment of the people who need to agree on it, and the organisational processes that govern how decisions get made. Accelerating the technical stages in a system whose constraint is organisational does not accelerate the system. It changes where the queue builds up.
This is not a new insight. It is, in essence, what Goldratt argued in the Theory of Constraints: optimising a non-bottleneck produces local efficiency and system-level noise. What is new is the scale at which the industry is applying this particular optimisation without first asking where the actual constraint is.
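A minimal sketch of that argument, assuming a strictly serial delivery pipeline. The stage names and weekly capacities are made-up illustrative numbers, not measurements from any team:

```python
# Hypothetical capacities (work items per week) for each delivery stage.
# These numbers are illustrative assumptions only.
stage_capacity = {
    "clarify requirements": 6,
    "code": 10,
    "review": 8,
    "test and deploy": 9,
}


def system_throughput(capacities):
    """A serial pipeline can deliver no faster than its slowest stage."""
    return min(capacities.values())


def constraint(capacities):
    """Name of the stage with the lowest capacity."""
    return min(capacities, key=capacities.get)


before = system_throughput(stage_capacity)

# Double the coding capacity, as an AI assistant might promise to do.
faster_coding = {**stage_capacity, "code": 20}
after = system_throughput(faster_coding)

print(f"constraint: {constraint(stage_capacity)}")
print(f"throughput before: {before}/week, after faster coding: {after}/week")
```

In this toy model, doubling the coding capacity leaves delivery unchanged, because the constraint sits in requirements clarification. Real pipelines have feedback loops and variation that a single `min` does not capture, so treat this as an illustration of the principle rather than a model of any particular team.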
A Queueing Theory Lens
I find queueing theory a useful lens here. The notation is simpler than it looks.
A classic M/M/1 queueing system describes:
- M: Markovian (memoryless) arrivals. Work arrives at random moments, but at a known average rate
- M: Markovian service times. Each item takes a variable amount of time to process, but with a known average rate
- 1: a single server, meaning one resource handling the queue
In plain English: a system where you know roughly how fast work arrives, roughly how fast it gets processed, and can therefore model how the queue behaves over time. A human development team, for all its variation, approximates this. You may not know exactly when the next user story will be completed, but you have enough historical data to characterise the distribution. You can model throughput, predict queue growth, and identify where pressure is building.
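To make "model how the queue behaves" concrete, here is a small sketch using the standard steady-state results for an M/M/1 queue. The function name and the example rates (stories per week) are assumptions chosen for illustration:

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue.

    arrival_rate: average items arriving per unit time (lambda)
    service_rate: average items the server can finish per unit time (mu)
    """
    if arrival_rate >= service_rate:
        # No steady state exists: the backlog grows without bound.
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate  # utilisation
    return {
        "utilisation": rho,
        "mean_items_in_system": rho / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
    }


# Hypothetical team that can finish 10 stories per week on average.
for arrivals in (8, 9.5):
    m = mm1_metrics(arrival_rate=arrivals, service_rate=10)
    print(
        f"arrivals={arrivals}/week: "
        f"utilisation={m['utilisation']:.0%}, "
        f"items in system={m['mean_items_in_system']:.1f}, "
        f"time in system={m['mean_time_in_system']:.2f} weeks"
    )
```

Under these assumptions, moving from 80% to 95% utilisation takes the average number of items in the system from about 4 to about 19. The point is not the specific numbers but that the relationship is known and can be computed once the two rates are characterised.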
Now introduce AI coding assistants — or replace developers with coding agents entirely — and consider what changes. You no longer have a characterised service distribution. You have adoption marketing, benchmark results from controlled conditions, and your own untested assumptions about how the tool will perform on your codebase, with your requirements, under your constraints. The system has moved from something you could model to something you cannot. In queueing theory terms, you have left M/M/1 territory entirely.
That shift produces specific risks worth naming. If the agent’s true throughput is lower than your work arrival rate, the queue will not just grow: it will grow without bound, and you may not realise the system is unstable until the backlog is already unmanageable. A human engineer with understood performance variation has a worst case you can reason about from historical data. An AI agent may exhibit long-tail behaviour: fast on routine tasks, but prone to getting stuck on edge cases in ways that have no natural ceiling. And if agent performance degrades as queue pressure increases, the service rate depends on the state of the queue, the system becomes non-stationary, and standard steady-state modelling assumptions break down.
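The first of those risks can be illustrated with a short simulation. This is a sketch under simplifying assumptions: a single first-come-first-served server, exponential arrival gaps and service times, and made-up rates. It does not attempt to model long-tail or pressure-dependent agent behaviour:

```python
import random
from statistics import mean


def simulate_waits(arrival_rate, service_rate, jobs=20_000, seed=1):
    """Queueing delay of each job in a single-server FIFO queue.

    Uses the Lindley recursion: the next job waits for whatever work is
    still outstanding when it arrives, and never less than zero.
    """
    rng = random.Random(seed)
    wait = 0.0
    waits = []
    for _ in range(jobs):
        waits.append(wait)
        service = rng.expovariate(service_rate)  # time to process this job
        gap = rng.expovariate(arrival_rate)      # time until the next job arrives
        wait = max(0.0, wait + service - gap)
    return waits


# Hypothetical rates: work arrives at 8 items per week in both cases.
stable = simulate_waits(arrival_rate=8, service_rate=10)   # server keeps up
unstable = simulate_waits(arrival_rate=8, service_rate=7)  # server falls behind

for name, waits in (("stable", stable), ("unstable", unstable)):
    early, late = mean(waits[:1000]), mean(waits[-1000:])
    print(f"{name}: mean wait for first 1000 jobs={early:.2f}, last 1000 jobs={late:.2f}")
```

In the stable case the wait hovers around a steady level. In the unstable case each new job waits longer than the ones before it, and nothing in the early numbers obviously signals that the backlog will never recover.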
None of this argues against using AI coding assistants. It argues for understanding your system before you restructure it around a tool you have not yet characterised. The bottleneck was never in the coding. And accelerating the coding, without knowing where the bottleneck actually is, tells you very little about whether delivery will improve.
While the principles discussed here are straightforward, their effective implementation often requires a nuanced understanding of your team’s unique context. That’s where evidence-based coaching makes the difference, accelerating your journey to sustainable productivity. Start with the free guide:
Download: Insights to Improvement →
If you’re ready to go further, let’s explore how finding and addressing your team’s real bottleneck can work for your organisation. Reach out today, and let’s make sure the changes you are investing in are pointed at the right constraint.
