This month, the CNCF Content Club initiative is sharing its thoughts on the topic of “Metrics That Matter”. If you’d like to have your say, pop by and say hello in the #content-club CNCF Slack channel.
Other contributions in this series that you might like include:
Kalle Sirkesalo takes a walk and shares his journey with metrics, from discovering how easily bonus-tied metrics get gamed, to learning how to set effective metrics that influence desired behaviours.
Graziano Casto shares his view against blindly adopting frameworks like DORA in favour of tailoring metrics to outcomes specific to your business.
"What gets measured, gets managed" is one of those business quotes that's both painfully obvious and profoundly important. Every platform team I've worked with eventually faces the same question from leadership: "How do we know if this is working?"
It's a fair question. Platform Engineering is balanced precariously at the top of the Gartner hype curve and leadership is understandably keen to see evidence that all their investment hasn’t gone to waste.
But what does “better” even look like?
“Can you say DevOps?”
The first challenge I see teams have is that they often don’t know what to measure, so they begin to research what other companies have done. The most common metrics framework I see organisations adopt is DORA. Developed by the DevOps Research and Assessment Team and popularised by the release of the book Accelerate in 2018, it distills a large body of research into four prescriptive metrics:
Deployment Frequency - frequent, smaller deployments are often less risky and beneficial for business agility.
Lead Time for Changes - how long it takes a code commit to reach production.
Change Failure Rate - the percentage of changes that result in a production failure.
Mean Time To Recover - the time taken to recover from a production failure.
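As a rough illustration (not from the DORA research itself), the four metrics can be computed from plain deployment records. The data shape and field names below are hypothetical; real pipelines would pull this from CI/CD and incident tooling.

```python
from datetime import datetime
from statistics import mean

# Hypothetical deployment records over a one-week window: when the change was
# committed, when it reached production, whether it caused a failure, and when
# any failure was recovered.
deployments = [
    {"committed": datetime(2024, 5, 1, 9, 0), "deployed": datetime(2024, 5, 1, 15, 0),
     "failed": False, "recovered": None},
    {"committed": datetime(2024, 5, 2, 10, 0), "deployed": datetime(2024, 5, 3, 11, 0),
     "failed": True, "recovered": datetime(2024, 5, 3, 13, 30)},
    {"committed": datetime(2024, 5, 6, 8, 0), "deployed": datetime(2024, 5, 6, 12, 0),
     "failed": False, "recovered": None},
]
period_days = 7

# Deployment Frequency: deployments per day over the period.
deploy_frequency = len(deployments) / period_days

# Lead Time for Changes: mean commit-to-production time, in hours.
lead_time_hours = mean(
    (d["deployed"] - d["committed"]).total_seconds() / 3600 for d in deployments
)

# Change Failure Rate: share of deployments that caused a production failure.
failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)

# Mean Time To Recover: mean failure-to-recovery time for failed changes, in hours.
mttr_hours = mean(
    (d["recovered"] - d["deployed"]).total_seconds() / 3600 for d in failures
)
```

Even a sketch like this surfaces the judgement calls involved: what counts as a “failure”, which clock starts lead time, and over what window you aggregate all shape the numbers you report.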
What makes DORA particularly valuable is the balance it strikes between speed (the first two metrics) and stability (the latter two). Contrary to what those in Silicon Valley might tell you, the goal for most companies is to “move faster, without breaking things”.
Companies gravitate towards DORA because it provides a sensible baseline for “performance” in a minimal set of prescriptive metrics that are easy to collect. Contrast this with more nuanced frameworks like SPACE or DX Core 4 that require deeper consideration of what success looks like, and include metrics that are a little more challenging to collect.
The DORA Disconnect
I've lost count of how many organisations I've seen enthusiastically adopt DORA, only to misunderstand its purpose and constraints.
At their core, DORA metrics are a deliberately simplified lens focused on system delivery performance. They don’t account for team context, product complexity, or technical debt. Deploying a change to a website and rolling out new firmware to ten million set-top boxes are very different propositions. Likewise, DORA doesn’t measure human factors such as developer experience or cognitive load. I've met plenty of "high-performing" teams who were miserable, burning out, and planning their exit interviews.
In my experience, organisations make two fundamental mistakes when adopting DORA:
The first is mistaking correlation for causation: the organisations considered “high performing” in the research didn’t get there by focussing on the DORA metrics; they got there by focussing on delivering value to users, and the metrics followed.
The second is ignoring Goodhart’s Law, which taught us many years ago that “when a measure becomes a target, it ceases to be a good measure”. That hasn’t stopped countless leadership teams weaponising DORA as a means to compare performance across teams.
Both relate to the “cargo cult” obsession we have in the IT world with emulating the behaviour of others, irrespective of context. Your company does not share many traits or challenges with Facebook. Netflix took over seven years to migrate to the cloud, and still ran private data centres until 2016. Amazon cut the cost of its video quality monitoring by 90% by moving from a serverless architecture back to a monolith.
Measure What Matters
When I’m asked to consult on the topic of metrics, my first question to the team is “what is the behaviour you want to influence?”
There should be a purpose for metrics, beyond idle curiosity. Metrics should be chosen deliberately to reinforce your platform's goals, not just because they're easy to measure or everyone else is using them. If your goal is to improve developer experience, measure that as directly as possible.
Borrow ideas from frameworks like DORA, but don't copy them wholesale. Take what's useful, discard what isn't, and adapt everything to your specific context. Blend hard metrics with human metrics. Deployment frequency matters, but so does developer satisfaction. The best platform teams I've worked with survey their users regularly and treat that feedback as seriously as they do technical metrics.
Start small and iterate. Begin with just 2-3 metrics that directly relate to your most pressing concerns. Add more only when you've established a healthy relationship with the initial set.
Be transparent about what you're measuring, why you're measuring it, and what the results mean. When teams understand the purpose behind metrics, they're less likely to game them and more likely to use them constructively. Equally though, understand that all metrics eventually become a target of sorts, explicitly or subliminally; it’s one of the reasons why simply measuring something regularly can help you manage it better.
Beyond The Numbers
The most successful platform teams I've worked with recognise that metrics are tools, not outcomes. They use data to inform decisions, not dictate them. They balance quantitative measures with qualitative feedback.
A platform's success isn't found in a dashboard or spreadsheet; it's found in the experiences of the people who use it every day. Measure what matters, and create the space for the kind of success that numbers alone can't capture.