At this point, it's clear that AI isn't going anywhere. Like a runaway diesel engine, the hype machine is still running at full throttle, even as we all choke on the black soot of disillusionment and a creeping fear that we can no longer trust what we see and hear.
Earlier this year, I wrote about my concern that GenAI is a bubble that's about to burst. As with the dot-com bubble before it, though, that doesn't mean the concept isn't valuable; it just means investors have all got a little too excited.
But behind all the hype and fear-mongering, the simple truth is that large language models have proven to be an effective force multiplier in several use cases, and they'll have a long-lasting impact on how we interact with technology.
In this article, I want to give practical advice on where it makes sense to deploy AI in your Internal Developer Platform for 2025.
My Robot Army: More Roomba than Terminator
I bought my first robotic vacuum cleaner in 2020, and it quickly became one of my favourite appliances … if you can have such a thing. With three kids and a busy household, “Jenkins” spins into action every night to help keep our house clean.
I also own robotic lawnmowers, and they do a phenomenal job, but I love them a little less. You see, I actually quite enjoy the catharsis of mowing the lawn! But it is time-consuming, and I recognise that my little robot army lets me crack on with other jobs - albeit ones that I don't enjoy as much.
There are many parts of my life where I don’t want any help, though. Amongst other things, I don’t want robots to walk my dog, drive my car, or entertain my kids.
Yet, when I speak with IT leaders about artificial intelligence in the workplace, all I seem to get asked is which LLM is capable of producing the best code. The problem is that writing good software is a beautiful symphony of interpretation, logic, creativity, and iteration. You might get something workable by guessing the next word from a catalogue of all the code snippets ever published, but it will never be great.
I’m also pretty sure developers enjoy writing code - it’s why they trained to become developers. Sure, they might appreciate some help with it, but I know there are a bunch of other jobs they don’t enjoy that we should maybe automate first!
So, my first and most important lesson is that you should look at AI as a tool. We should use tools to help us spend less time doing the things we don’t enjoy, or that we’re not very good at.
Example #1: Why Did My Build Fail?
Ever scrolled through thousands of lines of verbose CI pipeline output, trying to figure out why a build failed? You lean into the screen, quickly scanning the text for anything that doesn't look right. Eventually you find an error message, but it's not very helpful, so you spend the next ten minutes trying to work out what went wrong.
The most annoying thing about build failures is that they're usually caused by simple, silly mistakes: a missing semicolon in a Dockerfile; an extra space in a YAML file; or just a simple typo. Of course, just because they're easy to fix doesn't make them easy to find!
This is exactly the kind of task where AI shines. It can scan through verbose logs in milliseconds and spot anomalies in well-structured text with ease. My colleagues Rutvik and Michael gave a great example of GitLab's Root Cause Analysis feature, which spares developers and engineers all that wasted time. In many of the organisations I work with, this one feature alone demonstrated the ROI of adding AI features to their subscription.
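To make that concrete, here's a minimal sketch of the pattern in Python. To be clear, this is not GitLab's implementation - just an illustration using the OpenAI client, where the model name, prompt wording, and log file path are all assumptions of mine:

```python
# A minimal sketch of the pattern, not GitLab's implementation: feed the
# tail of a failing job's log to an LLM and ask for a likely root cause.
# Assumes the OpenAI Python client with an OPENAI_API_KEY in the
# environment; the model name, prompt, and log path are illustrative.
from openai import OpenAI

client = OpenAI()

def summarise_build_failure(log_text: str) -> str:
    # CI logs can run to thousands of lines; the error is usually near the end.
    tail = "\n".join(log_text.splitlines()[-300:])
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a CI assistant. Identify the most likely "
                        "root cause of this failed build and suggest a fix."},
            {"role": "user", "content": tail},
        ],
    )
    return response.choices[0].message.content

with open("job.log") as f:  # hypothetical path to a failed job's log
    print(summarise_build_failure(f.read()))
```

The plumbing - truncating the log, wording the prompt, surfacing the answer next to the failed job - is exactly the kind of thing a platform team can build once for every team.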
Example #2: Nobody Likes Writing Unit Tests
Developers may love writing code, but I'm yet to meet anyone who enjoys writing unit tests! You are writing unit tests for all your code, right … ?
A unit test is a small piece of code that checks whether a specific part of your program, like a function or method, works correctly in isolation. This not only saves you time in the long run by catching bugs sooner, but also helps you understand and verify the logic of your code, improving overall quality and reliability.
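If you haven't seen one, here's what a few unit tests look like in Python with pytest - the apply_discount function and its rules are invented purely for illustration:

```python
# A hypothetical function and some unit tests for it, written with pytest.
import pytest

def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount_basic():
    assert apply_discount(100.0, 25) == 75.0

def test_no_discount_leaves_price_unchanged():
    assert apply_discount(19.99, 0) == 19.99

def test_invalid_percent_is_rejected():
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```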
Much like summarising a long report, Large Language Models (LLMs) are particularly adept at writing unit tests. They can quickly understand the structure and purpose of existing code, which enables them to create relevant tests without needing to write new logic from scratch. The test cases themselves are simple by nature and follow a consistent style.
Humans generally find simple, repetitive tasks like this very boring. We either avoid doing the task or don't do it very well. In a recent article, GitLab gave a sneak peek of how they use GitLab Duo Chat internally to write unit tests.
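Stripped down, the technique amounts to something like the sketch below: hand the model the source of a function and ask it to draft tests. This is a generic illustration, not how GitLab Duo works internally; the model name, prompt, and slugify example function are all hypothetical:

```python
# A generic sketch of asking an LLM to draft unit tests for existing code.
# Not how GitLab Duo works internally; the model, prompt, and example
# function are all assumptions made for illustration.
import inspect
from openai import OpenAI

client = OpenAI()

def slugify(title: str) -> str:
    """Hypothetical function we want tests for."""
    return "-".join(title.lower().split())

def draft_tests_for(func) -> str:
    source = inspect.getsource(func)  # hand the model the actual source code
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Write pytest unit tests for the following Python "
                        "function. Cover normal cases and edge cases."},
            {"role": "user", "content": source},
        ],
    )
    # The result is a draft for a human to review, not something to merge blindly.
    return response.choices[0].message.content

print(draft_tests_for(slugify))
```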
Example #3: Code Reviews Without The Bottleneck
Another area that often lacks focus is code review. Reading someone else's code just isn't as mentally engaging as writing your own, and we've already spoken about how bad humans are at spotting small mistakes. A good code review needs someone very experienced who is familiar with the part of the codebase that's being changed. Those individuals are often in short supply, and they generally have a long list of other things to do.
Retrieval Augmented Generation (RAG) is a technique that allows AI to pull in relevant context from your existing documentation when performing tasks. In this case, you might give the LLM visibility of your company policies, best practices, naming standards, or other project-specific documentation before asking it to review code.
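Here's a deliberately minimal sketch of that idea, assuming the OpenAI client for both embeddings and chat, with invented guideline snippets. A real setup would precompute the embeddings and store them in a vector database rather than recomputing them per review:

```python
# A minimal RAG sketch: embed internal guidelines, pull the most relevant
# ones into the prompt, then ask for a review. The guideline snippets,
# model names, and prompt wording are all illustrative assumptions.
from openai import OpenAI

client = OpenAI()

GUIDELINES = [
    "All public functions must have type hints and a docstring.",
    "Database access goes through the repository layer, never raw SQL in views.",
    "Feature flags must default to 'off' in production configuration.",
]

def embed(texts: list[str]) -> list[list[float]]:
    result = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in result.data]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

def review(diff: str, top_k: int = 2) -> str:
    doc_vectors = embed(GUIDELINES)  # in practice, precompute and store these
    query_vector = embed([diff])[0]
    ranked = sorted(zip(GUIDELINES, doc_vectors),
                    key=lambda pair: cosine(query_vector, pair[1]),
                    reverse=True)
    context = "\n".join(text for text, _ in ranked[:top_k])
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": f"Review this diff against our standards:\n{context}"},
            {"role": "user", "content": diff},
        ],
    )
    return response.choices[0].message.content
```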
I don’t think AI can provide all the valuable insights a junior developer might get from a human review, but it seems obvious that it could provide a “first pass” to help reduce the workload on senior developers.
Good Practice as a Foundation for AI
So before you get too excited about AI generating your next microservice, it's worth remembering that even the fanciest robot vacuum cleaner won't help if your house is in chaos. The same principle applies to software development - AI tools are most effective when built on top of solid engineering practices.
The examples I've shared aren't about replacing developers or magically creating new applications. Instead, they focus on strengthening the foundational practices that make good software development possible: reliable builds, comprehensive testing, and thorough code reviews.
If your CI/CD pipeline is unreliable, test coverage is patchy, or you lack a robust code review process, then AI code generation is just going to amplify these problems (and more!).
In 2025, the real power of AI in software development isn't about replacing developers - it's about making good development practices more achievable and sustainable. And that is something worth getting excited about! 🥳
I'm interested to hear what you think platform teams should do to support developers using AI. Hosting and brokering models? Handling the necessary security and data-handling requirements? When I think about it, it's largely the same as supporting any other service, but there are probably some AI-specific concerns too.