Just How Good Is AI At Writing Code?
There is a huge amount of hype surrounding AI and the possibilities it offers, so it’s natural to ask: just how good is AI, really?
Unfortunately for me, I’m old enough to remember the dot-com era. I worked for a company whose name ended in “.com” — an extension that was quietly dropped immediately after the crash. History, as they say, is a useful guide to the future, so it’s reasonable to ask whether we should expect an AI crash at some point.
The answer is almost certainly yes.
The hype today is every bit as frenzied and overblown as it was in the late 1990s. But, just as importantly, we should also remember what happened after the dot-com crash. The internet didn’t disappear. Instead, it reshaped almost everything we do. Today, without the internet and e-commerce, everyday life would be almost impossible.
AI will likely follow the same path. There will be a correction — a return to more realistic expectations — but AI is undeniably here to stay. The real challenge is understanding where it adds value, and where it doesn’t.
From Handcrafted Code to AI-Accelerated Development
Even just 18 months ago, we looked like a typical software development company: carefully handcrafting code to meet the needs of our customers. Anyone in that business knows how challenging and risky that can be. Building software is nothing like building a house. Every project is different, inherently complex, and full of hidden traps waiting to erode your margins.
Yes, we built reusable components, and that helped — but the underlying risk never went away.
Fast forward to 2026 and the landscape looks completely different. Tools like Claude from Anthropic have advanced so rapidly that entire solutions which previously took weeks or months can now be built in hours. On first encounter, it’s tempting to conclude that developers are no longer required at all.
That conclusion would be a mistake.
It’s true that you may not need as many developers, but it’s critical to understand where human expertise still adds enormous value — and what the hidden dangers are of generating production systems from a single AI prompt. Many businesses are about to discover the consequences of getting this wrong.
A Lesson from End-User Computing
Once again, history provides a useful parallel.
One of the earliest forms of “end-user computing” was self-service reporting; a modern example is Power BI. These tools were genuinely transformative, freeing businesses from constant dependence on IT teams to produce reports.
For a small number of reports, this works brilliantly.
But as the number of reports grows, design and architecture — or the lack of them — start to matter a great deal. We’ve won many projects cleaning up fragile, tangled reporting solutions left behind when a single individual exited the business, taking all the knowledge with them and leaving a support nightmare in their wake.
Today, no sensible large organisation would build reports directly on top of an operational or transactional database. Some form of abstraction — a data warehouse or data mart — is considered basic good practice. Without it, the solution becomes too complex, fragile, and risky.
Low-Code: A Familiar Pattern
Low-code tools followed a similar trajectory.
They weren’t bad tools — quite the opposite. We built our own internal low-code platform to deliver real business functionality faster. Used correctly, they can be incredibly powerful.
But in the wrong hands, they are dangerous.
I’ve seen low-code implementations that completely ignored rollback strategies, security, testing, quality assurance, failure recovery, and scalability. These tools work well for small, contained solutions, but without proper software engineering discipline they quickly deteriorate into expensive maintenance and support problems.
So the obvious question becomes: what is the equivalent best practice for AI-driven software development?
The Real Problem: AI Is Too Good
Ironically, the problem with AI code generation is that it’s too good.
Watching AI generate code can feel like watching a news anchor effortlessly deliver expert commentary — until you remember they’re reading from a teleprompter. Once you understand that, the illusion fades.
The same is true of AI-generated code. It’s easy to be so impressed by what it produces that we mistake fluency for understanding and start attributing human judgment and intent where none exists. At its core, AI is the most powerful “next-token predictor” ever built.
That doesn’t diminish how remarkable it is — but it does mean we need to understand where to trust it, and where not to.
AI Has the Same Frailties As We Do
AI is trained on vast amounts of human-generated content and refined through human feedback. Unsurprisingly, it inherits many of the same weaknesses we have.
In that sense, it behaves a lot like the human subconscious. When you have deep experience, you can usually trust your intuition to pattern-match good solutions. When you don’t, you fall back on bias, overconfidence, and imagination.
AI does exactly the same thing.
Ask it to write code in a language with 30 years of history and it will often do an excellent job. Ask it to work in a very new or poorly documented ecosystem and it has little choice but to hallucinate. I’ve generated entire solutions from a single prompt that looked astonishingly good on the surface — until I dug into the details and found subtle, deeply buried mistakes that would have caused serious problems later on.
Single-shot code generation is fantastic for rapid prototyping. But if you want software that lasts, it’s rarely the right approach.
Early in my own career, I wrote large programs as single monolithic blocks. Over time, I learned about abstraction and reusability using classes, service layers, separation of concerns, and the many practices that allow software to evolve without collapsing under its own weight.
So What Does This Mean for AI-Written Code?
It means that analysis, architecture, and design are as important as they’ve ever been.
The difference is that the building blocks are changing. Sub-agents begin to resemble the new “classes,” and context engineering starts to replace traditional software engineering techniques.
AI hasn’t removed the need for thoughtful design — it has amplified the cost of getting it wrong.
In a future article, we’ll explore what modern best practice for context engineering looks like in this new world, and how teams can use AI safely, sustainably, and to maximum advantage.