AI & Automation · 2026-04-08 · 4 min read

AI Code Gen: What Actually Works Today

The hype cycle is over. Now what?

Every CTO I talk to in Romandy has tried Copilot, Cursor, or one of the dozen alternatives. Most have formed opinions. Few have measured anything.

I've been tracking AI code generation tool usage across three teams for the past 18 months. Not vibes. Actual data: PR cycle times, bug rates, onboarding speed, developer satisfaction surveys. Here's what I found.

Where AI code gen actually delivers

Boilerplate and glue code

This is the unsexy win. Writing Terraform modules, REST endpoint scaffolding, data class mappings, test fixtures — the stuff that's well-understood but tedious. AI tools cut this work by 40-60% in our measurements. Not because the output is perfect, but because editing a decent draft is faster than typing from scratch.
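A minimal sketch of the kind of glue code this covers: mapping a raw API payload onto a typed domain object. The names (`UserRecord`, `from_api`) are hypothetical, not from any team's real codebase — the point is that a draft like this is fast to review and fix, not that it's perfect.

```python
from dataclasses import dataclass

@dataclass
class UserRecord:
    user_id: int
    email: str
    display_name: str

def from_api(payload: dict) -> UserRecord:
    """Map a raw API dict onto the typed domain record.

    Mechanical, well-understood, tedious to type — exactly the
    work where editing a decent AI draft beats writing from scratch.
    """
    return UserRecord(
        user_id=int(payload["id"]),
        email=payload["email"].strip().lower(),
        display_name=payload.get("name", "").strip(),
    )
```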

One concrete example: a team migrating 80+ API endpoints from REST to GraphQL resolvers. What we estimated at 3 weeks took 8 days. The AI-generated resolvers needed review and fixes, but the structural transformation was handled well. That's real time back.
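To make the shape of that transformation concrete, here is a hypothetical, library-agnostic before/after — a REST handler reshaped into a GraphQL-style resolver. `fetch`, `order_service`, and the field names are illustrative stand-ins, not the migrated endpoints themselves.

```python
def get_order_rest(order_id: str, service) -> dict:
    """Old REST handler: fetches the order and serializes a fixed shape."""
    order = service.fetch(order_id)
    return {"id": order["id"], "total": order["total"], "items": order["items"]}

def resolve_order(parent, info: dict, *, order_id: str) -> dict:
    """GraphQL-style resolver: same fetch, but field selection is
    driven by the query, so the resolver just returns the domain object.
    `info` here is a plain dict standing in for a framework's context."""
    service = info["context"]["order_service"]
    return service.fetch(order_id)
```

The structural part — one resolver per endpoint, context lookup instead of injected dependencies — is repetitive enough that a generated draft plus human review beats hand-writing all 80+.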

Onboarding acceleration

Junior developers with Copilot-style tools ramp up faster on unfamiliar codebases. They ask the tool "what does this service do?" and get a reasonable summary. They generate test cases to understand behavior. It's not replacing mentorship, but it compresses the "staring at code confused" phase.
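*(no insert here — see edits below)*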

Test generation

Not perfect. But generating the first pass of unit tests — especially for pure functions and data transformations — saves meaningful time. Our teams report spending 30% less time on test writing. The catch: you need developers who know what good tests look like, or you get a false sense of coverage.
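What a "first pass" looks like in practice, on a hypothetical pure function (`normalize_iban` is an illustration, not code from our teams): the happy paths come cheap, and the reviewer's judgment shows up in the cases the draft omits.

```python
def normalize_iban(raw: str) -> str:
    """Strip spaces and uppercase an IBAN-like string (pure function)."""
    return raw.replace(" ", "").upper()

# The kind of first-pass tests a tool drafts quickly: happy paths only.
def test_strips_spaces():
    assert normalize_iban("ch93 0076 2011") == "CH9300762011"

def test_uppercases():
    assert normalize_iban("ch93") == "CH93"

# The case a developer who knows what good tests look like adds —
# skip these and the suite gives a false sense of coverage.
def test_empty_input():
    assert normalize_iban("") == ""
```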

Where it quietly fails

Complex business logic

Anything domain-specific, anything that requires understanding why and not just what — the tools struggle. We build financial software. The AI will happily generate code that looks correct and passes basic tests but handles edge cases wrong in ways that matter. Swiss franc rounding rules. Cross-border tax implications. Regulatory constraints.

The dangerous part: the code looks right. It passes a casual review. This is worse than obviously broken code.
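A concrete illustration of the trap, using Swiss cash rounding (to the nearest 0.05 CHF): a draft that calls `round(amount, 2)` looks correct, passes a casual review, and is wrong for cash totals. The sketch below shows the rule done explicitly — it is an assumption-laden illustration of the failure mode, not our production code.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_chf_cash(amount: Decimal) -> Decimal:
    """Swiss cash rounding: nearest 0.05 CHF, not the nearest cent.

    A plausible AI draft rounds to two decimals (round(x, 2)),
    which passes basic tests but mishandles cash amounts like 2.03.
    """
    step = Decimal("0.05")
    return (amount / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * step
```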

Architecture decisions

AI tools operate at the file and function level. They don't understand your system boundaries, your scaling constraints, or why you chose event sourcing for that one domain. When developers lean on AI for structural decisions, you get locally reasonable code that's globally incoherent.

The hidden review tax

This is the one nobody talks about. AI-generated code still needs review. Often it needs more careful review because the person submitting the PR didn't write every line themselves. We saw PR review times increase by 15-20% in the first six months. Senior engineers were spending more time reading AI-generated code than they saved by not writing it themselves.

We've since adapted — stricter linting, mandatory annotation of AI-generated sections, smaller PRs — but the initial productivity bump was partially eaten by review overhead.

What I tell my teams

Use AI tools like you'd use a junior developer who types fast and never gets tired but has no judgment. Delegate the mechanical work. Never delegate the thinking.

Specific rules we've adopted:

  • AI-generated code gets the same review standard as human code. No exceptions.
  • No AI for security-sensitive code paths. Auth, payment processing, data access controls — written by humans, reviewed by humans.
  • Developers must be able to explain every line they submit, regardless of who or what wrote it.
  • Measure quarterly. Not just "do developers feel productive" but actual cycle times, defect rates, and rework percentages.
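The quarterly measurement in that last rule doesn't need tooling beyond an export of PR and bug-tracker records. A minimal sketch, assuming hypothetical record fields (`opened_at`, `merged_at` as ISO timestamps):

```python
from datetime import datetime
from statistics import median

def median_cycle_hours(prs: list) -> float:
    """Median hours from PR opened to merged; unmerged PRs are skipped."""
    spans = []
    for p in prs:
        if p.get("merged_at"):
            opened = datetime.fromisoformat(p["opened_at"])
            merged = datetime.fromisoformat(p["merged_at"])
            spans.append((merged - opened).total_seconds() / 3600)
    return median(spans)

def defect_rate(bug_count: int, merged_pr_count: int) -> float:
    """Bugs filed per merged PR over the quarter."""
    return bug_count / merged_pr_count if merged_pr_count else 0.0
```

Tracked quarter over quarter, these two numbers are enough to see whether the tools are actually paying for their review tax.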

The cost question

At roughly $20-40/seat/month for most tools, the licensing cost is trivial. The real cost is cultural: developers who stop thinking deeply because the tool provides easy answers. That's not hypothetical. I've seen it. A senior engineer on one team described it as "the autocomplete trap" — you accept suggestions because they're there, not because they're right.

Fight this actively or it will erode your engineering quality slowly enough that you won't notice until it hurts.

The takeaway

AI code generation tools are genuinely useful for mechanical, well-understood coding tasks. They are not a shortcut to shipping faster on hard problems. Treat them as power tools: valuable in trained hands, dangerous in untrained ones. Measure their impact honestly, set clear boundaries, and never mistake faster typing for faster thinking.

Romandy CTO

Join the conversation.

Monthly events for CTOs and tech leaders in Geneva. Always free.