I've Used AI. That's Why I'm Not Excited About This.
What happens when the demo ends and the real codebase begins.
My company just went all in on agentic AI workflows. Full automation: code, tests, the whole pipeline. The Slack channels lit up immediately. Pure excitement, people sharing demos, optimism everywhere you looked.
I get it. The first time you watch it work, it genuinely is impressive as hell.
But here’s the thing: I’ve been using AI as a coding assistant on my own projects for a while now. I wrote about building a real iOS app with it here, so I won’t rehash all of that. The short version: it’s useful, it’s not perfect, and working through the imperfections taught me a lot about what these tools actually are versus what people think they are. So while everyone else was marveling at the demo, I was sitting there thinking, “we haven’t found the sharp edges yet.”
This isn’t a hit piece on AI. I think it’s genuinely useful. But there’s a big difference between useful and ready. And right now, I’m not convinced everyone in that Slack channel knows what they don’t know yet.
From Side Projects to the Main Stage
There’s a big difference between experimenting with AI on a side project and dropping it into a large, complex, legacy fintech system with real customers on the other end.
On side work, the stakes are low. If something goes sideways, you fix it, you learn, you move on. But I work on a loan origination and servicing system. We’re talking about a codebase that’s been around long enough to have its own mythology. Every piece of logic in there exists for a reason, sometimes a reason nobody fully remembers anymore. That context matters. The domain knowledge matters. The history of why things were built the way they were matters.
Most of my colleagues are brand new to this space. They haven’t had the chance to work through the rough edges on lower-stakes stuff first. They’re jumping straight from zero to full agentic, and the only frame of reference they have is a demo where everything worked perfectly. That’s a dangerous place to start forming expectations.
The Slack Channel Problem
Here’s what happens in every Slack channel when a shiny new thing lands: the optimists are loud, and the skeptics are quiet.
Nobody wants to be the person who kills the vibe. So the channel fills up with “🔥🔥🔥” reactions and screenshots of AI generating code in seconds, and anyone with a more measured take keeps it to themselves. Over time, that silence starts to look like consensus. And before you know it, you’re treating a tool that’s been in the building for six weeks like it’s a solved problem.
I’m not saying the enthusiasm is wrong. Excitement about new technology is a good thing. It drives adoption, it drives experimentation, it drives progress. But excitement without skepticism is how you end up in a mess you didn’t see coming. The people who’ve actually put time into these tools know they can hallucinate, stray from the task, produce code that looks right but isn’t, and confidently walk you in the wrong direction. That knowledge changes how you work with them. If you skip that learning curve, you’re setting yourself up for a nasty surprise.
We’re All Orchestrators Now
One of the things I keep hearing is that developers won’t be replaced; we’ll just become orchestrators. We’ll manage the AI, review the output, steer the direction. Sounds reasonable on the surface.
But think about what that actually means over time.
The less code you write, the harder it becomes to evaluate the code being written for you. It’s a feedback loop that works against you. Right now, an experienced developer can look at AI-generated code and spot the problems. The awkward logic, the edge cases it missed, the place where it technically compiles but violates how the rest of the system is supposed to work. That skill exists because of years of writing code, breaking things, and fixing them.
If we hand off the writing entirely, that skill atrophies. And when it does, who’s actually reviewing the output? Another automated process? At some point “human in the loop” becomes a formality, not a safeguard. In a fintech system where a bad calculation can have real consequences for real people, that’s not a theoretical concern, it’s a very practical one.
The LOC Flex That Isn’t
Let’s talk about metrics for a second, because this one drives me insane.
You’ve seen the posts. “AI generated 10,000 lines of code in 30 seconds.” It’s always framed like that’s obviously a good thing, like raw volume is the goal. Spoiler: it isn’t. It never was.
Lines of code has always been a garbage metric. It doesn’t tell you anything meaningful about productivity, quality, or value delivered. The question isn’t whether AI can write 10,000 lines in 30 seconds. The question is whether you needed 10,000 lines to solve the problem. The question is whether those 10,000 lines actually do what the business needs them to do. The question is whether you’d want to maintain that codebase in two years.
Bragging about volume is the AI equivalent of a junior dev who’s proud of a 500-line function. More code is not better code. Coding has always been a means to an end. The real value is in the decision making, the architecture, the problem solving. Syntax is the easy part. It’s always been the easy part. AI being good at the easy part isn’t the revolution it’s being sold as.
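To make the point concrete, here’s a tiny illustrative sketch (the function names and the 680 credit-score threshold are made up for the example): two implementations with identical behavior, where counting lines would score the worse one as “more productive.”

```python
# Hypothetical eligibility check, written the way LOC metrics reward:
# lots of lines, zero extra value.
def is_eligible_verbose(score: int) -> bool:
    if score >= 680:
        result = True
    else:
        result = False
    if result == True:  # redundant re-check of a boolean
        return True
    else:
        return False


# The same business rule, stated once.
def is_eligible(score: int) -> bool:
    return score >= 680


# Identical behavior across the whole input range.
assert all(is_eligible_verbose(s) == is_eligible(s) for s in range(300, 901))
```

By the volume metric, the first version is ten times the output. By every metric that matters, maintainability, readability, surface area for bugs, it’s strictly worse.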
Tests That Pass Aren’t Always Tests That Matter
This one keeps me up at night more than anything else.
AI is pretty good at generating tests. The problem is it’s also pretty good at generating tests that pass without actually validating anything meaningful. I’ve seen it produce tests that check whether a function runs without throwing an error and call it coverage. Technically green. Completely useless.
In most systems, that’s annoying. In a fintech system, it’s a genuine liability. If your tests aren’t asserting the right things, if they’re not tied to actual business rules, you’re not testing anything. You have a false sense of safety. If an AI produces bad tests, what’s to stop it from subtly adjusting logic elsewhere to make those bad tests pass? It’s optimizing for green, not for correct. Those aren’t always the same thing.
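Here’s a minimal illustration of the gap (the function and numbers are hypothetical, not from any real system): a test that stays green no matter what the code does, next to one that actually pins down a business rule.

```python
def monthly_interest(balance: float, annual_rate: float) -> float:
    """Simple monthly interest on an outstanding balance."""
    return balance * annual_rate / 12


def test_runs_without_error():
    # Technically green, validates nothing: any implementation that
    # doesn't crash passes, including one that always returns 0.
    monthly_interest(10_000.0, 0.06)


def test_known_business_case():
    # Meaningful: pins the output to a value derived from the actual
    # rule (6% annual on a $10,000 balance -> $50 per month).
    assert abs(monthly_interest(10_000.0, 0.06) - 50.0) < 1e-9
```

Both tests show up as passing in CI. Only the second one would catch the AI quietly “fixing” the math to make its own tests go green.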
This is where human judgment can’t be replaced with another automated step. Someone who understands the domain has to look at the assertions and ask whether they actually reflect reality. That requires knowing the business rules. It requires caring about what the code is supposed to do, not just whether it compiles.
Where I Actually Land
I’m not out here saying AI is useless. I use it. It speeds things up. It’s a legitimately good assistant when you keep it focused and stay in the driver’s seat.
But there’s a canyon between “useful assistant” and “autonomous agent running your production pipeline,” and right now we’re treating that canyon like a curb. The tooling is early. The integration patterns are still being figured out. Most teams haven’t had enough time with these tools to develop the instincts for when to trust them and when to push back.
My honest take: we need to slow down, let the tools mature, and build experience before we hand over the keys. Let people get comfortable with AI as an assistant first. Let them find the edges, make the mistakes in lower-stakes environments, and develop judgment. Then talk about agentic workflows.
Maybe I’m wrong. This stuff is moving fast, and I’m planning to keep writing about it as I get deeper into the process. I’m genuinely open to being surprised here.
But “open to being surprised” and “uncritically buying the hype” aren’t the same thing; one of them keeps you out of trouble.
So I want to hear where you’re at with this. Have you been pushing AI into production workflows, or are you still in the assistant phase? And if you’ve gone deeper into agentic territory, what have you learned that the demos didn’t show you?
Drop it in the comments. The more real-world experience we can put in the same room, the better off we all are.
👉 If you enjoy reading this post, feel free to share it with friends!
Or feel free to click the ❤️ button on this post so more people can discover it on Substack 🙏
You can find me on X and Instagram.
Also, I just launched a new YouTube channel: Code & Composure