I Didn’t Leave ChatGPT Because It Was Bad. I Left Because of How I Think About Tools.

It happened on a Tuesday night, somewhere around midnight, when I was elbow-deep in HookHouse-Pro trying to get a particularly stubborn piece of logic working in TypeScript. I’d been going back and forth with ChatGPT for about forty minutes. The answers weren’t wrong, exactly. They just kept missing the shape of what I was building. I’d explain the context, it would acknowledge the context, and then respond as if the context didn’t exist. On the fifth iteration of that loop, I switched tabs, opened Claude, pasted the same problem, and got a response that actually engaged with the architecture I was describing instead of handing me a generic solution wrapped in confident language.

That wasn’t the moment I switched. But it’s the moment I started paying attention differently.

Here’s the thing about tools: most of us evaluate them wrong. We compare feature lists, benchmark scores, and price tiers. What we don’t do nearly enough is examine the cost of friction. Not price, friction. How many times did you have to re-explain yourself? How often did you get an answer that was technically correct but practically useless? How much of your energy went into managing the conversation instead of solving the actual problem?

I’m a self-taught IT guy with 28 years of breaking things until they work. Most of what I know came from experimenting, making mistakes, and reverse-engineering the wreckage. That background makes me instinctively allergic to tools that require me to constantly babysit the output. A good tool should reduce the cognitive load on the job at hand. When a tool adds to that load, something’s wrong.

ChatGPT isn’t bad software. That needs to be said plainly. For a lot of use cases, it’s genuinely excellent, and the pace at which OpenAI has shipped improvements over the past couple of years is impressive. But the way it handles long, complex, multi-layered conversations started to feel like talking to someone who was very eager to help but kept forgetting what we were building. Context drift is the technical term. The practical experience is more like trying to carry a piece of plywood through a narrow hallway with someone who keeps letting go of their end.

Claude holds the hallway. That’s the simplest way I can put it. When I’m working through something intricate, like a multi-file React component structure or a music generation prompt system with several interdependent parameters, Claude stays inside the problem with me. It doesn’t just answer the last question I asked. It remembers what the question was for.

But here’s where I want to be honest about the other side: Claude isn’t the answer to everything. For quick, narrow tasks, ChatGPT is faster and often less verbose. If I need a quick SQL query or a boilerplate function I can modify in thirty seconds, I don’t need a tool that’s going to think at length about the architectural implications. The depth that makes Claude valuable in complex work can actually slow you down when you just need a fast answer and don’t care about the reasoning behind it.

So what does this actually say about how I evaluate tools? It says context window and context retention are two different things, and I spent too long confusing them. It says I was evaluating AI assistants on their best-case performance instead of their behavior when the work gets genuinely complicated. That’s backwards. A tool’s ceiling matters less than its floor. How does it behave when you’re tired, the problem is messy, and you need something that stays locked in?

For the kind of work I do, whether that’s building real apps in my homelab or wrestling with Exchange edge cases at work, the messy situation is the normal situation. I need tools built for the floor, not the ceiling.

That’s what changed my mind. Not a benchmark. Not a feature. The Tuesday night when the work was hard and one tool stayed in the room with me.
