Hand the same AI coding assistant to two engineers and you get opposite results. The weaker one ships low-quality work faster than before. The stronger one ships high-quality work faster than they ever could alone. Same model, same prompt box, same autocomplete. Opposite outcomes. The tool did not raise or lower anybody's standard. It moved each of them further along a trajectory they were already on, and that trajectory was set long before the tool showed up.

This is the part the discourse keeps getting wrong. AI tools amplify the engineer using them. They do not set the quality bar. The person does. A model will hand you something plausible every single time you ask, on demand, in seconds. Whether that plausible thing is actually correct is a separate question, and answering it is still entirely human work. The speed changed. The standard did not. The standard is whatever the person at the keyboard can recognize as good, and no model has figured out how to install that.

Which is why most of the energy going into learning the tool is aimed at the wrong target. Prompt phrasing, the new model's parameters, the keyboard shortcuts for the assistant, all of it has a shelf life measured in months. The model changes, the interface changes, and the trick that worked last quarter is baked into the default behavior this quarter. What does not expire is the ability to look at a block of generated code and know, quickly and correctly, whether it belongs in the system. That skill took years to build and it transfers to every tool I will ever touch. Optimizing for prompt-craft over judgment means polishing the thing that depreciates and neglecting the thing that compounds.

The failure mode this creates at the low end is genuinely new, and it is not that beginners write bad code. Beginners always wrote bad code. The new thing is that they now write plausible bad code at a rate no reviewer can keep up with. Generated output looks finished. It has the shape of correct code, the right idioms, confident names, a tidy docstring. It compiles. It passes the happy-path case. And it is wrong in a way that takes real expertise and real time to see, buried under three functions that all look reasonable. When the person producing it cannot tell, and the volume is ten times what it used to be, the wrongness ships. The bottleneck moves from writing to reviewing, and reviewing is the harder skill the tool does nothing to give you.
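
A small, invented illustration of the shape this takes, as a minimal Python sketch. The function names and the batching task are hypothetical, chosen only to show code that reads as finished and passes the obvious test while being wrong on an input nobody mentioned.

```python
def split_into_batches(items, batch_size):
    """Split items into consecutive batches of batch_size."""
    # Looks reasonable, but floor division drops a trailing partial batch.
    batch_count = len(items) // batch_size
    return [
        items[i * batch_size:(i + 1) * batch_size]
        for i in range(batch_count)
    ]


def split_into_batches_fixed(items, batch_size):
    """Same contract, but the last batch may be shorter than batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Step through the list by start index so the remainder is kept.
    return [
        items[start:start + batch_size]
        for start in range(0, len(items), batch_size)
    ]


# Happy path: the length divides evenly, so both versions agree.
even = list(range(6))
assert split_into_batches(even, 2) == split_into_batches_fixed(even, 2)

# Edge case: seven items in batches of three.
odd = list(range(7))
lost = len(odd) - sum(len(batch) for batch in split_into_batches(odd, 3))
kept = sum(len(batch) for batch in split_into_batches_fixed(odd, 3))
print(lost, kept)  # prints "1 7": the first version silently loses one item
```

Nothing in the first version fails loudly. A test on an evenly divisible list passes, and the missing item only shows up somewhere downstream, which is why catching it depends on the reviewer asking about the remainder.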

When I review AI output, judgment is not a vague sense of taste. It is concrete and a little tedious. The first question is whether the approach is the one I would have chosen or merely one that happens to work, because those are not the same and the model only guarantees the second. Then there is the edge case it skipped, usually because I never mentioned it, since the model answers the question I asked rather than the one I should have asked. Somewhere in the diff it has often reached for a dependency I do not want, solved a more general problem than I actually have, or quietly changed behavior in a place I was not looking. Underneath all of it sits the only question that really matters, which is whether I understand every line well enough to defend it. The moment I ship something I cannot explain, I have outsourced my own competence to a system that has none of its own.

People assumed these tools would flatten the field, that the floor would rise and the gap between strong and weak engineers would close. The opposite is happening. A strong engineer points the tool at judgment they already have, so the output gets faster and stays high. A weak engineer uses it to produce output they cannot evaluate, so the work gets faster and the errors compound. The same amplification applied to more skill and to less skill drives the two further apart. The tool rewards what you already are and it rewards it at scale, which means the returns on actually being good went up, not down.

I will say the uncomfortable part plainly. These tools have made me faster. They have not made me a better engineer. The judgment I use to catch a subtly wrong result is the same judgment I had before, sharpened by the same thing that always sharpened it, which is doing the work, being wrong, and noticing. The model did not give me that and it cannot. So the investment that still pays off is the one that always paid off. Get better at the actual craft. Read the hard code, understand the system a layer deeper, learn why the wrong answer is wrong. The tool will multiply whatever you bring to it, and that puts the whole question back where it started, on you.