You point agents at your repo. They read every file. Not skimming. Reading. The full dependency graph, the test suite, the auth layer, the dead corners nobody’s opened in months.
What they find is never a surprise. It’s the stuff you already know about.
The shape of first contact
Every codebase has a version of the same problems. The specifics change. The shape doesn’t.
The orphans. A feature got cut. The route is gone. But the model still exists. So does the helper that formatted its output, the test fixture that seeded its data, and the migration that created its table. Nobody deleted them because nobody remembered they were connected. Agents trace the full graph. They see what’s connected to nothing.
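One way to picture "connected to nothing" is a reachability walk over a reference graph. The sketch below is a minimal illustration, not the agents' actual method: the module names, the `references` mapping, and the `find_orphans` helper are all hypothetical.

```python
from collections import deque


def find_orphans(references, entry_points):
    """Return every node that no entry point can reach by following references.

    `references` maps a module name to the modules it uses (hypothetical data).
    """
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        for target in references.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)

    # Collect every node mentioned anywhere, as a source or as a target.
    all_nodes = set(references)
    for targets in references.values():
        all_nodes.update(targets)
    return sorted(all_nodes - seen)


# Hypothetical graph: the coupon route was cut, but its model, helper,
# and fixture still reference each other.
references = {
    "routes/orders": ["models/order"],
    "models/order": [],
    "models/coupon": [],
    "helpers/format_coupon": ["models/coupon"],
    "fixtures/coupon_seed": ["models/coupon"],
}

orphans = find_orphans(references, entry_points=["routes/orders"])
print(orphans)
```

Reachability from live entry points matters here. The coupon model still has inbound references from its helper and fixture, so a plain "unused symbol" check would miss it. Only the full walk shows that the whole cluster hangs off nothing.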
The drift. Two parts of the system decide the same thing independently. They start identical. They evolve separately. They diverge. It happens faster than anyone tracks.
In our own codebase, spawn outcomes are classified in two places. The API classifies when a user requests the result. The background daemon classifies when the spawn completes. They started as the same logic. Months later, the API correctly flagged auth failures as errors while the daemon kept marking them complete. The success rate looked healthy. The data disagreed with reality.
It wasn’t in the backlog. It didn’t look like a bug. It looked like two pieces of code that happened to do similar things. We saw the same drift when we ran agents on an external codebase. Different stack, identical pattern.
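A common fix for this kind of drift is to make both callers share one classifier, so the rule can only change in one place. The sketch below assumes that shape. `SpawnResult`, `classify_spawn`, and the two caller functions are hypothetical names for illustration, not the code described above.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    COMPLETE = "complete"
    ERROR = "error"


@dataclass
class SpawnResult:
    # Hypothetical fields; a real result would carry more detail.
    exit_code: int
    auth_failed: bool = False


def classify_spawn(result: SpawnResult) -> Outcome:
    """Single source of truth for what counts as a failed spawn."""
    if result.auth_failed or result.exit_code != 0:
        return Outcome.ERROR
    return Outcome.COMPLETE


def api_get_outcome(result: SpawnResult) -> Outcome:
    # API path: classify when a user requests the result.
    return classify_spawn(result)


def daemon_on_complete(result: SpawnResult) -> Outcome:
    # Daemon path: classify when the spawn finishes.
    return classify_spawn(result)


auth_failure = SpawnResult(exit_code=0, auth_failed=True)
print(api_get_outcome(auth_failure), daemon_on_complete(auth_failure))
```

With one function behind both paths, an auth failure cannot be an error in the API and a success in the daemon. A parity test that feeds the same results through both callers would catch any future fork.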
The illusion. A test file with dozens of assertions. Good coverage numbers. But look closer: half the tests only check that a function returns without throwing. No boundary inputs. No edge cases. The suite is a green badge, not a safety net. By the end of the session, the boundary cases are written.
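The gap between a green badge and a safety net is easy to show on a small function. The example below is a hedged sketch: `paginate` is a hypothetical function invented for illustration, and the tests use plain assertions so the block runs without a test framework.

```python
def paginate(items, page, size):
    """Return the 1-indexed `page` of `items`, `size` items per page."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    start = (page - 1) * size
    return items[start:start + size]


def test_smoke():
    # The "illusion" style: passes as long as nothing raises.
    paginate([1, 2, 3], 1, 2)


def test_boundaries():
    # Empty input.
    assert paginate([], 1, 10) == []
    # Last, partially filled page.
    assert paginate([1, 2, 3], 2, 2) == [3]
    # One page past the end.
    assert paginate([1, 2, 3], 3, 2) == []
    # Invalid page and size must be rejected, not silently accepted.
    for page, size in [(0, 2), (1, 0)]:
        try:
            paginate([1, 2, 3], page, size)
        except ValueError:
            pass
        else:
            raise AssertionError("expected ValueError")


test_smoke()
test_boundaries()
```

Both tests execute the same lines, so they earn the same coverage number. Only the second one fails if the slice arithmetic or the validation breaks.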
The shortcuts. The API route was supposed to call a service. A deadline meant it called the database directly. Then a second route copied the pattern. Then a third. Now what should be a simple feature change means tracing state through four layers. Agents read the dependency graph in one pass. By the time you’d open the third file, they’re already in the fourth.
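The layering the route was supposed to follow can be sketched in a few lines. Everything here is hypothetical (`OrderRepository`, `OrderService`, `cancel_order_route`, and the in-memory data). It is one plausible shape for the refactor, not the one any particular codebase uses.

```python
class OrderRepository:
    """Only this layer touches storage (an in-memory dict stands in for the database)."""

    def __init__(self):
        self._rows = {1: {"id": 1, "status": "open"}}

    def get(self, order_id):
        return self._rows.get(order_id)

    def save(self, row):
        self._rows[row["id"]] = row


class OrderService:
    """Business rules live here, so every route shares them."""

    def __init__(self, repo):
        self._repo = repo

    def cancel(self, order_id):
        order = self._repo.get(order_id)
        if order is None:
            raise LookupError(f"order {order_id} not found")
        if order["status"] == "shipped":
            raise ValueError("shipped orders cannot be cancelled")
        updated = {**order, "status": "cancelled"}
        self._repo.save(updated)
        return updated


def cancel_order_route(service, order_id):
    # The route only translates between HTTP-style responses and the service.
    try:
        return 200, service.cancel(order_id)
    except LookupError:
        return 404, {"error": "not found"}
    except ValueError as exc:
        return 409, {"error": str(exc)}


service = OrderService(OrderRepository())
print(cancel_order_route(service, 1))
print(cancel_order_route(service, 2))
```

When the rule about shipped orders changes, it changes in `OrderService.cancel` and nowhere else. A route that queries the database directly would need its own copy of that rule, which is how the second and third shortcuts multiply.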
A consultant gives you a report. You read it, prioritize it, schedule the fixes, and start the work in two weeks if you’re disciplined.
Agents don’t produce reports. They produce diffs.
By morning, the orphaned code is gone. The drifted classification logic is consolidated. The test gaps have cases. The layering violation has a refactor in progress. The reading is the first hour of the fix.
That’s what “first session” means. Not a scan. Not an assessment. Work.