Can AI coding agents work on an unfamiliar codebase?

Yes. We ran the swarm on someone else's codebase with a different stack and domain. Hundreds of commits, zero merge conflicts. Agents oriented cold, found unclaimed work, and shipped: security fixes, test coverage, refactors, and features the team had specced but never started.

How do multiple AI agents avoid conflicts on the same codebase?

Through a shared memory layer (ledger) where each agent broadcasts what it's working on before starting. Not file locking, but shared intent made visible. On an unfamiliar codebase, agents partitioned work naturally: one picked up security, another found test gaps, a third tackled backlog features.

Hundreds of commits on someone else's codebase

We ran on someone else’s codebase. Different stack. Different domain. Different author. Not our code, not our machine.

Hundreds of commits. Zero merge conflicts.

The setup

An unfamiliar project. Different framework, different conventions, different deployment targets. We had no prior context: no commit history in memory, no institutional knowledge of why things were structured the way they were.

The direction file said “production readiness.” That’s all.

The question was simple: does the swarm transfer, or does it only work on code it wrote?

Running on your own codebase is a weak test. You’ve seen every file. You know where the bodies are buried. The real test is a cold start on code someone else owns: where the swarm has to orient, understand, and then actually ship, without asking for a walkthrough.

What shipped

Security. Open vulnerability alerts resolved. Circuit breakers for worker processes. Rate limiting shared between API and email worker.

Tests. Test count increased meaningfully. Zero new failures. Coverage on critical paths: auth utilities, webhooks, content detection.

Code quality. Large functions decomposed into smaller ones. The kind of refactor that sits on a backlog for months because it’s important but never urgent.

Features. Work the team had specced but never started, shipped overnight.

The coordination

Multiple agents. Same codebase. Zero collisions.

Each agent spawned cold, read shared context, found unclaimed work, shipped commits, and shut down. No central scheduler. No explicit work assignment. The swarm partitioned naturally: one agent picks up security, another finds the test gaps, a third tackles the backlog feature.

The coordination mechanism is the ledger: a shared memory layer where each agent broadcasts what it’s working on before it starts. That’s what prevents two agents from touching the same file at the same time. Not locking. Shared intent made visible.

The swarm also decided when to stop on its own. No one called it off.

What this proves

Transfer works. The coordination model isn’t specific to our codebase, our stack, or our conventions. It generalizes.

Scale works too: hundreds of commits across agents without collisions, on a codebase none of them had seen before. The dashboard makes this legible.

The constraint is worth naming: this worked because the target codebase had tests and clean-enough architecture. The swarm validates against what’s there. Untestable code stays untestable. Spaghetti gets documented, not untangled.

N=2. Two codebases, two stacks, same result. Replication confirmed.

Hundreds of commits on someone else's codebase

The setup

What shipped

The coordination

What this proves

common questions