Same feature. Same complexity. Two tools. Here’s what the time breakdown actually looked like.
The Copilot Run: 6 Hours
The feature required touching several parts of the codebase — not complex architecture, but enough surface area that context management became the bottleneck.
Time breakdown:
- Workflow gates: ~45 minutes stopping to approve, configure handoffs, and re-establish context between steps
- Context files: ~60 minutes curating what to inject at each stage, what was safe to drop, what the next subagent needed to know
- Subagent orchestration: ~90 minutes managing the handoff chain — the primary agent delegating to subagents, subagents returning results, stitching outputs together
- Compactions: ~40 minutes waiting for context compression when sessions got too long, then validating that nothing important was dropped
- Actual implementation review: the remaining time — the four overhead items above account for roughly four of the six hours, leaving about two hours for the work itself
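To make the overhead concrete, here's a quick tally of the numbers above (a sketch; the category names and the assumption of a clean 6-hour session are mine):

```python
# Stated overhead from the 6-hour Copilot run, in minutes.
overhead = {
    "workflow gates": 45,
    "context files": 60,
    "subagent orchestration": 90,
    "compactions": 40,
}

total_session = 6 * 60                      # assume a 360-minute session
overhead_minutes = sum(overhead.values())   # 235 minutes of coordination
review_minutes = total_session - overhead_minutes  # 125 minutes on the work
overhead_share = overhead_minutes / total_session

print(f"{overhead_minutes} min overhead vs {review_minutes} min review "
      f"({overhead_share:.0%} of the session spent feeding the tool)")
```

By these numbers, roughly two-thirds of the session went to coordination rather than the feature.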
The tool wasn’t slow because the AI was bad. The models are capable. It was slow because I spent most of that time feeding it — maintaining the context it couldn’t hold, managing the handoffs it couldn’t coordinate, filling in the gaps left by compaction.
The Claude Code Run: 45 Minutes
The same feature:
- Opened with a brief task description and pointed at the relevant files
- Let it run
- Reviewed the output
No context files to curate. No workflow gates to manage. No compaction to babysit. The tool held context across the session, managed its own memory, and surfaced decisions that actually needed a human rather than routing everything through an approval queue.
Forty-five minutes, start to review.
The Real Variable
The productivity gap isn’t model intelligence. Current frontier models are all capable of this class of work.
The variable is overhead — specifically, how much overhead sits between the human and the actual problem.
A tool that requires you to manage its context is a tool that has offloaded its complexity onto you. You feel productive because you’re busy. But most of that activity is coordination, not creation.
The right mental model is a simple ratio: how many of my decisions today were about the work itself, and how many were about managing the AI?
On a good day with the right tool, that ratio should be heavily weighted toward the work.
What This Changes
Once you experience the low-overhead version, it’s hard to go back to high-overhead tooling. Not because the other tool is worse on technical benchmarks — often it isn’t — but because the experience of doing actual work without constant interruption is qualitatively different.
The best AI collaboration feels like pair programming with someone who doesn’t need to be managed. You handle the decisions that require judgment. They handle the implementation. When you need to course-correct, you do. The rest of the time, work happens.
That’s the bar. A 45-minute task shouldn’t take 6 hours of your attention.
AI Minus the Friction #3. Original post on LinkedIn.