Lesson 3 of 5 · 8 min

The real jobs an agent is good at.

Knowing what to hand an agent is most of the skill. There's a clear pattern to the work it does well, and an equally clear pattern to the work you should keep. Get the sorting right and the agent feels like a genuinely useful teammate. Get it wrong and you'll spend more time fixing its output than you saved.

The pattern: clear inputs, a checkable result

An agent shines when a task has well-defined inputs and a way to tell, objectively, that it's done. That's why so much of its sweet spot is work with tests or an obvious correct shape. When success is checkable, the agent can iterate against it until it's right, which is exactly the loop it's built for. When success is a matter of taste or judgement, it's on much weaker ground. Hold that pattern in your head and the rest of this list follows from it.

The work it handles well

  • Tests. Adding coverage to existing code, filling in edge cases, writing unit tests for a pure function. Clear target, checkable result. A great place to start.
  • Refactors. Renaming across the codebase, extracting a function, splitting a fat module, modernising an old pattern, with the test suite as the safety net that proves nothing broke.
  • Glue code. Wiring two APIs together, writing an adapter, mapping one shape to another. Well-understood plumbing that's tedious by hand. A lot of the custom integration work that joins your tools together sits right here.
  • Migrations. Mechanical, repetitive changes applied consistently across many files: a library upgrade, a syntax change, moving off a deprecated call. Agents are tireless and consistent, which is the whole job here.
  • Boilerplate. A new module, endpoint or component that follows an existing pattern. Hand it an example to copy and it'll scaffold the rest.
  • Bug reproduction and fixes. Give it a bug with clear reproduction steps and it can reproduce the failure, write a failing test that captures it, then fix the code until that test goes green. The failing test is the target, which is exactly what it needs.
  • Internal tools. A small script, a one-off data fix, a quick admin page for the team. Low blast radius, clear purpose, real time saved.
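
The glue code item in the list above, mapping one shape to another, might look like this minimal sketch. Both record shapes and every field name here (`BillingCustomer`, `crm_to_billing`, the CRM dictionary keys) are hypothetical, invented for illustration. The point is only that the mapping has clear inputs and a result a check can confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingCustomer:
    """Hypothetical shape expected by a billing tool."""
    customer_id: str
    full_name: str
    email: str


def crm_to_billing(record: dict) -> BillingCustomer:
    """Map a hypothetical CRM record onto the billing shape."""
    first = record["first_name"].strip()
    last = record["last_name"].strip()
    return BillingCustomer(
        customer_id=str(record["id"]),          # billing side wants string ids
        full_name=f"{first} {last}",
        email=record["email"].strip().lower(),  # normalise case and whitespace
    )


# The check an agent can iterate against: one known input, one expected output.
sample = {
    "id": 42,
    "first_name": " Ada ",
    "last_name": "Lovelace",
    "email": "Ada@Example.com ",
}
assert crm_to_billing(sample) == BillingCustomer("42", "Ada Lovelace", "ada@example.com")
```

Handing over the target shape and one worked example like `sample` gives the agent both the pattern to follow and the means to verify its draft.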

The thread through all of it: bounded scope, a pattern to follow or a test to satisfy, and a low cost if a draft isn't perfect. That's the zone where an agent quietly clears your backlog.
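
As a small illustration of "a test to satisfy", here is a sketch of the bug-fix loop described in the list: a test written from the reproduction steps, then a fix that turns it green. The function `parse_price` and the bug itself are made up for this example.

```python
def parse_price(text: str) -> float:
    """Parse a price string such as '1,299.50' into a float.

    Hypothetical buggy version, which broke on thousands separators:
        return float(text.split(",")[0])   # '1,299.50' came back as 1.0
    """
    return float(text.replace(",", ""))


def test_parse_price_handles_thousands_separator():
    # Written first, straight from the reproduction steps in the bug report.
    # It failed against the buggy version and passes after the fix.
    assert parse_price("1,299.50") == 1299.5


def test_parse_price_plain_value_still_works():
    # Guards behaviour that already worked before the fix.
    assert parse_price("15") == 15.0


# Run the checks directly; a test runner would normally collect these.
test_parse_price_handles_thousands_separator()
test_parse_price_plain_value_still_works()
```

The first test is the objective finish line: the task is done when it passes and the second test still does.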

What to keep firmly with people

The flip side matters just as much. Some work is yours, and handing it to an agent is how you get confident, well-written mistakes. Keep these:

  • Architecture. How the system is shaped, the boundaries, the big trade-offs. An agent can draft inside a design; it shouldn't be the one choosing the design.
  • Security-critical paths. Authentication, authorisation, payments, anything handling sensitive data. High blast radius and subtle failure modes. A person owns these and reviews them closely.
  • Ambiguous product calls. When the "right" answer depends on what the business wants, or there's a real trade-off with no clear winner, that's a decision, not a coding task. Make the call yourself, then, if it helps, hand the agent the now-clear job.

A simple test before you delegate: is this well-defined, low-risk and checkable, or does it need judgement and carry a large blast radius? The first kind is agent work. The second is yours. Most tasks sort themselves cleanly once you ask.
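
The delegation test above can be written down as a tiny triage function. This is a sketch under obvious assumptions: the `Task` fields and the `owner` function are hypothetical names, and real work rarely reduces to four booleans. It is meant as a thinking aid, not a policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a piece of work."""
    name: str
    well_defined: bool       # clear inputs and bounded scope
    checkable: bool          # tests or an obvious correct shape
    high_blast_radius: bool  # e.g. authentication, payments, sensitive data
    needs_judgement: bool    # architecture or product trade-offs


def owner(task: Task) -> str:
    """Return 'agent' or 'person' using the delegation test above."""
    if task.high_blast_radius or task.needs_judgement:
        return "person"
    if task.well_defined and task.checkable:
        return "agent"
    # Low risk but not yet well-shaped: a person clarifies it first.
    return "person"


backlog = [
    Task("add unit tests for a pure function", True, True, False, False),
    Task("redesign service boundaries", False, False, True, True),
    Task("change the payment retry logic", True, True, True, False),
]
for task in backlog:
    print(f"{task.name}: {owner(task)}")
```

Risk and judgement are checked first on purpose: a task that is well-defined and testable still stays with a person if it touches a security-critical path.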

How to sort the work: hand the agent jobs with clear inputs and a checkable result, such as tests, refactors, glue code, migrations, boilerplate, bug fixes with reproduction steps, and internal tools. Keep architecture, security-critical paths and ambiguous product calls firmly with people. When in doubt, ask whether the task is well-defined and low-risk or judgement-heavy and high-blast-radius. Next up: keeping humans in the loop.

Quick check

A few quick questions to lock it in. No marks recorded, just for you.

Q1. Which of these is well-shaped work for an agent?

Clear inputs and a checkable result: tests, refactors, glue, migrations and boilerplate are its sweet spot.

Q2. Why are mechanical, repetitive changes like a library migration good Codex work?

A consistent change with a clear shape is exactly what an agent does well, and a job like this is easy to fan out across sandboxes.

Q3. What should stay firmly with people rather than going to a Codex task?

Judgement-heavy, high-blast-radius and ambiguous work is yours. Hand over the well-defined jobs, keep the decisions.
