Lesson 4 of 5 · 7 min

Keeping humans in the loop.

This is the lesson that keeps an agent an asset rather than a liability. None of it is exotic. It's mostly the disciplines you already apply to human contributors, held just as firmly when the contributor is an agent. Get these habits right and you can move fast without the part where it goes quietly wrong.

Review every pull request, full stop

The non-negotiable habit: an agent is a contributor, and you read its work like any other contributor's. Every diff gets a real review before it merges, the same scrutiny you'd give a capable junior. The trap is that agent output looks polished and the tests are green, so it's tempting to wave it through. Resist that. Green tests prove the code does what the tests check, not that it does the right thing, handles the edge you forgot, or avoids a subtle issue the suite never covered. Read it like you mean it: is the approach sound, is anything off, would you have written it this way? Confidence in the output is not the same as correctness, and the review is where you supply the judgement the agent can't.

Keep production secrets well clear

This is where the real damage lives, so it gets its own rule. Never expose production secrets to an agent. Don't paste live API keys, database credentials or production tokens into a task because it might need them. Two habits keep you safe:

  • Least privilege, always. If a task genuinely needs access to something, give it a scoped, least-privilege token that can do that one thing and nothing else, never a broad credential and never the access you'd give a senior engineer with full trust.
  • Secrets live outside the work. Keep them in your secrets manager or environment, never committed to the repo and never in the task description. If a diff would ever print or commit a secret, that's a hard stop in review.

The test is simple: if this token leaked through a bad diff or a logged error, how bad would it be? Scope every grant so the honest answer is "not very."
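To make the "hard stop in review" concrete, here is a minimal sketch of a check that scans the added lines of a unified diff for strings that look like secrets. The patterns, the function name find_secret_lines and the sample diff are all illustrative assumptions, not a complete detector and not part of any particular tool. Treat it as a backstop that runs alongside the human review, not a replacement for it.

```python
import re

# Illustrative patterns only (an assumption, not a complete detector).
SECRET_PATTERNS = [
    # Shape of an AWS access key ID.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # Header line of a PEM-encoded private key.
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A secret-sounding name assigned a quoted literal of 8+ characters.
    re.compile(
        r"(?i)(api[_-]?key|secret|token|password)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
]


def find_secret_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff line number, text) for added lines that look like secrets."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only added lines matter; skip the "+++ b/file" header line.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        if any(pattern.search(added) for pattern in SECRET_PATTERNS):
            hits.append((number, added.strip()))
    return hits


# Hypothetical diff: one hard-coded key, one value read from the environment.
sample_diff = """\
--- a/config.py
+++ b/config.py
@@ -1,2 +1,3 @@
 import os
+API_KEY = "not-a-real-key-123456"
+TOKEN = os.environ["API_TOKEN"]
"""

for number, text in find_secret_lines(sample_diff):
    print(f"possible secret on diff line {number}: {text}")
```

The hard-coded literal is flagged while the line that reads from the environment is not, which matches the habit above: secrets live in the environment or a secrets manager, never in the diff.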

Know what not to hand over

This comes from the last lesson, but it bears repeating here as a safety rule, not just a productivity one: keep architecture, security-critical paths and ambiguous product calls with people. Handing those to an agent doesn't just lower quality, it adds risk, because a confident wrong answer on auth or payments is exactly the kind of mistake that's expensive to catch and worse to ship. Delegate the well-defined, low-risk work. Own the decisions and the dangerous paths.

Let tests and CI be the guardrails

Your pipeline is doing more work than ever now, so make it earn its keep. The agent works on a branch and opens a pull request; protected branches, required status checks and required reviews mean nothing reaches main without passing the tests and getting a human sign-off. If your coverage is thin in an area you're handing to an agent, that's a strong reason to thicken it, because the tests are how you catch a confident mistake automatically before a person even looks. Treat good CI as the floor that makes the whole thing safe to lean on.
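The merge rule described above can be sketched as a small policy function. This is a model of the idea under stated assumptions, not any hosting platform's API: the PullRequest class, the check names in REQUIRED_CHECKS and the account name in AGENT_ACCOUNTS are all hypothetical. In practice you would configure the equivalent rules in your platform's branch protection settings rather than write them yourself.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical snapshot of a pull request; not any platform's API."""
    author: str
    check_results: dict[str, bool] = field(default_factory=dict)
    approvals: set[str] = field(default_factory=set)


REQUIRED_CHECKS = {"unit-tests", "lint"}  # assumed check names
AGENT_ACCOUNTS = {"agent-bot"}            # assumed agent identity


def merge_blockers(pr: PullRequest) -> list[str]:
    """List the reasons this pull request may not merge; empty means allowed."""
    blockers = []
    # A missing check result counts the same as a failing one.
    for check in sorted(REQUIRED_CHECKS):
        if not pr.check_results.get(check, False):
            blockers.append(f"required check not passing: {check}")
    # An approval only counts if it comes from a human other than the author.
    human_approvals = pr.approvals - AGENT_ACCOUNTS - {pr.author}
    if not human_approvals:
        blockers.append("no human approval")
    return blockers


# Green tests alone are not enough: the agent cannot approve its own work.
pr = PullRequest(
    author="agent-bot",
    check_results={"unit-tests": True, "lint": True},
    approvals={"agent-bot"},
)
print(merge_blockers(pr))
```

The design choice worth noting is that checks and review are independent gates: passing every check never substitutes for a human approval, and an approval never substitutes for a failing or missing check.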

Keep it auditable

One quiet benefit of doing all this through normal version control: it's auditable by default. Every change is on a branch, in a pull request, with a CI record and a named reviewer who signed off. Months later you can see exactly what changed, why, and who approved it. That trail matters anywhere you answer to a customer, a standard or an auditor, and it's free if you keep the agent inside your ordinary review flow rather than letting it work around it. Resist any shortcut that bypasses the pull request, because the shortcut is also where the audit trail disappears.

The habits to keep:

  • Read and review every pull request like a contributor's, no matter how green the tests look.
  • Keep production secrets out entirely and use scoped, least-privilege tokens for any access.
  • Keep architecture, security-critical paths and ambiguous calls with people.
  • Lean on tests and CI as automatic guardrails.
  • Keep every change on a branch, in a reviewed pull request, so the whole thing stays auditable.

Last lesson next: putting it to work.

Quick check

A few quick questions to lock it in. No marks recorded, just for you.

Q1. When several Codex tasks land as diffs at once, what's the non-negotiable habit?

Parallel work means more diffs, not lighter review. Each one gets read by a person, however green the tests look.

Q2. How should a cloud sandbox's access to secrets be handled?

Never expose production secrets to the sandbox. Scope its access tightly so a mistake or a bad diff can't reach anything that matters.

Q3. Why keep the whole thing auditable?

Branches, pull requests, CI and a clear reviewer mean you can always see what changed, why, and who signed off.
