Penloom

Ship a Reliable AI Agent

The free field guide — the 10-minute version of everything that keeps an AI agent on the rails.

FREE · share it freely

Almost every "my agent is flaky" problem traces back to the same handful of missing rules. Fix these seven and most of the flakiness disappears. None of this needs a framework: it's prompt discipline plus three small pieces of deterministic config. Copy anything here into your own agents.

Part 1 — The 7 rules that make an agent reliable

1. Give it exactly one objective, restated every loop.

A vague agent wanders. Drift is the #1 cause of "it did something weird."

You are {ROLE}. Your single objective is {OBJECTIVE}.
Each loop: (1) restate the goal in one line, (2) list the steps, (3) do ONE step,
(4) check the result against the goal before continuing.
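
If you build the prompt in code, a minimal sketch like this keeps the objective in one place and lets you re-inject it at the start of every loop. The function names (`build_system_prompt`, `loop_reminder`) are hypothetical, and how you pass the reminder to your model is up to your own client code.

```python
from string import Template

# Wording copied from the rule above; $role and $objective are filled per agent.
SYSTEM_PROMPT = Template(
    "You are $role. Your single objective is $objective.\n"
    "Each loop: (1) restate the goal in one line, (2) list the steps, "
    "(3) do ONE step,\n"
    "(4) check the result against the goal before continuing."
)


def build_system_prompt(role: str, objective: str) -> str:
    """Fill the template; refuse an empty role or objective."""
    if not role.strip() or not objective.strip():
        raise ValueError("role and objective must both be non-empty")
    return SYSTEM_PROMPT.substitute(role=role.strip(), objective=objective.strip())


def loop_reminder(objective: str, loop_index: int) -> str:
    """Short message to add at the start of each loop so the goal stays in view."""
    return f"Loop {loop_index}. Objective (unchanged): {objective.strip()}"
```

Keeping the objective as a single string you pass around makes it harder for a second, competing goal to sneak into the prompt.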

2. Ban silent success.

Most "it said it worked but it didn't" failures come from the model narrating success it never verified.

Report outcomes faithfully. If a step failed, say so and include the error.
If you skipped something, say so. Never claim an action succeeded without evidence.
Never present a guess as a fact; when uncertain, give a confidence (0–1) and the caveat.
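
The prompt asks for honest reports; you can also check them in code. The sketch below assumes you have the agent return each step as a small structured record. The `StepReport` fields and the `problems` helper are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepReport:
    step: str
    status: str                         # "succeeded", "failed", or "skipped"
    evidence: Optional[str] = None      # e.g. tool output, an ID, a status code
    error: Optional[str] = None
    confidence: Optional[float] = None  # 0 to 1, set when the claim is uncertain
    caveat: Optional[str] = None


def problems(report: StepReport) -> list[str]:
    """Return the ways this report breaks the rule above (empty list means OK)."""
    issues = []
    if report.status not in {"succeeded", "failed", "skipped"}:
        issues.append(f"unknown status: {report.status!r}")
    if report.status == "succeeded" and not report.evidence:
        issues.append("success claimed without evidence")
    if report.status == "failed" and not report.error:
        issues.append("failure reported without the error")
    if report.confidence is not None:
        if not 0.0 <= report.confidence <= 1.0:
            issues.append("confidence must be between 0 and 1")
        if not report.caveat:
            issues.append("confidence given without a caveat")
    return issues
```

A report that fails these checks can be bounced back to the agent instead of being shown to a user as if it were verified.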

3. Put a wall in front of irreversible actions.

Sending money, emailing a customer, deleting data, publishing — these need an explicit stop, not a hope.

Before any irreversible action (send money/email, delete, publish), STOP, show exactly
what you will do, and ask for explicit confirmation. Approval for one action is not
approval for the next. When in doubt, ask.
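
If your agent's actions run through your own code, the same wall can live there too. This is a minimal sketch under assumptions: the action names in `IRREVERSIBLE` are hypothetical, and `confirm` is whatever function asks your human (a CLI prompt, a chat button, a ticket).

```python
from typing import Callable

# Hypothetical action names; list whatever is irreversible in your system.
IRREVERSIBLE = {"send_money", "send_email", "delete", "publish"}


def run_action(
    name: str,
    description: str,
    execute: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run `execute`, asking `confirm` first when the action is irreversible."""
    if name in IRREVERSIBLE:
        # Show exactly what will happen and ask every time: approvals are
        # never cached, so one "yes" does not cover the next action.
        if not confirm(f"About to {name}: {description}. Proceed?"):
            return f"{name}: not confirmed, nothing was done"
    return execute()
```

In a terminal, `confirm` could be as small as `lambda q: input(q + " [y/N] ").strip().lower() == "y"`.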

4. Plan before doing — and check after each step.

For anything over two steps, force a numbered plan first, then a one-line check after each step: Step N: <result> — on track? yes/no. If "no," revise the remaining plan instead of plowing ahead. Catching a wrong turn at step 2 is cheap; at step 9 it isn't.
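
Because the check line has a fixed shape, you can scan a transcript for the first "no" without reading the whole thing. A small sketch, assuming the agent emits the check lines in exactly the format above (either a long dash or a plain hyphen before "on track?"); `first_off_track` is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches "Step N: <result> - on track? yes/no"; \u2014 is the long dash
# used in the format above, and a plain hyphen is accepted as well.
CHECK_LINE = re.compile(
    r"^Step (?P<n>\d+): (?P<result>.+?) (?:\u2014|-) on track\? (?P<answer>yes|no)$"
)


def first_off_track(transcript: str) -> Optional[int]:
    """Return the number of the first step marked 'no', or None if there is none."""
    for line in transcript.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match and match.group("answer") == "no":
            return int(match.group("n"))
    return None
```

A non-None result is the cue to stop the run and have the agent revise the remaining plan.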

5. Ground before answering.

An agent that answers from memory when a tool or source is available will confidently invent things. Check sources first, prefer fresh, citable data over training knowledge, and name the source you used. If sources conflict, follow the most recent or most authoritative one and say which.
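
If you resolve conflicts in code instead of leaving it to the model, something like this works. It is a sketch under stated assumptions: the `authority` score is a hypothetical ranking you assign yourself, and preferring authority first and recency second is one possible ordering, not the only reasonable one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    name: str
    claim: str
    published: date
    authority: int  # hypothetical ranking you assign: higher = more authoritative


def resolve(sources: list[Source]) -> str:
    """Pick one source, name it, and flag the conflict if claims differ."""
    if not sources:
        return "No source available; say so instead of answering from memory."
    # Assumed ordering: authority first, then recency as the tie-break.
    best = max(sources, key=lambda s: (s.authority, s.published))
    answer = f"{best.claim} (source: {best.name}, {best.published.isoformat()})"
    if len({s.claim for s in sources}) > 1:
        answer += f"; sources conflict, followed {best.name}"
    return answer
```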

6. Fail twice, then stop.

Infinite-retry loops burn tokens and produce garbage. Cap it: "If a step fails twice, stop and report what you tried — do not invent a result." A bounded failure you can see beats an endless loop you can't.
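
The same cap is easy to enforce in the code that runs each step. A minimal sketch; `run_with_cap` and the shape of the returned dict are hypothetical, and in practice you would likely catch narrower exceptions than `Exception`.

```python
from typing import Callable

MAX_ATTEMPTS = 2  # "fail twice, then stop"


def run_with_cap(step_name: str, attempt: Callable[[], str]) -> dict:
    """Try a step at most MAX_ATTEMPTS times, then report what was tried."""
    errors = []
    for n in range(1, MAX_ATTEMPTS + 1):
        try:
            result = attempt()
            return {"step": step_name, "status": "succeeded",
                    "result": result, "attempts": n}
        except Exception as exc:  # broad on purpose for this sketch
            errors.append(f"attempt {n}: {exc}")
    # No invented result: the failure and every error are surfaced as-is.
    return {"step": step_name, "status": "failed", "result": None,
            "attempts": MAX_ATTEMPTS, "tried": errors}
```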

7. Constrain the tools, not just the prose.

"Only use the tools provided; if no tool fits, say so plainly." Then make the definitions tight: every tool needs a one-line "use this when…", required vs optional params marked, and an explicit "what to do on error." Vague tool descriptions are where agents go off-script.

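Here is one way a tight tool definition from rule 7 could look, with a tiny linter that checks the three things the rule asks for. The field names (`use_when`, `params`, `on_error`) and the `search_orders` tool are made up for this sketch; they are not any vendor's tool schema, so map them onto whatever format your framework expects.

```python
# Hypothetical tool, for illustration only.
SEARCH_ORDERS = {
    "name": "search_orders",
    "use_when": "Use this when the user asks about the status of an existing order.",
    "params": {
        "order_id": {"type": "string", "required": True},
        "include_history": {"type": "boolean", "required": False},
    },
    "on_error": "Report the error text to the user; do not guess the order status.",
}


def lint_tool(tool: dict) -> list[str]:
    """Return what is missing from a tool definition (empty list means OK)."""
    issues = []
    use_when = tool.get("use_when", "").strip()
    if not use_when:
        issues.append("missing 'use_when'")
    elif "\n" in use_when:
        issues.append("'use_when' should be one line")
    for pname, spec in tool.get("params", {}).items():
        # Every parameter must say explicitly whether it is required.
        if not isinstance(spec.get("required"), bool):
            issues.append(f"param {pname!r}: 'required' not marked")
    if not tool.get("on_error", "").strip():
        issues.append("missing 'on_error'")
    return issues
```

Running a check like this over every tool before a release catches the vague definitions early.
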
Part 2 — Three deterministic guardrails (paste-ready)

Prompts persuade a model; hooks and permissions enforce. For the things that must never happen, don't rely on instructions the model can skim past — use config it physically cannot cross.

A. Block edits to protected paths (PreToolUse hook).

A CLAUDE.md line saying "never touch /infra" is a suggestion. This is a wall:

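#!/usr/bin/env bash
# fires before a file write; in Claude Code, exit code 2 blocks the action
# assumes jq is installed and the hook's stdin JSON carries tool_input.file_path
f=$(jq -r '.tool_input.file_path // empty')
case "$f" in
  */infra/*|*/.env|*.generated.*) echo "protected path: $f" >&2; exit 2 ;;
esac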
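
If you want the same check in a form you can unit-test, here is a Python sketch using the same three path globs as the hook above. It assumes the hook input arrives as JSON on stdin with the path under `tool_input.file_path`, and that exit code 2 is what blocks the action; verify both against the hooks reference for your version. `decide` is a hypothetical helper name, and blocking on unparseable input is a design choice of this sketch.

```python
import json
from fnmatch import fnmatchcase

PROTECTED = ("*/infra/*", "*/.env", "*.generated.*")  # same globs as the hook above


def is_protected(path: str) -> bool:
    """True if the path matches any protected glob ('*' also matches '/')."""
    return any(fnmatchcase(path, pattern) for pattern in PROTECTED)


def decide(stdin_text: str) -> tuple[int, str]:
    """Return (exit_code, stderr_message) for one hook invocation."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return 2, "could not parse hook input; blocking to be safe"
    tool_input = payload.get("tool_input") if isinstance(payload, dict) else None
    # Assumed field names: tool_input.file_path
    path = tool_input.get("file_path", "") if isinstance(tool_input, dict) else ""
    if path and is_protected(path):
        return 2, f"protected path: {path}"
    return 0, ""

# In the real hook script you would finish with something like:
#   code, message = decide(sys.stdin.read())
#   print(message, file=sys.stderr); sys.exit(code)
```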

B. Keep one-way doors in "ask," not "allow."

Allow-list the safe, reversible commands (formatters, tests, reads). Keep the irreversible ones — git push, dependency installs, deploys — in ask so a human taps through. Zero friction on reversible actions, a deliberate pause on one-way doors.
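
A sketch of what that split could look like, generated from Python so the two lists can't silently overlap. The rule strings are hypothetical examples written in a `Tool(specifier)` style, and the `permissions` / `allow` / `ask` key names are assumptions to check against your version's settings reference before pasting anything.

```python
import json

# Hypothetical rules: reversible commands you are happy to run unattended.
SAFE = ["Bash(npm run format)", "Bash(npm test)", "Read(./src/**)"]
# Hypothetical rules: one-way doors that should pause for a human.
ONE_WAY = ["Bash(git push:*)", "Bash(npm install:*)", "Bash(./deploy.sh:*)"]


def permissions_fragment(allow: list[str], ask: list[str]) -> dict:
    """Build the permissions block, refusing rules that appear in both lists."""
    overlap = set(allow) & set(ask)
    if overlap:
        raise ValueError(f"rules listed as both allow and ask: {sorted(overlap)}")
    return {"permissions": {"allow": sorted(allow), "ask": sorted(ask)}}


def render(fragment: dict) -> str:
    """Pretty-print the fragment for pasting into a settings file."""
    return json.dumps(fragment, indent=2)
```

Calling `print(render(permissions_fragment(SAFE, ONE_WAY)))` gives you a block to review by hand; the overlap check is the part that earns its keep as the lists grow.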

C. Right-size your CLAUDE.md.

A 600-line CLAUDE.md gets skimmed and ignored. Keep only universal rules there; move "sometimes" rules into on-demand skills so they load only when relevant. Signal-to-noise is what makes the model actually follow it.
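
You can make the size limit deterministic too, for example as a CI check. A tiny sketch: the 200-line default is an arbitrary placeholder, not a recommendation from this guide, and `check_budget` is a hypothetical helper name.

```python
def check_budget(text: str, max_lines: int = 200) -> tuple[bool, int]:
    """Return (within_budget, non_blank_line_count) for a CLAUDE.md's contents."""
    count = sum(1 for line in text.splitlines() if line.strip())
    return count <= max_lines, count
```

Feed it the file's text (for instance `Path("CLAUDE.md").read_text()`) and fail the build when the first value is False.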

Part 3 — The pre-ship checklist

[ ] Single objective stated, and restated each loop
[ ] "Honest reporter" rule present (no silent success, no guessed facts)
[ ] Irreversible actions gated behind an explicit confirm
[ ] Plan-then-check structure for multi-step tasks
[ ] Grounding rule: sources before memory, cite what's used
[ ] Retry cap (fail twice → stop and report)
[ ] Tool definitions: one-line "use when," params marked, error behavior defined
[ ] Deterministic guardrails for the must-never-happen cases (not just prose)
[ ] An eval you can re-run: fixed prompts + expected behavior, scored the same each time
[ ] You watched a full real run end-to-end at least once

That last line matters most. The fastest way to find what's flaky is to read one complete transcript, start to finish, before you trust it with anything.
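
For the re-runnable eval item on the checklist, the smallest useful version is a fixed list of prompts with substring expectations, scored the same way every time. This is a sketch, not the rubric from any particular suite: `Case`, `score`, and the substring matching are assumptions, and `agent` stands for whatever function calls your real agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    prompt: str
    must_contain: tuple[str, ...] = ()      # substrings the reply needs
    must_not_contain: tuple[str, ...] = ()  # substrings that fail the case


def score(agent: Callable[[str], str], cases: list[Case]) -> float:
    """Return the fraction of cases passed (case-insensitive substring checks)."""
    if not cases:
        raise ValueError("need at least one case")
    passed = 0
    for case in cases:
        reply = agent(case.prompt).lower()
        has_all = all(s.lower() in reply for s in case.must_contain)
        has_none = not any(s.lower() in reply for s in case.must_not_contain)
        if has_all and has_none:
            passed += 1
    return passed / len(cases)
```

Substring checks are crude, but they never change their mind between runs, which is the property the checklist asks for.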

Want the complete systems this is drawn from?

This guide is the overview. The full libraries — every prompt, pattern, subagent, hook, and a re-runnable eval suite — are two small Penloom packs:

The Agent Builder's Toolkit — $19

23 reliability-focused system prompts, tool-definition patterns, a full agent evaluation rubric + 12-prompt test suite, and 10 copy-paste starter agents. The complete version of Parts 1 & 3 above.

The Claude Code Power-User Pack — $17

4 CLAUDE.md templates, 5 ready-to-run subagents, 5 deterministic hooks, 5 slash-command skills, and an annotated settings.json. The complete version of Part 2 above.

They pair up: the Toolkit is the prompt & evaluation layer; the Pack is the config & guardrail layer. Builders who ship agents tend to want both. One-time digital purchase, delivered by email within minutes.