Lessons Learned from Using Coding Agents

General Configuration Tips

  • Grant full permissions inside containers. When running inside a container, it is safe to grant the agent full permissions by default. This can be done either at startup via a command-line flag or afterward using /permission.
  • Protect files you care about. Avoid leaving files you care about in directories accessible to the agent, or back them up beforehand — the agent may delete them.
  • Persist TODO lists to files. TODO lists maintained in conversation context are unreliable; always persist them to a file.
  • Instruct the agent to iterate until completion. To keep the agent working until a task is fully complete, explicitly instruct it to “keep iterating until the task is done.”
  • Use orchestration tools for continuous operation. Tools such as the Ralph Orchestrator can be used to keep the agent running continuously in a loop.
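The last three tips can be combined into one small driver loop. The sketch below is illustrative only: `run_agent` is a hypothetical stand-in for your actual agent CLI or API, and the persisted TODO file is what survives across iterations, not the conversation context.

```python
from pathlib import Path

TODO_FILE = Path("TODO.md")

def run_agent(prompt: str) -> None:
    """Placeholder for a real agent invocation (e.g. a CLI subprocess)."""
    # In a real setup this might be something like:
    #   subprocess.run(["agent", "--prompt", prompt], check=False)
    pass

def remaining_tasks() -> list[str]:
    """Read unchecked items back from the persisted TODO file."""
    if not TODO_FILE.exists():
        return []
    return [line for line in TODO_FILE.read_text().splitlines()
            if line.startswith("- [ ]")]

def drive(max_iterations: int = 20) -> None:
    """Keep the agent iterating until every TODO item is checked off."""
    for _ in range(max_iterations):
        todo = remaining_tasks()
        if not todo:
            break  # task is done
        run_agent("Keep iterating until the task is done. Remaining:\n"
                  + "\n".join(todo))
```

Because completion is judged from the file rather than from chat history, the loop survives context loss between iterations, which is the same idea orchestrators like the Ralph Orchestrator automate.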

Code Generation

Task Decomposition and Planning

  • Scope tasks tightly. Break tasks into well-scoped units. The sweet spot for agent delegation is not “the entire project” but rather “a single task with clear boundaries.” Overly large tasks tend to cause the agent to lose focus.
  • Provide context before coding; review after. Before asking the agent to write code, provide a design document along with the current state of the implementation. After the implementation is complete, conduct a retrospective. A plan.md should exist before coding begins, and a summary and review should follow. For critical modules, a dedicated design document is recommended.
  • Anchor designs with concrete artifacts. Natural language design documents tend to be imprecise and can be difficult to review. Adding pseudocode, code snippets, or API documentation helps pin down the intended behavior more concretely.
  • Review plans carefully, but keep the process concise. Plans may contain subtle internal inconsistencies that go unnoticed without careful review, which can cause the agent to deviate during implementation. That said, the plan review process itself should be kept concise.
  • Leverage agents for rapid prototyping. Agents excel at rapid prototyping and proof-of-concept work. For a quick initial prototype, it is often more effective to let the agent produce a rough version and then refine it manually than to continuously review auto-generated code and plans.
  • Use specifications and existing implementations as dual references. Use the formal specification and reference implementations (e.g., the Linux kernel source) as complementary guides: the specification defines what must be correct, while existing code reveals practical operational details.
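One way to make the plan-before-code workflow above concrete is a minimal plan.md skeleton; the section names here are illustrative, not prescribed by any tool:

```markdown
# Plan: <task name>

## Current state
Brief summary of the relevant implementation as it exists today.

## Goal
What "done" looks like, stated so it can be checked.

## Steps
1. A well-scoped step with clear boundaries.
2. ...

## Pseudocode / API sketch
Concrete artifacts that pin down the intended behavior.

## Retrospective (filled in after implementation)
Summary, review notes, deviations from the plan.
```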

Coding Guidelines and Review

  • Walk through coding guidelines item by item. Either enforce them manually or encode them as agent skills; passing too many guidelines at once causes the agent to lose focus. The same applies to code review: for each specific concern (e.g., correct usage of Arc), include an explicit prompt, and consider consolidating recurring review items into a checklist to be worked through systematically.
  • Do not expect agents to remember conventions. Agents are often poor at adhering to domain-specific programming conventions, and they tend to forget prior instructions. Encoding guidelines as skills and having the agent check against them one by one — while imperfect — can help mitigate this.
  • Be alert to the gap between plans and code. Agents can sound convincing when describing a plan, yet the corresponding code may still be incorrect. A plan expressed purely in natural language is also difficult to evaluate rigorously; consider requiring code-level evidence where possible.
  • Always have a human serve as the final checkpoint. Agent decisions are often not fully reliable, and human oversight remains essential before any code is merged or deployed.
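The checklist idea can be sketched as a loop that sends one review concern per prompt instead of the whole guideline set at once. `run_review` is a hypothetical stand-in for an agent call, and the checklist items are examples only:

```python
# Recurring review concerns, worked through one per prompt so the
# agent stays focused on a single item at a time.
CHECKLIST = [
    "Is every Arc clone actually needed, or would a borrow suffice?",
    "Are errors propagated rather than silently swallowed?",
    "Do public functions have doc comments?",
]

def run_review(concern: str, diff: str) -> str:
    """Hypothetical agent call; replace with your agent CLI or API."""
    return f"Reviewed for: {concern}"

def review(diff: str) -> list[str]:
    """One explicit prompt per checklist item, collected for a human pass."""
    return [run_review(item, diff) for item in CHECKLIST]
```

The collected results still go to a human reviewer, consistent with keeping a person as the final checkpoint.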

Bug Debugging

Reproducing and Validating Bugs

  • Start with a plan before diving into fixes. At the outset of a debugging session, ask the agent to produce a detailed debugging plan, then proceed with execution.
  • Define the expected final state upfront. Instruct the agent to reproduce the bug automatically and to treat the passing reproduction as the success criterion. Without this, you risk becoming the agent’s manual tester — waiting for a proposed fix, validating it by hand, and feeding results back — a loop that is inefficient and prevents true parallelism between developer and agent.
  • Keep both code correctness and environment stability in scope. Throughout the debugging process, the agent must keep two concerns in mind simultaneously — neglecting either will cause it to stall: (1) whether the code-level fix is correct; and (2) whether the test execution environment is stable and reliable.
  • Treat runtime errors as diagnostic signals. Runtime errors from the execution environment (e.g., QEMU) are high-value diagnostics and should be mapped back to concrete code-level checks rather than dismissed.
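Automating reproduction keeps the human out of the test loop. A minimal sketch, with `reproduce_bug` standing in for whatever command actually triggers the failure (for instance, a QEMU run whose output is checked for an error signature):

```python
import subprocess

def reproduce_bug() -> bool:
    """Return True if the bug still reproduces.

    Placeholder: in practice this might boot QEMU with a failing
    workload and scan its output for the error signature.
    """
    result = subprocess.run(["true"], capture_output=True)  # stand-in command
    return result.returncode != 0  # non-zero exit == bug reproduced

def validate_fix() -> str:
    """A fix is accepted only once the automated repro passes."""
    return "bug still reproduces" if reproduce_bug() else "fix validated"
```

With this in place, the agent can run the repro itself after each candidate fix, and the developer only reviews the final result.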

Environment Preparation

  • Invest in the environment to make the agent faster. Shift engineering effort toward preparing environments that let the agent iterate quickly and repeatedly, rather than focusing solely on producing individual fixes.
  • Diagnose and supply missing capabilities directly. When the agent stalls, identify which capability it is missing and provide it directly — for example, a prebuilt binary or cached build artifacts — rather than having the agent rebuild from scratch. This reduces idle time and speeds up iteration.
  • Share prior knowledge and anticipate failure modes. Proactively share relevant prior experience with the agent. For instance, when debugging a networking issue, providing a previously known workaround helped guide the agent to the root cause after many failed attempts. Similarly, if the target environment has difficulty installing compilers (e.g., slow or unreliable builds under Nix), preinstall or declare the required toolchains to prevent the agent from wasting time on expensive or impossible actions.
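Supplying a missing capability directly might look like checking a cache of prebuilt artifacts before letting the agent rebuild from scratch. The paths and names below are illustrative:

```python
import shutil
from pathlib import Path

def provide_artifact(name: str, cache_dir: Path, workspace: Path) -> bool:
    """Copy a prebuilt artifact into the agent's workspace if one exists.

    Returns True when the cache had it, so the caller can skip an
    expensive rebuild; False means the agent must build it itself.
    """
    cached = cache_dir / name
    if cached.exists():
        workspace.mkdir(parents=True, exist_ok=True)
        shutil.copy2(cached, workspace / name)
        return True
    return False
```

The same pattern generalizes to preinstalled toolchains: provisioning them ahead of time spares the agent expensive or impossible build steps.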