Build Guide

A practical checklist and workflow for building agents, orchestrators, and automations without skipping contracts, limits, or safety rails. Use the glossary for definitions. Use this page for the build sequence.

1. Define The Agent

Name the unit clearly. State what it exists to do, what it should never do, and what a successful run looks like.

Ask: what job does this agent own, and what job does it explicitly not own?

2. Define Inputs And Outputs

Before tools, define the contract. What data comes in, what shape goes out, and what must be present before the agent is allowed to run?

  • Input source
  • Required fields
  • Output format
  • Validation rules
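
The contract above can be sketched as a small validator that runs before the agent does. This is a minimal illustration, not a prescribed schema: the field names (`host`, `check`, `threshold`) and the `validate_input` helper are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical contract: these field names and types are illustrative assumptions.
REQUIRED_FIELDS = {"host": str, "check": str, "threshold": float}


@dataclass(frozen=True)
class ContractResult:
    ok: bool
    errors: tuple


def validate_input(payload: dict) -> ContractResult:
    """Report every missing or mistyped required field instead of stopping at the first."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected_type):
            errors.append(f"{name} must be {expected_type.__name__}")
    return ContractResult(ok=not errors, errors=tuple(errors))
```

The agent should run only when `ok` is true. Collecting all errors in one pass makes a rejected input easier to fix than a single first-failure message.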

3. Define Tool Access

Decide which tools the agent may use and keep that allowlist small. Too much access is usually where avoidable risk enters the system.

Good default: narrow command allowlist, bounded network access, no arbitrary shell.
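
One way to express that default is a check that returns an argument list only for allowlisted commands. This is a sketch under stated assumptions: the command names in `ALLOWED_COMMANDS` and the `parse_allowed` helper are hypothetical, and the result is meant to be executed without a shell.

```python
import shlex
from typing import List, Optional

# Hypothetical allowlist: keep it as small as the job permits.
ALLOWED_COMMANDS = {"ping", "uptime", "df"}

# Characters that only matter to a shell; their presence suggests an injection attempt.
SHELL_METACHARACTERS = set(";|&`$<>")


def parse_allowed(command_line: str) -> Optional[List[str]]:
    """Return argv if the command is allowlisted, otherwise None."""
    try:
        argv = shlex.split(command_line)
    except ValueError:
        return None  # unbalanced quotes: refuse rather than guess
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return None
    if any(SHELL_METACHARACTERS & set(token) for token in argv):
        return None
    return argv
```

Passing the returned list to `subprocess.run` with its default `shell=False` keeps the "no arbitrary shell" rule intact, because no shell ever interprets the arguments.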

4. Define Quality Gates

Make pass/fail checks explicit. Do not rely on a vague sense that the output "looks fine" after the fact.

  • Schema or structure checks
  • Safety or policy checks
  • Tests or smoke checks
  • Approval checkpoints where needed
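
The first two checks in the list could look like the sketch below, where each gate returns an explicit pass/fail result and the output is accepted only if every gate passes. The gate names, the `status` field, and the forbidden-key rule are assumptions chosen for illustration.

```python
from typing import Callable, List, NamedTuple


class GateResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def structure_gate(output: dict) -> GateResult:
    """Structure check: the output must carry an explicit pass/fail status."""
    ok = output.get("status") in {"pass", "fail"}
    return GateResult("structure", ok, "" if ok else "status must be 'pass' or 'fail'")


def policy_gate(output: dict) -> GateResult:
    """Policy check (hypothetical rule): no credential-like keys in the output."""
    leaked = sorted(k for k in output if "password" in k.lower() or "secret" in k.lower())
    return GateResult("policy", not leaked, f"forbidden keys: {leaked}" if leaked else "")


def run_gates(output: dict, gates: List[Callable[[dict], GateResult]]) -> List[GateResult]:
    return [gate(output) for gate in gates]


def accepted(results: List[GateResult]) -> bool:
    # An empty gate list is treated as a failure: no checks means no evidence.
    return bool(results) and all(r.passed for r in results)
```

Tests, smoke checks, and human approval would plug in as further gates with the same return shape.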

5. Define Fail-Closed Behavior

Decide what happens when inputs are missing, validation fails, a remote system is down, or an answer is uncertain. Good systems stop, escalate, or wait. They do not improvise through unclear state.
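
A fail-closed policy can be written down as one small decision function, so that the only path to running is every check passing. Which condition maps to stop, wait, or escalate is an assumption here, as is the `min_confidence` threshold; adjust both to the system at hand.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    STOP = "stop"
    WAIT = "wait"
    ESCALATE = "escalate"


def decide(
    inputs_present: bool,
    validation_ok: bool,
    remote_up: bool,
    confidence: float,
    min_confidence: float = 0.8,  # hypothetical threshold
) -> Decision:
    """Return PROCEED only when every precondition holds."""
    if not inputs_present or not validation_ok:
        return Decision.STOP  # bad or missing input: do nothing
    if not remote_up:
        return Decision.WAIT  # dependency down: try again later
    if confidence < min_confidence:
        return Decision.ESCALATE  # uncertain answer: hand to a human
    return Decision.PROCEED
```

Because `PROCEED` is the last return, adding a new precondition can only make the agent more cautious, never less.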

6. Define Runtime Limits

Set caps on time, resource use, retries, and concurrency. An agent that can run forever, retry forever, or consume everything is not a reliable operator.

  • Time limit per run
  • Retry limit
  • CPU, RAM, and thermal thresholds
  • Queue depth or concurrency cap
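
The first two caps in the list can be sketched as a wrapper that bounds both attempts and elapsed time. The names `run_with_limits` and `LimitExceeded` are hypothetical. Note the assumptions: the deadline is checked between attempts and does not interrupt a task that is already running, and resource thresholds and concurrency caps are out of scope for this example.

```python
import time


class LimitExceeded(Exception):
    """Raised when a run hits its attempt cap or its time limit."""


def run_with_limits(task, max_attempts=3, time_limit_s=5.0, clock=time.monotonic):
    """Call task() until it succeeds, the attempt cap is hit, or the deadline passes."""
    deadline = clock() + time_limit_s
    attempts = 0
    last_error = None
    while attempts < max_attempts:
        if clock() >= deadline:
            raise LimitExceeded("time limit reached") from last_error
        attempts += 1
        try:
            return task()
        except Exception as exc:  # broad on purpose: any failure counts as an attempt
            last_error = exc
    raise LimitExceeded(f"gave up after {attempts} attempts") from last_error
```

The `clock` parameter exists so the time limit can be tested with a fake clock instead of real waiting.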

7. Define Observability

If you cannot reconstruct what happened after the fact, you do not have a trustworthy system. Log commands, outcomes, failures, and decision points.

Minimum useful record: input summary, action taken, result, exit status, and evidence link.
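
That minimum record maps naturally onto one structured log line per action. The sketch below assumes JSON lines as the log format and adds a UTC timestamp, neither of which the guide prescribes; `RunRecord` and `to_log_line` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RunRecord:
    input_summary: str
    action: str
    result: str
    exit_status: int
    evidence_link: str  # e.g. a path or URL to the captured output


def to_log_line(record: RunRecord, now: Optional[datetime] = None) -> str:
    """Serialize one record as a single JSON line with a UTC timestamp (an added assumption)."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps({"timestamp": stamp, **asdict(record)}, sort_keys=True)
```

One line per action keeps the log easy to search and parse later, and the fixed field set makes a missing piece of evidence visible immediately.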

8. Start Small, Then Promote

First prove the workflow in a bounded environment. Then add automation, scheduling, and remote control. Promoting a workflow before its small version is stable usually spreads its inconsistencies instead of multiplying its value.

Example: Define The Agent

A useful first draft is simple: "This agent checks infrastructure health every morning, reports pass/fail status, and escalates only when a threshold is breached. It may run known probes, but it may not execute arbitrary remote commands."
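
The same draft can be captured as data so that the scope is checked by code rather than remembered. This is an illustrative sketch: the `AgentSpec` type, the agent name, and the probe names are all assumptions, not part of any existing inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    purpose: str
    schedule: str
    allowed_probes: frozenset
    never: tuple
    success: str

    def may_run(self, probe: str) -> bool:
        """Anything not explicitly listed is refused."""
        return probe in self.allowed_probes


# Hypothetical values transcribed from the draft sentence above.
HEALTH_AGENT = AgentSpec(
    name="infra-health",
    purpose="Check infrastructure health and report pass/fail status",
    schedule="every morning",
    allowed_probes=frozenset({"disk_usage", "service_ping"}),
    never=("execute arbitrary remote commands",),
    success="status reported; escalation only when a threshold is breached",
)
```

Freezing the dataclass means the scope cannot be widened at runtime; changing what the agent may do requires editing the definition itself.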

Related references: Glossary for terms and Agents for the live system inventory. Use the glossary to define the language, then use this guide to build the system.