# Agent Fleet: Agent Onboarding Guide

`FLEET_API_URL:PORT` means the address of your Agent Fleet Orchestrator (for example, `100.102.101.43:9090`). If you do not know it, ask your user for the Fleet API address before using these examples.

This guide explains how to integrate an agent with the Agent Fleet Orchestrator.

---

## Execution Modes

Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to `ssh_cli`).

| Aspect | `ssh_cli` | `http_pull` |
|--------|-----------|-------------|
| Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
| Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
| Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
| Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
| Best for | Codex CLI, Claude Code, OpenCode (agents with CLIs) | OpenClaw/Jeeves, Hermes (agents with their own schedulers) |
| Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |

---

## How to Choose an Execution Mode

Start by asking: **"What is the address of the Agent Fleet Orchestrator?"** If you do not know the address, ask your user.

Then decide based on connectivity and control model:

1. **Use `ssh_cli`** if:
   - You have a CLI binary that can run non-interactively
   - The Fleet orchestrator can SSH into your machine or server
   - You want the Orchestrator to initiate execution for each task
2. **Use `http_pull`** if:
   - The Fleet orchestrator cannot SSH into your machine
   - You are behind NAT, a firewall, or on a network the Orchestrator cannot reach
   - You already have your own scheduler, worker loop, or bot runtime
   - You want your agent to decide when to pull work
3. **If unsure**, ask your user:
   - **"Can the Fleet orchestrator SSH into this machine?"**
     - If yes → `ssh_cli`
     - If no → `http_pull`

---

### Self-Determination Decision Flow

As an agent, you can self-determine your mode by answering these questions:

```
1. Do I have a CLI binary that can execute tasks non-interactively?
   └─ NO  → Use http_pull
   └─ YES → Continue

2. Can the Fleet orchestrator SSH into my machine?
   └─ NO  → Use http_pull
   └─ YES → Continue

3. Do I have my own scheduler, worker loop, or bot runtime?
   └─ YES → Use http_pull (you prefer to control your own task flow)
   └─ NO  → Continue

4. Do I want the Orchestrator to initiate execution remotely?
   └─ YES → Use ssh_cli
   └─ NO  → Use http_pull
```

**Quick reference:** If you can be reached via SSH and don't have a custom scheduler, use `ssh_cli`. Otherwise, use `http_pull`.

### Decision Table

| Question | If Yes | If No |
|----------|--------|-------|
| Do you have a CLI binary that can execute tasks non-interactively? | Continue evaluating for `ssh_cli` | Use `http_pull` |
| Can the Fleet orchestrator SSH into this machine? | Use `ssh_cli` | Use `http_pull` |
| Is this agent behind NAT, a firewall, or otherwise unreachable from the Orchestrator? | Use `http_pull` | Continue evaluating |
| Does the agent already run its own scheduler or task loop? | Use `http_pull` | Either mode may fit |
| Do you want the Orchestrator to launch the agent process remotely? | Use `ssh_cli` | Use `http_pull` |

### Common Scenarios

| Scenario | Recommended Mode | Why |
|----------|------------------|-----|
| Codex / Claude Code / OpenCode on a reachable server | `ssh_cli` | Fleet can SSH in and run the CLI directly |
| OpenClaw / Hermes Agent / bot framework | `http_pull` | The agent already has a runtime and should pull work itself |
| Agent running on a laptop behind NAT | `http_pull` | Fleet cannot reach it reliably over SSH |
| Shared VM with a well-known SSH host and installed CLI | `ssh_cli` | Centralized orchestration is simpler |

### Simple Rule of Thumb

- If the Fleet server can **reach you**, `ssh_cli` is usually simpler.
- If you must **reach the Fleet server**, use `http_pull`.

---

## ssh_cli Workflow

### 1. Configure a Host

Add a `[[hosts]]` section to `config.toml` on the Orchestrator:

```toml
[[hosts]]
host_id = "host-worker-01"
hostname = "192.168.1.100"
ssh_user = "deploy"
ssh_port = 22
ssh_key_path = "/home/deploy/.ssh/id_ed25519"
work_dir = "/opt/agent-workspace"
agents = [
  { agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
]
```

For local execution (same machine as Orchestrator), use `hostname = "localhost"`; the Orchestrator then uses a local subprocess instead of SSH.

### 2. Install the Agent CLI

The CLI binary must be available on the target host in `$PATH`. The Orchestrator checks availability with `which <binary>`.

Built-in CLI templates:

| Agent Type | CLI Command |
|------------|-------------|
| `codex-cli` | `codex exec --json '{prompt}'` |
| `claude-code` | `claude -p '{prompt}' --output-format json --dangerously-skip-permissions` |

Custom templates can be defined in `config.toml` under `[adapters]`.

### 3. Orchestrator Handles Everything

When a Forgejo Issue with an `agent:*` label arrives:

1. Orchestrator creates a task (`execution_mode = ssh_cli`)
2. Dispatch loop picks the task, selects a host by capability + load
3. SSH (or local subprocess) executes the CLI with a structured prompt
4. Output is parsed (Codex JSON or Claude JSON format)
5. Task status updates: `created` → `assigned` → `running` → `completed` (or `failed`)

### 4. What the Agent Receives (Structured Prompt)

The Orchestrator constructs this prompt and passes it as the `{prompt}` variable:

```
Task ID: org/repo#42
Type: code
Goal: Implement the feature described in the issue body
Constraints:
- Execution mode: ssh_cli
- Labels: code:rust
- Branch: task/org%2Frepo%2342
- Expected output: JSON receipt
Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts
```

### 5. Expected CLI Output

The CLI must output JSON to stdout. The format depends on the parser:

**Codex JSON:**

```json
{"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}
```

**Claude JSON:**

```json
{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}
```

If output is not valid JSON, the task is marked `failed`.

---

## http_pull Workflow

### 1. Register

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'
```

Response contains a `registry_token`. Keep it for subsequent API calls (if `http_pull_token` is configured, use that shared token instead).

Recommended immediately after registration:

- Persist `FLEET_API_URL`, your `agent_id`, and the returned `registry_token`
- Start the heartbeat loop before your first dequeue request

### 2. Heartbeat (periodic)

Heartbeat must be a background loop, not a one-shot call.
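
A minimal sketch of such a background loop in Python, using only the standard library. The function names (`make_http_sender`, `start_heartbeat`) are hypothetical helpers, not part of Agent Fleet; the endpoint path and request body mirror the heartbeat curl call in this section, and the 60-second default is the interval stated in this guide. Error handling is deliberately simple: a failed heartbeat is logged and retried on the next tick.

```python
import json
import threading
import urllib.request
from typing import Callable, Tuple


def make_http_sender(base_url: str, agent_id: str) -> Callable[[], None]:
    """Return a function that POSTs one heartbeat (assumed: plain HTTP, JSON body)."""
    url = f"http://{base_url}/api/v1/agents/heartbeat"
    body = json.dumps({"agent_id": agent_id}).encode("utf-8")

    def send() -> None:
        req = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=10):
            pass  # the response body is not needed for this sketch

    return send


def start_heartbeat(
    send: Callable[[], None], interval_secs: float = 60.0
) -> Tuple[threading.Thread, threading.Event]:
    """Call send() repeatedly in a daemon thread until the returned event is set."""
    stop = threading.Event()

    def loop() -> None:
        while not stop.is_set():
            try:
                send()
            except Exception as exc:  # keep the loop alive on transient errors
                print(f"heartbeat failed: {exc}")
            stop.wait(interval_secs)  # sleeps, but wakes early when stop is set

    thread = threading.Thread(target=loop, name="fleet-heartbeat", daemon=True)
    thread.start()
    return thread, stop
```

A typical use would be `thread, stop = start_heartbeat(make_http_sender("FLEET_API_URL:PORT", "worker-03"))` right after registration, and `stop.set()` just before deregistering. Passing the sender in as a callable keeps the loop testable without a running Orchestrator.
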
- Default heartbeat interval: every 60 seconds
- Recommended behavior: start the loop immediately after registration, before the first dequeue
- If the Orchestrator does not receive a heartbeat within `heartbeat_interval_secs × heartbeat_timeout_threshold` (default: 180 seconds), the agent is marked offline
- When an agent is marked offline, its assigned tasks are requeued
- The heartbeat loop should run for the entire lifetime of the agent

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
```

### 3. Dequeue a Task

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
```

Returns `200 OK` with a Task object, or `204 No Content` if no task is available. Only tasks with `execution_mode = http_pull` are returned.

### 4. Update Status While Working

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"status": "running"}'
```

### 5. Complete the Task

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Fixed the issue",
    "artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
    "error": null
  }'
```

Or use the receipts endpoint:

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '<receipt JSON>'
```

### 6. Deregister When Done

```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
```

---

## Forgejo Integration

### How Issues Become Tasks

1. A Forgejo Issue is opened with a label matching `agent:*` (e.g. `agent:code`)
2. Forgejo sends an `issues` webhook to `POST /api/v1/webhooks/forgejo`
3. The `agent:*` label value becomes `task_type` (e.g. `code`)
4. Priority is inferred from labels: `priority:urgent`, `priority:high`, `priority:low` (default: `normal`)
5. A task is created with:
   - `task_id` = `{repo_full_name}#{issue_number}` (e.g. `org/repo#42`)
   - `execution_mode` = `ssh_cli` (default for Forgejo-originated tasks)
   - `branch_name` = `task/{url_encoded_task_id}` (e.g. `task/org%2Frepo%2342`)
   - `pr_title` = `feat: {issue_title} (#{issue_number})`

### Branch Naming Convention

- Branch: `task/{url_encoded_task_id}`
- Example: task `org/repo#42` → branch `task/org%2Frepo%2342`

### PR Lifecycle

| Event | Effect |
|-------|--------|
| PR opened (branch = `task/*`) | Task → `review_pending` |
| PR merged | Task → `completed`, auto receipt generated |
| Push to `task/*` branch | Task `last_activity_at` updated |

### Task Status Flow

```
created → assigned → running → review_pending → completed
                        ↘ failed
                        ↘ agent_lost
                        ↘ cancelled
```

Any `failed` or `agent_lost` task can be retried via `POST /api/v1/tasks/{task_id}/retry` (transitions to `assigned`). Retry is limited by `max_retries` (default: 2).

---

## Structured Prompt Format (ssh_cli)

When the Orchestrator executes an agent via SSH, it constructs a structured prompt:

```
Task ID: <task_id>
Type: <task_type>
Goal: <goal>
Constraints:
- Execution mode: ssh_cli
- Labels: <labels>
- Branch: <branch_name>
- Expected output: JSON receipt
Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts
```

The prompt is injected into the CLI template as the `{prompt}` variable. Other available variables: `{work_dir}`, `{task_id}`, `{branch}`.

---

## FAQ

**Q: How do I know which execution mode to use?**
A: First determine the Fleet API address, then ask whether the Orchestrator can SSH into the machine. If it can, `ssh_cli` is usually the best fit. If it cannot, use `http_pull`.

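
The answer above, combined with the Self-Determination Decision Flow earlier in this guide, can be sketched as a small Python function. The function name and its boolean parameters are hypothetical and only illustrate the order of the four questions; nothing here calls the Fleet API.

```python
def choose_execution_mode(
    has_cli: bool,
    ssh_reachable: bool,
    has_own_scheduler: bool,
    wants_remote_launch: bool = True,
) -> str:
    """Answer the four decision-flow questions in order and return the mode."""
    # 1. No non-interactive CLI binary: the Orchestrator has nothing to run.
    if not has_cli:
        return "http_pull"
    # 2. The Orchestrator cannot SSH in (NAT, firewall, unreachable network).
    if not ssh_reachable:
        return "http_pull"
    # 3. The agent already runs its own scheduler, worker loop, or bot runtime.
    if has_own_scheduler:
        return "http_pull"
    # 4. Remaining choice: should the Orchestrator launch the process remotely?
    return "ssh_cli" if wants_remote_launch else "http_pull"


# Example: a CLI agent on a reachable server without its own scheduler.
print(choose_execution_mode(has_cli=True, ssh_reachable=True, has_own_scheduler=False))
```

For the laptop-behind-NAT scenario, `choose_execution_mode(True, False, False)` returns `"http_pull"`, matching the Common Scenarios table.
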
**Q: Do I need to register for ssh_cli mode?**
A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for `http_pull` agents.

**Q: What happens if my agent crashes during ssh_cli execution?**
A: The task is marked `failed`. If `retry_count < max_retries`, the dispatch loop will retry automatically.

**Q: What happens if my http_pull agent stops sending heartbeats?**
A: After `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and all its tasks are requeued with status `created`.

**Q: Can a task switch between execution modes?**
A: No. The `execution_mode` is set at creation time and cannot be changed.

**Q: How do I create a task manually?**
A: Use the Forgejo webhook flow (open an Issue with `agent:*` label), or directly insert into the database. There is no public "create task" API endpoint.

**Q: What label format triggers task creation?**
A: Issues must have a label starting with `agent:` (e.g. `agent:code`, `agent:review`). The value after `agent:` becomes the task type. Issues without such a label are ignored.

**Q: How does the review loop work?**
A: When a PR is opened (not merged), the task goes to `review_pending`. If the PR is not merged and the review cycle count exceeds `max_retries`, the task is marked `failed`. For `ssh_cli`, the Orchestrator re-dispatches automatically.
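
Several steps in this guide depend on the URL-encoded task ID: the `task/{url_encoded_task_id}` branch name and the `/api/v1/tasks/.../status` and `/complete` endpoints. A minimal Python sketch of that encoding follows. The helper names are hypothetical, and it assumes standard percent-encoding with no characters left unescaped, which reproduces the `org%2Frepo%2342` example; the Orchestrator's own encoder may treat other characters differently.

```python
from urllib.parse import quote

API_PREFIX = "/api/v1"  # path prefix used by the endpoints in this guide


def encode_task_id(task_id: str) -> str:
    """Percent-encode a task ID; safe='' makes sure '/' is encoded too."""
    return quote(task_id, safe="")


def branch_name(task_id: str) -> str:
    """Branch for a task, following the task/{url_encoded_task_id} convention."""
    return f"task/{encode_task_id(task_id)}"


def task_endpoint(base_url: str, task_id: str, action: str) -> str:
    """URL for a per-task action such as 'status', 'complete', or 'retry'."""
    return f"http://{base_url}{API_PREFIX}/tasks/{encode_task_id(task_id)}/{action}"


print(branch_name("org/repo#42"))
print(task_endpoint("FLEET_API_URL:PORT", "org/repo#42", "status"))
```

Encoding matters because an unescaped `/` would split the task ID into two path segments and an unescaped `#` would be read as a URL fragment, so the request would never reach the intended task route.
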