docs: add agent API reference, onboarding guide, and universal skill

- docs/agent-api-reference.md (473 lines): complete HTTP API reference for all 12 endpoints
- docs/agent-onboarding-guide.md (272 lines): ssh_cli and http_pull workflows, Forgejo integration
- skill/SKILL.md (281 lines): universal agent skill, platform-agnostic, curl-based examples

All content in English. No code changes.
2026-05-12 14:57:05 +08:00 · d1a746a8cb
commit d1a746a8cb
parent e39a16498c
9 changed files with 1250 additions and 0 deletions
--- a/docs/agent-onboarding-guide.md
+++ b/docs/agent-onboarding-guide.md
@@ -0,0 +1,272 @@
+# Agent Fleet — Agent Onboarding Guide
+
+This guide explains how to integrate an agent with the Agent Fleet Orchestrator.
+
+---
+
+## Execution Modes
+
+Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to `ssh_cli`).
+
+| Aspect | `ssh_cli` | `http_pull` |
+|--------|-----------|-------------|
+| Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
+| Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
+| Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
+| Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
+| Best for | Codex CLI, Claude Code, OpenCode — agents with CLIs | OpenClaw/Jeeves, Hermes — agents with their own schedulers |
+| Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |
+
+---
+
+## ssh_cli Workflow
+
+### 1. Configure a Host
+
+Add a `[[hosts]]` section to `config.toml` on the Orchestrator:
+
+```toml
+[[hosts]]
+host_id = "host-worker-01"
+hostname = "192.168.1.100"
+ssh_user = "deploy"
+ssh_port = 22
+ssh_key_path = "/home/deploy/.ssh/id_ed25519"
+work_dir = "/opt/agent-workspace"
+agents = [
+  { agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
+]
+```
+
+For local execution (same machine as Orchestrator), use `hostname = "localhost"` — the Orchestrator uses a local subprocess instead of SSH.
+
+### 2. Install the Agent CLI
+
+The CLI binary must be available on the target host in `$PATH`. The Orchestrator checks availability with `which <binary>`.
+
+Built-in CLI templates:
+
+| Agent Type | CLI Command |
+|------------|-------------|
+| `codex-cli` | `codex exec --json '{prompt}'` |
+| `claude-code` | `claude -p '{prompt}' --output-format json --dangerously-skip-permissions` |
+
+Custom templates can be defined in `config.toml` under `[adapters]`.
+
+### 3. Orchestrator Handles Everything
+
+When a Forgejo Issue with an `agent:*` label arrives:
+
+1. Orchestrator creates a task (`execution_mode = ssh_cli`)
+2. Dispatch loop picks the task, selects a host by capability + load
+3. SSH (or local subprocess) executes the CLI with a structured prompt
+4. Output is parsed (Codex JSON or Claude JSON format)
+5. Task status updates: `created` → `assigned` → `running` → `completed` (or `failed`)
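
Step 2 above ("selects a host by capability + load") can be sketched as follows. This is only an illustration: the `HostSlot` class, its `running` counter, and the least-utilisation policy are assumptions, since the guide does not describe how the dispatch loop measures load.

```python
from dataclasses import dataclass


@dataclass
class HostSlot:
    """One agent entry on a configured host (field names mirror config.toml)."""
    host_id: str
    agent_type: str
    max_concurrency: int
    capabilities: list[str]
    running: int = 0  # hypothetical counter of tasks currently executing


def select_host(slots, required_capability):
    """Return the least-loaded slot that has the capability and spare capacity."""
    candidates = [
        s for s in slots
        if required_capability in s.capabilities and s.running < s.max_concurrency
    ]
    if not candidates:
        return None  # nothing eligible; the task would stay queued
    # Assumed policy: lowest utilisation first, host_id as a stable tie-breaker.
    return min(candidates, key=lambda s: (s.running / s.max_concurrency, s.host_id))


slots = [
    HostSlot("host-worker-01", "codex-cli", 2, ["code:rust", "code:python"], running=1),
    HostSlot("host-worker-02", "codex-cli", 2, ["code:rust"], running=0),
]
chosen = select_host(slots, "code:rust")
print(chosen.host_id if chosen else "no eligible host")
```

A real scheduler would probably also consider task priority and host health; those are left out to keep the sketch short.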
+
+### 4. What the Agent Receives (Structured Prompt)
+
+The Orchestrator constructs this prompt and passes it as the `{prompt}` variable:
+
+```
+Task ID: org/repo#42
+Type: code
+Goal:
+Implement the feature described in the issue body
+
+Constraints:
+- Execution mode: ssh_cli
+- Labels: code:rust
+- Branch: task/org%2Frepo%2342
+- Expected output: JSON receipt
+
+Validation:
+- Run relevant tests if code changed
+- Summarize changes and artifacts
+```
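
For agents that want to reproduce this prompt in tests, the layout above can be assembled with a few lines of Python. The function name `build_prompt` and the `", "` label separator are assumptions; the `<none>` placeholder for an empty label list comes from the prompt format section of this guide.

```python
def build_prompt(task_id, task_type, requirements, labels, branch_name):
    """Assemble the structured prompt text in the layout shown above."""
    # The separator between labels is an assumption; "<none>" marks an empty list.
    label_text = ", ".join(labels) if labels else "<none>"
    lines = [
        f"Task ID: {task_id}",
        f"Type: {task_type}",
        "Goal:",
        requirements,
        "",
        "Constraints:",
        "- Execution mode: ssh_cli",
        f"- Labels: {label_text}",
        f"- Branch: {branch_name}",
        "- Expected output: JSON receipt",
        "",
        "Validation:",
        "- Run relevant tests if code changed",
        "- Summarize changes and artifacts",
    ]
    return "\n".join(lines)


prompt = build_prompt(
    "org/repo#42",
    "code",
    "Implement the feature described in the issue body",
    ["code:rust"],
    "task/org%2Frepo%2342",
)
print(prompt)
```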
+
+### 5. Expected CLI Output
+
+The CLI must output JSON to stdout. The format depends on the parser:
+
+**Codex JSON:**
+```json
+{"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}
+```
+
+**Claude JSON:**
+```json
+{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}
+```
+
+If output is not valid JSON, the task is marked `failed`.
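
A CLI wrapper can check its own output against this rule before exiting. The sketch below is an assumption about how such a check might look, not the Orchestrator's parser; in particular, treating a non-object payload or a missing `status` field as `failed` is a guess.

```python
import json


def parse_cli_output(stdout):
    """Return (status, payload); invalid JSON maps to a failed task."""
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError as exc:
        return "failed", {"error": f"invalid JSON output: {exc.msg}"}
    if not isinstance(payload, dict):
        # Assumption: a receipt must be a JSON object, not a bare list or string.
        return "failed", {"error": "JSON output is not an object"}
    # Assumption: a missing status field is also treated as a failure.
    return payload.get("status", "failed"), payload


status, payload = parse_cli_output(
    '{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}'
)
print(status, payload["summary"])
print(parse_cli_output("Done! All tests pass.")[0])
```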
+
+---
+
+## http_pull Workflow
+
+### 1. Register
+
+```bash
+curl -X POST http://localhost:9090/api/v1/agents/register \
+  -H 'Content-Type: application/json' \
+  -d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'
+```
+
+Response contains a `registry_token`. Keep it for subsequent API calls (if `http_pull_token` is configured, use that shared token instead).
+
+### 2. Heartbeat (periodic)
+
+Send a heartbeat every `heartbeat_interval_secs` seconds (default: 60). If the Orchestrator doesn't receive one within `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and its tasks are requeued.
+
+```bash
+curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
+  -H 'Content-Type: application/json' \
+  -d '{"agent_id": "worker-03"}'
+```
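
The offline window is a plain product of the two settings. The sketch below works through the arithmetic; the threshold value of `3` is purely illustrative, because this guide states a default only for the interval.

```python
from datetime import datetime, timedelta, timezone


def offline_after(interval_secs=60, timeout_threshold=3):
    """Length of silence after which an agent counts as offline."""
    # timeout_threshold=3 is a hypothetical value chosen for the example.
    return timedelta(seconds=interval_secs * timeout_threshold)


def is_offline(last_heartbeat, now, interval_secs=60, timeout_threshold=3):
    """True when the last heartbeat is older than the offline window."""
    return now - last_heartbeat > offline_after(interval_secs, timeout_threshold)


last = datetime(2026, 5, 12, 14, 0, 0, tzinfo=timezone.utc)
print(offline_after().total_seconds())                    # window length in seconds
print(is_offline(last, last + timedelta(seconds=120)))    # still inside the window
print(is_offline(last, last + timedelta(seconds=181)))    # past the window
```

With these example values an agent may miss two consecutive heartbeats before it is marked offline.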
+
+### 3. Dequeue a Task
+
+```bash
+curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: Bearer <token>' \
+  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
+```
+
+Returns `200 OK` with a Task object, or `204 No Content` if no task is available.
+
+Only tasks with `execution_mode = http_pull` are returned.
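
The same call can be made from Python with only the standard library. This is a minimal sketch: the helper names are hypothetical, and the 10-second timeout is an arbitrary choice. Building the request is separated from sending it so the request can be inspected without a running Orchestrator.

```python
import json
import urllib.request


def build_dequeue_request(base_url, token, agent_id, capabilities):
    """Build the POST /api/v1/tasks/dequeue request without sending it."""
    body = json.dumps({"agent_id": agent_id, "capabilities": capabilities}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/tasks/dequeue",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def dequeue(request):
    """Send the request; return the Task dict on 200, or None on 204."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        if resp.status == 204:
            return None  # nothing available right now
        return json.load(resp)


req = build_dequeue_request("http://localhost:9090", "<token>", "worker-03", ["code:rust"])
print(req.get_method(), req.full_url)
```

A polling agent would call `dequeue(req)` in a loop and sleep between attempts whenever it returns `None`.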
+
+### 4. Update Status While Working
+
+```bash
+curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: Bearer <token>' \
+  -d '{"status": "running"}'
+```
+
+### 5. Complete the Task
+
+```bash
+curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "task_id": "org/repo#42",
+    "agent_id": "worker-03",
+    "status": "completed",
+    "duration_seconds": 180,
+    "summary": "Fixed the issue",
+    "artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
+    "error": null
+  }'
+```
+
+Or use the receipts endpoint:
+
+```bash
+curl -X POST http://localhost:9090/api/v1/receipts \
+  -H 'Content-Type: application/json' \
+  -d '<same receipt body>'
+```
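
Two details in the requests above are easy to get wrong: the task ID must be percent-encoded in the URL path (`/` becomes `%2F`, `#` becomes `%23`), while the `task_id` field in the body stays unencoded. The helpers below are a hypothetical sketch of both steps.

```python
import json
from urllib.parse import quote


def complete_url(base_url, task_id):
    """URL of the complete endpoint, with the task ID percent-encoded."""
    # safe="" forces "/" to be encoded as well; quote() leaves it alone by default.
    return f"{base_url}/api/v1/tasks/{quote(task_id, safe='')}/complete"


def build_receipt(task_id, agent_id, status, duration_seconds, summary,
                  artifacts=None, error=None):
    """Receipt body with the fields used in the curl example above."""
    return {
        "task_id": task_id,  # unencoded in the JSON body
        "agent_id": agent_id,
        "status": status,
        "duration_seconds": duration_seconds,
        "summary": summary,
        "artifacts": artifacts or [],
        "error": error,
    }


url = complete_url("http://localhost:9090", "org/repo#42")
receipt = build_receipt("org/repo#42", "worker-03", "completed", 180, "Fixed the issue")
print(url)
print(json.dumps(receipt, indent=2))
```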
+
+### 6. Deregister When Done
+
+```bash
+curl -X POST http://localhost:9090/api/v1/agents/deregister \
+  -H 'Content-Type: application/json' \
+  -d '{"agent_id": "worker-03"}'
+```
+
+---
+
+## Forgejo Integration
+
+### How Issues Become Tasks
+
+1. A Forgejo Issue is opened with a label matching `agent:*` (e.g. `agent:code`)
+2. Forgejo sends an `issues` webhook to `POST /api/v1/webhooks/forgejo`
+3. The `agent:*` label value becomes `task_type` (e.g. `code`)
+4. Priority is inferred from labels: `priority:urgent`, `priority:high`, `priority:low` (default: `normal`)
+5. A task is created with:
+   - `task_id` = `{repo_full_name}#{issue_number}` (e.g. `org/repo#42`)
+   - `execution_mode` = `ssh_cli` (default for Forgejo-originated tasks)
+   - `branch_name` = `task/{url_encoded_task_id}` (e.g. `task/org%2Frepo%2342`)
+   - `pr_title` = `feat: {issue_title} (#{issue_number})`
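
The mapping above can be sketched as a pure function. The function name, the returned dictionary shape, and the precedence between several `priority:*` labels are assumptions; the field formats follow the list above.

```python
from urllib.parse import quote


def task_from_issue(repo_full_name, issue_number, issue_title, labels):
    """Return task fields for an issue, or None when no agent:* label is present."""
    task_type = next(
        (label.split(":", 1)[1] for label in labels if label.startswith("agent:")),
        None,
    )
    if task_type is None:
        return None  # issues without an agent:* label are ignored

    priority = "normal"
    # Assumed precedence when several priority labels are set: urgent, high, low.
    for level in ("urgent", "high", "low"):
        if f"priority:{level}" in labels:
            priority = level
            break

    task_id = f"{repo_full_name}#{issue_number}"
    return {
        "task_id": task_id,
        "task_type": task_type,
        "priority": priority,
        "execution_mode": "ssh_cli",
        "branch_name": f"task/{quote(task_id, safe='')}",
        "pr_title": f"feat: {issue_title} (#{issue_number})",
    }


task = task_from_issue("org/repo", 42, "Add parser", ["agent:code", "priority:high"])
print(task)
```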
+
+### Branch Naming Convention
+
+- Branch: `task/{url_encoded_task_id}`
+- Example: task `org/repo#42` → branch `task/org%2Frepo%2342`
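
Because the encoding is reversible, a branch name can be mapped back to its task ID, which is presumably how PR and push events on `task/*` branches are matched to tasks. The helper below is a hypothetical sketch of that reverse step.

```python
from urllib.parse import unquote

BRANCH_PREFIX = "task/"


def task_id_from_branch(branch):
    """Decode a task/* branch name back to its task ID; None for other branches."""
    if not branch.startswith(BRANCH_PREFIX):
        return None
    return unquote(branch[len(BRANCH_PREFIX):])


print(task_id_from_branch("task/org%2Frepo%2342"))
print(task_id_from_branch("main"))
```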
+
+### PR Lifecycle
+
+| Event | Effect |
+|-------|--------|
+| PR opened (branch = `task/*`) | Task → `review_pending` |
+| PR merged | Task → `completed`, auto receipt generated |
+| Push to `task/*` branch | Task `last_activity_at` updated |
+
+### Task Status Flow
+
+```
+created → assigned → running → review_pending → completed
+                               ↘ failed
+                  ↘ agent_lost
+         ↘ cancelled
+```
+
+Any `failed` or `agent_lost` task can be retried via `POST /api/v1/tasks/{task_id}/retry` (transitions to `assigned`). Retry is limited by `max_retries` (default: 2).
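
The retry rule can be expressed as a small guard. This sketch assumes a task is a dictionary with `status` and `retry_count` keys and that a retry is refused once `retry_count` reaches `max_retries`; the endpoint's actual error handling is not described in this guide.

```python
RETRYABLE_STATUSES = {"failed", "agent_lost"}


def retry_task(task, max_retries=2):
    """Return a copy of the task moved back to `assigned`, or raise if not allowed."""
    if task["status"] not in RETRYABLE_STATUSES:
        raise ValueError(f"cannot retry a task in status {task['status']!r}")
    if task["retry_count"] >= max_retries:
        raise ValueError("retry limit reached")
    return {**task, "status": "assigned", "retry_count": task["retry_count"] + 1}


task = {"task_id": "org/repo#42", "status": "failed", "retry_count": 0}
retried = retry_task(task)
print(retried["status"], retried["retry_count"])
```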
+
+---
+
+## Structured Prompt Format (ssh_cli)
+
+When the Orchestrator executes an agent via SSH, it constructs a structured prompt:
+
+```
+Task ID: <task_id>
+Type: <task_type>
+Goal:
+<requirements>
+
+Constraints:
+- Execution mode: ssh_cli
+- Labels: <comma-separated labels or <none>>
+- Branch: <branch_name>
+- Expected output: JSON receipt
+
+Validation:
+- Run relevant tests if code changed
+- Summarize changes and artifacts
+```
+
+The prompt is injected into the CLI template as the `{prompt}` variable. Other available variables: `{work_dir}`, `{task_id}`, `{branch}`.
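
Variable substitution into a template can be sketched as below. The guide does not say how the Orchestrator quotes the prompt, so the single-quote escaping here is an assumption that only suits templates which wrap `{prompt}` in single quotes, as the built-in ones do. The other variables are inserted as-is.

```python
import shlex


def render_command(template, prompt, work_dir, task_id, branch):
    """Fill a CLI template; {prompt} is substituted last so its text is never re-expanded."""
    command = template
    for name, value in (("work_dir", work_dir), ("task_id", task_id), ("branch", branch)):
        command = command.replace("{" + name + "}", value)
    # Assumed escaping: close the quote, emit an escaped quote, reopen the quote.
    escaped = prompt.replace("'", "'\\''")
    return command.replace("{prompt}", escaped)


command = render_command(
    "codex exec --json '{prompt}'",
    "Fix Bob's parser",
    "/opt/agent-workspace",
    "org/repo#42",
    "task/org%2Frepo%2342",
)
print(command)
print(shlex.split(command))  # how a POSIX shell would split the rendered command
```

Plain `str.replace` is used instead of `str.format` so that braces inside the prompt, such as JSON snippets from an issue body, do not raise errors.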
+
+---
+
+## FAQ
+
+**Q: How do I know which execution mode to use?**
+A: If you have a CLI binary and run on a configured host → `ssh_cli`. If you have your own scheduler or run outside configured hosts → `http_pull`.
+
+**Q: Do I need to register for ssh_cli mode?**
+A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for `http_pull` agents.
+
+**Q: What happens if my agent crashes during ssh_cli execution?**
+A: The task is marked `failed`. If `retry_count < max_retries`, the dispatch loop will retry automatically.
+
+**Q: What happens if my http_pull agent stops sending heartbeats?**
+A: After `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and all its tasks are requeued with status `created`.
+
+**Q: Can a task switch between execution modes?**
+A: No. The `execution_mode` is set at creation time and cannot be changed.
+
+**Q: How do I create a task manually?**
+A: Use the Forgejo webhook flow (open an Issue with an `agent:*` label), or insert the task directly into the database. There is no public "create task" API endpoint.
+
+**Q: What label format triggers task creation?**
+A: Issues must have a label starting with `agent:` (e.g. `agent:code`, `agent:review`). The value after `agent:` becomes the task type. Issues without such a label are ignored.
+
+**Q: How does the review loop work?**
+A: When a PR is opened (not merged), the task goes to `review_pending`. If the PR is not merged and the review cycle count exceeds `max_retries`, the task is marked `failed`. For `ssh_cli`, the Orchestrator re-dispatches automatically.