docs: add agent API reference, onboarding guide, and universal skill

- docs/agent-api-reference.md (473 lines): complete HTTP API reference for all 12 endpoints
- docs/agent-onboarding-guide.md (272 lines): ssh_cli and http_pull workflows, Forgejo integration
- skill/SKILL.md (281 lines): universal agent skill, platform-agnostic, curl-based examples

All content in English. No code changes.
2026-05-12 14:57:05 +08:00
commit d1a746a8cb
parent e39a16498c
9 changed files with 1250 additions and 0 deletions
--- a/docs/agent-api-reference.md
+++ b/docs/agent-api-reference.md
@@ -0,0 +1,473 @@
 # Agent Fleet — HTTP API Reference
 Base URL: `http://<host>:9090`
 Content-Type: `application/json` for all request/response bodies unless noted.
 All timestamps are ISO 8601 (RFC 3339).
 ---
 ## Authentication
 ### http_pull Bearer Token
 Endpoints specific to `http_pull` agents require a Bearer token in the `Authorization` header. The token is set in `config.toml` as `orchestrator.http_pull_token`. If no token is configured, authentication is skipped (open mode).
 ```
 Authorization: Bearer <http_pull_token>
 ```
 Affected endpoints: `POST /api/v1/tasks/dequeue`, `POST /api/v1/tasks/{task_id}/status`.
 ### Webhook HMAC-SHA256
 The `POST /api/v1/webhooks/forgejo` endpoint requires an `X-Hub-Signature-256` (or `X-Gitea-Signature` / `X-Forgejo-Signature`) header containing `sha256=<hex_hmac>`, where `<hex_hmac>` is the hex-encoded HMAC-SHA256 of the request body keyed with the configured `webhook_secret`.
 ```
 X-Hub-Signature-256: sha256=abcdef...
 ```
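A minimal sketch of producing that header value from the shell, assuming the `openssl` CLI is installed and that the signature covers the exact bytes sent as the request body. `sign_body` and `WEBHOOK_SECRET` are hypothetical names, not part of Agent Fleet, and the payload in the usage comment is a placeholder rather than a full Forgejo event.

```bash
# Hypothetical helper: print the signature header value for a payload.
# $1 = webhook_secret, $2 = raw request body.
sign_body() {
  local secret=$1 body=$2 hex
  # printf '%s' avoids the trailing newline that echo would add to the signed bytes.
  hex=$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}')
  printf 'sha256=%s\n' "$hex"
}

# Usage (not executed here): sign and send the same bytes.
#   body='{"action":"opened"}'
#   curl -X POST http://localhost:9090/api/v1/webhooks/forgejo \
#     -H 'Content-Type: application/json' \
#     -H 'X-Forgejo-Event: issues' \
#     -H "X-Hub-Signature-256: $(sign_body "$WEBHOOK_SECRET" "$body")" \
#     --data-binary "$body"
```

Sending the body with `--data-binary` keeps curl from altering the bytes that were signed.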
 ---
 ## Error Responses
 All errors return JSON:
 ```json
 { "error": "<human-readable message>" }
 ```
 | Status | Meaning | Trigger |
 |--------|---------|---------|
 | 400 | Bad Request | Invalid state transition, wrong execution_mode, malformed input |
 | 401 | Unauthorized | Missing or invalid Bearer token for http_pull endpoints |
 | 404 | Not Found | Task or agent does not exist |
 | 500 | Internal Server Error | Database failure, lock poisoning, unexpected errors |
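One way a client might act on these codes, sketched in shell. The retry policy is an assumption rather than documented behavior: only `500` and transport failures are treated as retryable, with exponential backoff capped at 60 seconds. `is_retryable` and `backoff_delay` are hypothetical helper names.

```bash
# Assumption: 4xx responses point to a caller-side problem, so an unchanged retry will not help.
is_retryable() {
  case $1 in
    500|000) return 0 ;;   # 000 is what curl's %{http_code} reports when no response arrived
    *)       return 1 ;;
  esac
}

# Exponential backoff: 1s, 2s, 4s, ... capped at 60s. $1 = attempt number starting at 0.
backoff_delay() {
  local attempt=$1 delay=1 i
  for (( i = 0; i < attempt && delay < 60; i++ )); do
    delay=$(( delay * 2 ))
  done
  if (( delay > 60 )); then
    delay=60
  fi
  printf '%d\n' "$delay"
}
```

A caller would typically `sleep "$(backoff_delay "$attempt")"` between attempts and give up after a fixed number of tries.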
 ---
 ## Endpoints
 ### Health Check
 ```
 GET /healthz
 ```
 **Response:** `200 OK` — body: `ok`
 ```bash
 curl http://localhost:9090/healthz
 ```
 ---
 ### Register Agent
 ```
 POST /api/v1/agents/register
 ```
 Register a new agent or update an existing one (upsert by `agent_id`).
 **Request:**
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | agent_id | string | yes | Unique identifier |
 | agent_type | string | yes | `openclaw`, `claude-code`, `codex-cli`, `hermes`, `acp`, `shell`, or custom |
 | hostname | string | yes | Machine hostname |
 | capabilities | string[] | yes | e.g. `["code:rust", "review"]` |
 | max_concurrency | u32 | yes | Max parallel tasks |
 | metadata | object | no | Arbitrary key-value pairs |
 **Response:** `200 OK`
 ```json
 {
  "agent_id": "worker-01",
  "registry_token": "registry_a1b2c3d4..."
 }
 ```
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{
    "agent_id": "worker-01",
    "agent_type": "codex-cli",
    "hostname": "host-worker-01",
    "capabilities": ["code:rust"],
    "max_concurrency": 2
  }'
 ```
 ---
 ### Heartbeat
 ```
 POST /api/v1/agents/heartbeat
 ```
 **Request:**
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | agent_id | string | yes | Agent to update |
 **Response:** `200 OK`
 ```json
 {
  "agent_id": "worker-01",
  "status": "online",
  "last_heartbeat_at": "2025-01-15T10:30:00Z"
 }
 ```
 **Errors:** `404` if agent not found.
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-01"}'
 ```
 ---
 ### Deregister Agent
 ```
 POST /api/v1/agents/deregister
 ```
 Sets the agent offline and requeues all of its active tasks to `created`.
 **Request:**
 | Field | Type | Required |
 |-------|------|----------|
 | agent_id | string | yes |
 **Response:** `200 OK`
 ```json
 {
  "agent_id": "worker-01",
  "status": "offline",
  "requeued_tasks": 3
 }
 ```
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-01"}'
 ```
 ---
 ### List Agents
 ```
 GET /api/v1/agents
 ```
 **Query Parameters:**
 | Param | Type | Description |
 |-------|------|-------------|
 | capability | string | Filter by capability (e.g. `code:rust`) |
 | status | string | Filter: `online`, `offline`, `draining` |
 **Response:** `200 OK` — JSON array of [Agent](#agent-object) objects.
 ```bash
 curl 'http://localhost:9090/api/v1/agents?status=online'
 ```
 ---
 ### List Tasks
 ```
 GET /api/v1/tasks
 ```
 **Query Parameters:**
 | Param | Type | Description |
 |-------|------|-------------|
 | status | string | Filter by status (e.g. `created`, `running`, `failed`) |
 | agent_id | string | Filter by assigned agent |
 **Response:** `200 OK` — JSON array of [Task](#task-object) objects. Ordered by `created_at` descending.
 ```bash
 curl 'http://localhost:9090/api/v1/tasks?status=running'
 ```
 ---
 ### Get Task
 ```
 GET /api/v1/tasks/{task_id}
 ```
 **Response:** `200 OK` — single [Task](#task-object) object.
 **Errors:** `404` if task not found.
 ```bash
 curl http://localhost:9090/api/v1/tasks/org%2Frepo%2342
 ```
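In the path, the task ID is percent-encoded: `org/repo#42` becomes `org%2Frepo%2342`, because `/` would otherwise split the path and `#` would start a URL fragment. A small pure-bash sketch of that encoding, assuming ASCII task IDs; `urlencode` is a hypothetical helper name.

```bash
# Percent-encode every character outside the RFC 3986 unreserved set.
# Assumption: the input is ASCII; multi-byte characters are not handled.
urlencode() {
  local s=$1 out='' c i
  local LC_ALL=C
  for (( i = 0; i < ${#s}; i++ )); do
    c=${s:i:1}
    case $c in
      [A-Za-z0-9._~-]) out+=$c ;;
      *) printf -v c '%%%02X' "'$c"; out+=$c ;;   # "'x" makes printf use the character code
    esac
  done
  printf '%s\n' "$out"
}

# Usage (not executed here):
#   curl "http://localhost:9090/api/v1/tasks/$(urlencode 'org/repo#42')"
```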
 ---
 ### Dequeue Task (http_pull only)
 ```
 POST /api/v1/tasks/dequeue
 ```
 Requires Bearer token if `http_pull_token` is configured. Only returns tasks with `execution_mode = http_pull`.
 **Request:**
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | agent_id | string | yes | Agent claiming the task |
 | capabilities | string[] | no | Capabilities to match against task labels |
 **Response:** `200 OK` with [Task](#task-object) object, or `204 No Content` if no matching task.
 **Errors:** `401` if token required and missing/invalid.
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
 ```
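Because an empty queue is signalled by `204 No Content` rather than by an error, a polling client needs the status code as well as the body. A sketch using curl's `-w '%{http_code}'` write-out option; `http_code`, `http_body`, and `handle_dequeue` are hypothetical helper names.

```bash
# Append the status code on its own final line, then split it off again:
#   resp=$(curl -sS -X POST http://localhost:9090/api/v1/tasks/dequeue \
#     -H 'Content-Type: application/json' \
#     -H 'Authorization: Bearer my-token' \
#     -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}' \
#     -w '\n%{http_code}')
http_code() { printf '%s\n' "${1##*$'\n'}"; }   # text after the last newline
http_body() { printf '%s\n' "${1%$'\n'*}"; }    # text before the last newline

# Decide what to do with a dequeue response captured as above.
handle_dequeue() {
  case $(http_code "$1") in
    200) printf 'task: %s\n' "$(http_body "$1")" ;;
    204) printf 'idle\n' ;;                       # nothing queued; poll again later
    *)   printf 'error: %s\n' "$(http_code "$1")" ;;
  esac
}
```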
 ---
 ### Update Task Status (http_pull only)
 ```
 POST /api/v1/tasks/{task_id}/status
 ```
 Requires Bearer token if `http_pull_token` is configured. Only works for tasks with `execution_mode = http_pull`.
 **Request:**
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | status | string | yes | Target status: `running`, `review_pending`, etc. |
 **Response:** `200 OK` — updated [Task](#task-object).
 **Errors:** `400` if task is not `http_pull` mode or transition is invalid. `404` if task not found.
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"status": "running"}'
 ```
 ---
 ### Complete Task
 ```
 POST /api/v1/tasks/{task_id}/complete
 ```
 Works for both `ssh_cli` and `http_pull` tasks. Submit a receipt to mark the task done.
 **Request:** A [Receipt](#receipt-object) object.
 **Response:** `200 OK`
 ```json
 {
  "task_id": "org/repo#42",
  "status": "completed"
 }
 ```
 **Errors:** `404` if task not found. `400` if task is not in a completable state.
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-01",
    "status": "completed",
    "duration_seconds": 120,
    "summary": "Implemented feature X",
    "artifacts": [
      {"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7"}
    ],
    "error": null
  }'
 ```
 ---
 ### Retry Task
 ```
 POST /api/v1/tasks/{task_id}/retry
 ```
 Retry a `failed` or `agent_lost` task. Transitions it back to `assigned`.
 **Response:** `200 OK` — updated [Task](#task-object).
 **Errors:** `400` if task status is not `failed` or `agent_lost`. `404` if task not found.
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
 ```
 ---
 ### Submit Receipt
 ```
 POST /api/v1/receipts
 ```
 Submit a receipt for a task. Validates artifacts (e.g. checks that a PR exists via the Forgejo API).
 **Request:** A [Receipt](#receipt-object) object.
 **Response:** `200 OK`
 **Errors:** `404` if task not found. `400` if validation fails.
 ```bash
 curl -X POST http://localhost:9090/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-01",
    "status": "completed",
    "duration_seconds": 95,
    "summary": "Fixed the bug",
    "artifacts": [],
    "error": null
  }'
 ```
 ---
 ### Forgejo Webhook
 ```
 POST /api/v1/webhooks/forgejo
 ```
 Receives Forgejo webhook events. Requires HMAC-SHA256 signature header.
 **Headers:** `X-Forgejo-Event` or `X-Gitea-Event` determines the event type.
 **Supported events:**
 | Event | Action |
 |-------|--------|
 | `issues` (opened) | Creates a task from the Issue (requires `agent:*` label) |
 | `pull_request` (opened) | Sets task to `review_pending` (branch name → task_id) |
 | `pull_request` (merged/closed with `merged: true`) | Sets task to `completed`, auto-generates receipt |
 | `push` (to `task/*` branch) | Updates `last_activity_at` on the task |
 **Response:** `200 OK`
 ```json
 {
  "accepted": true,
  "task_id": "org/repo#42"
 }
 ```
 **Errors:** `401` if signature invalid. `400` if payload unparseable.
 ---
 ## Object Schemas
 ### Agent Object
 ```json
 {
  "agent_id": "worker-01",
  "agent_type": "codex-cli",
  "hostname": "host-worker-01",
  "capabilities": ["code:rust"],
  "max_concurrency": 2,
  "current_tasks": 1,
  "status": "online",
  "last_heartbeat_at": "2025-01-15T10:30:00Z",
  "registered_at": "2025-01-15T09:00:00Z",
  "metadata": {}
 }
 ```
 ### Task Object
 ```json
 {
  "task_id": "org/repo#42",
  "source": "forgejo:org/repo#42",
  "task_type": "code",
  "priority": "normal",
  "status": "created",
  "execution_mode": "ssh_cli",
  "assigned_agent_id": null,
  "assigned_host": null,
  "requirements": "Implement the feature described in the issue body",
  "labels": ["agent:code", "code:rust"],
  "branch_name": "task/org%2Frepo%2342",
  "pr_title": "feat: Implement feature (#42)",
  "created_at": "2025-01-15T10:00:00Z",
  "assigned_at": null,
  "started_at": null,
  "completed_at": null,
  "last_activity_at": null,
  "retry_count": 0,
  "max_retries": 2,
  "review_count": 0,
  "timeout_seconds": 1800
 }
 ```
 **Status values:** `created`, `assigned`, `running`, `review_pending`, `completed`, `failed`, `agent_lost`, `cancelled`
 **Priority values:** `low`, `normal`, `high`, `urgent`
 **Execution mode values:** `ssh_cli`, `http_pull`
 ### Receipt Object
 ```json
 {
  "task_id": "org/repo#42",
  "agent_id": "worker-01",
  "status": "completed",
  "duration_seconds": 120,
  "summary": "Implemented the feature",
  "artifacts": [
    {"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7", "path": null, "description": null}
  ],
  "error": null
 }
 ```
 **Receipt status values:** `completed`, `failed`, `partial`
 **Artifact type values:** `pr`, `commit`, `file`, `comment`, `url`
--- a/docs/agent-onboarding-guide.md
+++ b/docs/agent-onboarding-guide.md
@@ -0,0 +1,272 @@
 # Agent Fleet — Agent Onboarding Guide
 This guide explains how to integrate an agent with the Agent Fleet Orchestrator.
 ---
 ## Execution Modes
 Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to `ssh_cli`).
 | Aspect | `ssh_cli` | `http_pull` |
 |--------|-----------|-------------|
 | Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
 | Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
 | Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
 | Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
 | Best for | Codex CLI, Claude Code, OpenCode — agents with CLIs | OpenClaw/Jeeves, Hermes — agents with their own schedulers |
 | Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |
 ---
 ## ssh_cli Workflow
 ### 1. Configure a Host
 Add a `[[hosts]]` section to `config.toml` on the Orchestrator:
 ```toml
 [[hosts]]
 host_id = "host-worker-01"
 hostname = "192.168.1.100"
 ssh_user = "deploy"
 ssh_port = 22
 ssh_key_path = "/home/deploy/.ssh/id_ed25519"
 work_dir = "/opt/agent-workspace"
 agents = [
  { agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
 ]
 ```
 For local execution (same machine as Orchestrator), use `hostname = "localhost"` — the Orchestrator uses a local subprocess instead of SSH.
 ### 2. Install the Agent CLI
 The CLI binary must be available on the target host in `$PATH`. The Orchestrator checks availability with `which <binary>`.
 Built-in CLI templates:
 | Agent Type | CLI Command |
 |------------|-------------|
 | `codex-cli` | `codex exec --json '{prompt}'` |
 | `claude-code` | `claude -p '{prompt}' --output-format json --dangerously-skip-permissions` |
 Custom templates can be defined in `config.toml` under `[adapters]`.
 ### 3. Orchestrator Handles Everything
 When a Forgejo Issue with an `agent:*` label arrives:
 1. Orchestrator creates a task (`execution_mode = ssh_cli`)
 2. Dispatch loop picks the task, selects a host by capability + load
 3. SSH (or local subprocess) executes the CLI with a structured prompt
 4. Output is parsed (Codex JSON or Claude JSON format)
 5. Task status updates: `created` → `assigned` → `running` → `completed` (or `failed`)
 ### 4. What the Agent Receives (Structured Prompt)
 The Orchestrator constructs this prompt and passes it as the `{prompt}` variable:
 ```
 Task ID: org/repo#42
 Type: code
 Goal:
 Implement the feature described in the issue body
 Constraints:
 - Execution mode: ssh_cli
 - Labels: code:rust
 - Branch: task/org%2Frepo%2342
 - Expected output: JSON receipt
 Validation:
 - Run relevant tests if code changed
 - Summarize changes and artifacts
 ```
 ### 5. Expected CLI Output
 The CLI must output JSON to stdout. The format depends on the parser:
 **Codex JSON:**
 ```json
 {"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}
 ```
 **Claude JSON:**
 ```json
 {"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}
 ```
 If output is not valid JSON, the task is marked `failed`.
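For a custom CLI, the contract above amounts to printing one JSON object on stdout. A minimal sketch modelled on the Claude JSON shape shown earlier; `emit_receipt` is a hypothetical helper, and it assumes the summary contains no double quotes or backslashes (real code should use a proper JSON encoder). Sending logs to stderr so that stdout stays pure JSON is likewise an assumption about what the parser expects.

```bash
# $1 = status, $2 = summary (must not contain " or \), $3 = duration in seconds
emit_receipt() {
  printf '{"status": "%s", "summary": "%s", "duration_seconds": %d, "error": null}\n' "$1" "$2" "$3"
}

# Typical use at the end of a wrapper script (not executed here):
#   start=$(date +%s)
#   ... do the work, logging to stderr ...
#   emit_receipt completed "done" "$(( $(date +%s) - start ))"
```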
 ---
 ## http_pull Workflow
 ### 1. Register
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'
 ```
 The response contains a `registry_token`; keep it for subsequent API calls. If `http_pull_token` is configured on the Orchestrator, send that shared token as the Bearer token instead.
 ### 2. Heartbeat (periodic)
 Send a heartbeat at a fixed interval (default: 60s). If the Orchestrator receives none within `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and its tasks are requeued.
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
 ```
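As a worked example of the timeout rule: with the default 60-second interval and an assumed `heartbeat_timeout_threshold` of 3 (the actual default is not stated in this guide), an agent would be marked offline after 180 seconds of silence. `offline_after` is a hypothetical helper, and the loop is a sketch.

```bash
# Seconds of silence before the Orchestrator marks the agent offline.
# $1 = heartbeat_interval_secs, $2 = heartbeat_timeout_threshold
offline_after() { printf '%d\n' "$(( $1 * $2 ))"; }

offline_after 60 3   # prints 180

# Background heartbeat loop (not started here):
#   while sleep 60; do
#     curl -sS -X POST http://localhost:9090/api/v1/agents/heartbeat \
#       -H 'Content-Type: application/json' \
#       -d '{"agent_id": "worker-03"}' >/dev/null || echo 'heartbeat failed' >&2
#   done &
```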
 ### 3. Dequeue a Task
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
 ```
 Returns `200 OK` with a Task object, or `204 No Content` if nothing available.
 Only tasks with `execution_mode = http_pull` are returned.
 ### 4. Update Status While Working
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"status": "running"}'
 ```
 ### 5. Complete the Task
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Fixed the issue",
    "artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
    "error": null
  }'
 ```
 Or use the receipts endpoint:
 ```bash
 curl -X POST http://localhost:9090/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '<same receipt body>'
 ```
 ### 6. Deregister When Done
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
 ```
 ---
 ## Forgejo Integration
 ### How Issues Become Tasks
 1. A Forgejo Issue is opened with a label matching `agent:*` (e.g. `agent:code`)
 2. Forgejo sends an `issues` webhook to `POST /api/v1/webhooks/forgejo`
 3. The `agent:*` label value becomes `task_type` (e.g. `code`)
 4. Priority is inferred from labels: `priority:urgent`, `priority:high`, `priority:low` (default: `normal`)
 5. A task is created with:
   - `task_id` = `{repo_full_name}#{issue_number}` (e.g. `org/repo#42`)
   - `execution_mode` = `ssh_cli` (default for Forgejo-originated tasks)
   - `branch_name` = `task/{url_encoded_task_id}` (e.g. `task/org%2Frepo%2342`)
   - `pr_title` = `feat: {issue_title} (#{issue_number})`
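The label rules in steps 3 and 4 can be sketched as two shell functions. Taking the first matching label when several are present is an assumption; `task_type_from_labels` and `priority_from_labels` are hypothetical names, not Orchestrator code.

```bash
# Print the task type from the first agent:* label.
# Returns non-zero when no such label exists (the Issue would be ignored).
task_type_from_labels() {
  local label
  for label in "$@"; do
    case $label in
      agent:*) printf '%s\n' "${label#agent:}"; return 0 ;;
    esac
  done
  return 1
}

# Print the priority from a priority:* label, defaulting to normal.
priority_from_labels() {
  local label
  for label in "$@"; do
    case $label in
      priority:urgent|priority:high|priority:low)
        printf '%s\n' "${label#priority:}"; return 0 ;;
    esac
  done
  printf 'normal\n'
}
```

For an Issue labelled `agent:code`, `code:rust`, and `priority:high`, these would yield task type `code` and priority `high`.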
 ### Branch Naming Convention
 - Branch: `task/{url_encoded_task_id}`
 - Example: task `org/repo#42` → branch `task/org%2Frepo%2342`
 ### PR Lifecycle
 | Event | Effect |
 |-------|--------|
 | PR opened (branch = `task/*`) | Task → `review_pending` |
 | PR merged | Task → `completed`, auto receipt generated |
 | Push to `task/*` branch | Task `last_activity_at` updated |
 ### Task Status Flow
 ```
 created → assigned → running → review_pending → completed
                               ↘ failed
                  ↘ agent_lost
         ↘ cancelled
 ```
 Any `failed` or `agent_lost` task can be retried via `POST /api/v1/tasks/{task_id}/retry` (transitions to `assigned`). Retry is limited by `max_retries` (default: 2).
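The retry rule reduces to a small predicate. A sketch, assuming the limit is checked as `retry_count < max_retries`, as the FAQ describes for automatic retries; `can_retry` is a hypothetical client-side helper, not an API call.

```bash
# $1 = task status, $2 = retry_count, $3 = max_retries (defaults to 2)
can_retry() {
  local status=$1 retry_count=$2 max_retries=${3:-2}
  case $status in
    failed|agent_lost) (( retry_count < max_retries )) ;;   # exit status is the comparison result
    *) return 1 ;;
  esac
}

# Usage (not executed here):
#   can_retry "$status" "$retry_count" "$max_retries" &&
#     curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
```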
 ---
 ## Structured Prompt Format (ssh_cli)
 When the Orchestrator executes an agent via SSH, it constructs a structured prompt:
 ```
 Task ID: <task_id>
 Type: <task_type>
 Goal:
 <requirements>
 Constraints:
 - Execution mode: ssh_cli
 - Labels: <comma-separated labels or <none>>
 - Branch: <branch_name>
 - Expected output: JSON receipt
 Validation:
 - Run relevant tests if code changed
 - Summarize changes and artifacts
 ```
 The prompt is injected into the CLI template as the `{prompt}` variable. Other available variables: `{work_dir}`, `{task_id}`, `{branch}`.
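To illustrate how those variables fill a template, here is a sketch of plain string substitution in bash. It does not reproduce the Orchestrator's own rendering or shell-quoting logic, which this guide does not describe; `render_template` and the `cd {work_dir} && ...` template are hypothetical.

```bash
# $1 = template, $2 = prompt, $3 = work_dir, $4 = task_id, $5 = branch
render_template() {
  local out=$1
  # Quoting the replacement keeps characters such as & literal on newer bash versions.
  out=${out//'{prompt}'/"$2"}
  out=${out//'{work_dir}'/"$3"}
  out=${out//'{task_id}'/"$4"}
  out=${out//'{branch}'/"$5"}
  printf '%s\n' "$out"
}

render_template "cd {work_dir} && codex exec --json '{prompt}'" \
  'Task ID: org/repo#42' /opt/agent-workspace 'org/repo#42' 'task/org%2Frepo%2342'
# prints: cd /opt/agent-workspace && codex exec --json 'Task ID: org/repo#42'
```

A prompt that itself contains a single quote would break the quoting in such a template, so a real renderer needs to escape the value first.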
 ---
 ## FAQ
 **Q: How do I know which execution mode to use?**
 A: If you have a CLI binary and run on a configured host → `ssh_cli`. If you have your own scheduler or run outside configured hosts → `http_pull`.
 **Q: Do I need to register for ssh_cli mode?**
 A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for `http_pull` agents.
 **Q: What happens if my agent crashes during ssh_cli execution?**
 A: The task is marked `failed`. If `retry_count < max_retries`, the dispatch loop will retry automatically.
 **Q: What happens if my http_pull agent stops sending heartbeats?**
 A: After `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and all its tasks are requeued with status `created`.
 **Q: Can a task switch between execution modes?**
 A: No. The `execution_mode` is set at creation time and cannot be changed.
 **Q: How do I create a task manually?**
 A: Use the Forgejo webhook flow (open an Issue with an `agent:*` label), or insert a row directly into the database. There is no public "create task" API endpoint.
 **Q: What label format triggers task creation?**
 A: Issues must have a label starting with `agent:` (e.g. `agent:code`, `agent:review`). The value after `agent:` becomes the task type. Issues without such a label are ignored.
 **Q: How does the review loop work?**
 A: When a PR is opened (not merged), the task goes to `review_pending`. If the PR is not merged and the review cycle count exceeds `max_retries`, the task is marked `failed`. For `ssh_cli`, the Orchestrator re-dispatches automatically.
--- a/openspec/changes/agent-onboarding-docs/.openspec.yaml
+++ b/openspec/changes/agent-onboarding-docs/.openspec.yaml
@@ -0,0 +1,2 @@
 schema: spec-driven
 created: 2026-05-12
--- a/openspec/changes/agent-onboarding-docs/design.md
+++ b/openspec/changes/agent-onboarding-docs/design.md
@@ -0,0 +1,65 @@
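 ## Context
 The core functionality of agent-fleet is implemented and deployed on arm0. However, no Agent knows how to use it yet. The project's usability depends entirely on whether Agents can integrate with it correctly.
 Two deliverables are needed:
 1. **API reference documentation**: an HTTP API manual written for Agents
 2. **Universal Skill**: a capability description that follows the standard skill specification and is not tied to any particular platform
 Key constraint: the Skill must be platform-agnostic. The Team Leader role is not necessarily filled by OpenClaw; Codex, Claude Code, OpenCode, or Hermes Agent may all act as the dispatcher.
 ## Goals / Non-Goals
 **Goals:**
 - Provide complete, accurate, directly usable API reference documentation
 - Provide a universal Skill so that any Agent that loads it knows how to interact with agent-fleet
 - Cover the complete workflows of both execution modes (ssh_cli + http_pull)
 - Cover the Git workflow of the Forgejo integration
 **Non-Goals:**
 - No operations documentation for humans (deployment, configuration, troubleshooting) → that belongs to a separate change
 - No platform-specific integration scripts (such as an install script for an OpenClaw skill)
 - No SDK or client library
 ## Decisions
 ### Decision 1: Universal Skill specification, not tied to a platform
 **Choice**: the Skill uses the standard YAML frontmatter + Markdown body format
 **Rationale**:
 - All mainstream Agent platforms support this format (OpenClaw, Claude Code, Codex CLI, OpenCode)
 - It contains no platform-specific syntax; each Agent converts it as needed
 - curl is a lingua franca that every Agent can understand
 **Alternatives**:
 - An OpenClaw-specific skill: limits the scope of use
 - A separate skill per platform: duplicated effort, prone to inconsistency
 ### Decision 2: Documentation lives inside the repo
 **Choice**: the `docs/` directory holds the API reference and onboarding guide; the `skill/` directory holds SKILL.md
 **Rationale**:
 - Same repository as the code, so versions stay in sync
 - Agents can read the documentation directly through Forgejo
 - The Skill can be forked or symlinked by each platform
 ### Decision 3: Documentation generated from code + manual additions
 **Choice**: the API endpoint list is maintained manually (Phase 1); generating it from code comments will be considered later
 **Rationale**:
 - Phase 1 has a limited number of endpoints (~12), so manual maintenance is cheap
 - Automatic generation needs an extra toolchain (such as `utoipa`), which is not worth the investment in Phase 1
 ## Risks / Trade-offs
 - **[Stale documentation] Documentation may drift from the code after code changes** → documentation lives in the same repository as the code and is checked during PR review
 - **[Limits of universality] Being universal means platform features cannot be exploited** → universality is the right choice; platform-specific optimization is left to each Agent
 ## Open Questions
 _(resolved)_
 - ~~Does the Skill need multilingual versions (Chinese/English)?~~ → Everything is in English. Reason: LLM training corpora are predominantly English, and English is more token-efficient with less semantic ambiguity. The Skill's audience is Agents, not humans.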
--- a/openspec/changes/agent-onboarding-docs/proposal.md
+++ b/openspec/changes/agent-onboarding-docs/proposal.md
@@ -0,0 +1,37 @@
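 ## Why
 All core features of agent-fleet (dual execution model, Forgejo integration, Receipt validation) are implemented and running end to end on arm0. However, no Agent knows how to use it.
 Current state:
 - The API endpoints are implemented (register, heartbeat, dequeue, status, receipt, webhook, etc.)
 - Both execution modes (ssh_cli + http_pull) are implemented
 - But no documentation tells Agents how to onboard, how to call the API, or how to work with the workflow
 The project's usability depends entirely on whether Agents can integrate correctly. Without documentation and a skill, agent-fleet is an API that nobody knows how to use.
 In addition, what is needed is a **universal skill** (not tied to OpenClaw), because:
 - The Team Leader role is not necessarily filled by OpenClaw
 - Codex, Claude Code, OpenCode, Hermes Agent, and others all need to understand and use agent-fleet
 - A Skill is a general description of Agent capabilities and follows a general specification
 ## What Changes
 - Add `docs/agent-api-reference.md`: complete HTTP API reference documentation, readable by any Agent
 - Add `docs/agent-onboarding-guide.md`: Agent onboarding guide with the complete workflows for both modes
 - Add the `skill/` directory: universal Agent Skill definition (SKILL.md), following the general skill specification
 - Skill content: how to call the API, authentication, task lifecycle, Forgejo workflow, error handling
 ## Capabilities
 ### New Capabilities
 - `agent-api-reference`: complete HTTP API reference documentation (endpoints, request/response formats, error codes, examples)
 - `agent-skill`: universal Agent Skill definition describing how an Agent interacts with agent-fleet
 ### Modified Capabilities
 _(none)_
 ## Impact
 - **Documentation**: 2 new Markdown documents + 1 Skill definition
 - **Code**: no code changes
 - **Project**: the Skill directory is a new structure; its location in the repo may need to be considered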
--- a/openspec/changes/agent-onboarding-docs/specs/agent-api-reference/spec.md
+++ b/openspec/changes/agent-onboarding-docs/specs/agent-api-reference/spec.md
@@ -0,0 +1,43 @@
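 ## ADDED Requirements
 ### Requirement: Complete HTTP API reference documentation
 The project SHALL provide complete HTTP API reference documentation (`docs/agent-api-reference.md`) readable by any Agent. The documentation SHALL cover all public endpoints, including request/response formats, error codes, and examples.
 #### Scenario: Agent reads API reference to understand available endpoints
 - **WHEN** an Agent reads `docs/agent-api-reference.md`
 - **THEN** the documentation SHALL list all endpoints: healthz, agents/register, agents/heartbeat, agents/deregister, agents (GET), tasks (GET), tasks/{id} (GET), tasks/dequeue, tasks/{id}/status, tasks/{id}/retry, tasks/{id}/complete, receipts, webhooks/forgejo
 - **AND** each endpoint SHALL include: HTTP method, URL, request body format, response format, error codes, curl example
 #### Scenario: Agent checks authentication requirements
 - **WHEN** an Agent consults the authentication section of the API reference
 - **THEN** the documentation SHALL state: http_pull mode requires a Bearer token (obtained at registration), ssh_cli mode requires no Agent authentication, and the webhook endpoint requires an HMAC-SHA256 signature
 #### Scenario: Agent understands error responses
 - **WHEN** an Agent receives an error response
 - **THEN** the documentation SHALL list all error codes: 401 Unauthorized, 403 Forbidden, 404 Not Found, 400 Bad Request, 500 Internal Server Error
 - **AND** each error code SHALL include a description of the triggering scenario
 ### Requirement: Agent onboarding guide
 The project SHALL provide an Agent onboarding guide (`docs/agent-onboarding-guide.md`) describing the complete workflows of both execution modes.
 #### Scenario: New agent team leader reads onboarding guide
 - **WHEN** a new Team Leader Agent (such as Jeeves) reads the onboarding guide
 - **THEN** the documentation SHALL describe the differences between the two execution modes and when to use each:
  - ssh_cli: the Orchestrator dispatches actively; suited to Agents with a CLI such as Codex, Claude Code, and OpenCode
  - http_pull: the Agent pulls tasks on its own; suited to Agents with their own scheduler such as OpenClaw/Jeeves and Hermes
 #### Scenario: Agent follows ssh_cli workflow
 - **WHEN** an Agent onboards in ssh_cli mode
 - **THEN** the documentation SHALL describe the complete flow: configure host → Agent installs CLI → Orchestrator discovers it automatically → tasks are assigned and executed automatically → PR is created → webhook callback
 #### Scenario: Agent follows http_pull workflow
 - **WHEN** an Agent onboards in http_pull mode
 - **THEN** the documentation SHALL describe the complete flow: call the register API → obtain token → send periodic heartbeats → call dequeue to pull a task → execute → call the complete/receipt API
 #### Scenario: Agent understands Forgejo integration
 - **WHEN** an Agent reads the Forgejo integration section
 - **THEN** the documentation SHALL describe: how an Issue becomes a task (webhook → label parsing), how a task is linked to a Git branch (`task/{task_id}`), and how the PR lifecycle drives status updates (opened → review_pending, merged → completed)
 #### Scenario: Agent understands structured prompt format
 - **WHEN** an Agent in ssh_cli mode needs to understand the prompt it receives
 - **THEN** the documentation SHALL describe the structured prompt format: Task ID, Type, Goal, Constraints, Branch, Expected output, Validation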
--- a/openspec/changes/agent-onboarding-docs/specs/agent-skill/spec.md
+++ b/openspec/changes/agent-onboarding-docs/specs/agent-skill/spec.md
@@ -0,0 +1,41 @@
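 ## ADDED Requirements
 ### Requirement: Universal Agent Skill definition
 The project SHALL provide a universal Agent Skill (`skill/SKILL.md`) that follows the standard skill specification (YAML frontmatter + Markdown body). The Skill SHALL NOT be tied to any specific Agent platform (OpenClaw, Claude Code, Codex, OpenCode, Hermes, and others can all use it).
 #### Scenario: Any agent discovers and loads the skill
 - **WHEN** any Agent (Codex, Claude Code, OpenCode, Hermes, etc.) loads skill/SKILL.md
 - **THEN** the Skill SHALL contain YAML frontmatter: `name: agent-fleet-integration` and a `description` stating its purpose and trigger conditions
 - **AND** the Skill body SHALL use standard Markdown formatting (headings, code blocks, examples)
 #### Scenario: Skill teaches agent how to interact with agent-fleet
 - **WHEN** an Agent reads the Skill content
 - **THEN** the Skill SHALL contain a Quick Start section (the simplest onboarding example, 3 steps or fewer)
 - **AND** an Instructions section (detailed API call flow)
 - **AND** an Examples section (a curl example for each operation)
 - **AND** a Guidelines section (error handling, retry strategy, authentication rules)
 #### Scenario: Skill covers both execution modes
 - **WHEN** an Agent needs to choose an execution mode
 - **THEN** the Skill SHALL clearly explain the difference between ssh_cli and http_pull
 - **AND** guide the Agent in deciding which mode to use:
  - If it has a CLI and runs on a configured host → ssh_cli (dispatched by the Orchestrator)
  - If it has its own scheduler or does not run on a configured host → http_pull (pulls on its own)
 #### Scenario: Skill includes Forgejo workflow
 - **WHEN** an Agent needs to understand the Git workflow
 - **THEN** the Skill SHALL describe the branch naming convention (`task/{task_id}`), the PR creation flow, and the webhook trigger mechanism
 #### Scenario: Skill includes error recovery guidance
 - **WHEN** an Agent encounters an API error
 - **THEN** the Skill SHALL provide handling for common errors:
  - 401 → check the token; re-register if necessary
  - 404 → the task may already be completed or may not exist
  - 409/400 → check whether the task status allows the operation
  - Network error → retry (exponential backoff)
 #### Scenario: Skill is portable across agent platforms
 - **WHEN** the Skill is used by Agents on different platforms
 - **THEN** the Skill SHALL NOT contain any platform-specific syntax or directives (such as OpenClaw's `sessions_send` or Claude Code's `hooks`)
 - **AND** all interactions are described as standard HTTP requests (curl format)
 - **AND** Agents may convert curl into the equivalent HTTP call for their own capabilities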
--- a/openspec/changes/agent-onboarding-docs/tasks.md
+++ b/openspec/changes/agent-onboarding-docs/tasks.md
@@ -0,0 +1,36 @@
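 ## 1. API Reference Documentation
 - [ ] 1.1 Create `docs/agent-api-reference.md`
 - [ ] 1.2 List all public endpoints (~12), each with: HTTP method, URL, request body, response body, error codes, curl example
 - [ ] 1.3 Authentication section: http_pull token, webhook HMAC-SHA256 signature
 - [ ] 1.4 Error code summary: 401/403/404/400/500, each with its triggering scenario
 - [ ] 1.5 General notes: base_url, Content-Type, character encoding, pagination (if any)
 ## 2. Agent Onboarding Guide
 - [ ] 2.1 Create `docs/agent-onboarding-guide.md`
 - [ ] 2.2 Comparison table of the two execution modes (ssh_cli vs http_pull)
 - [ ] 2.3 Complete ssh_cli workflow: configure host → install CLI → automatic dispatch → PR workflow
 - [ ] 2.4 Complete http_pull workflow: register → heartbeat → dequeue → execute → complete/receipt
 - [ ] 2.5 Forgejo integration: Issue → Task, branch naming, PR lifecycle
 - [ ] 2.6 Structured prompt format (the structure of the prompt an Agent receives in ssh_cli mode)
 - [ ] 2.7 FAQ
 ## 3. Universal Agent Skill
 - [ ] 3.1 Create `skill/SKILL.md` (YAML frontmatter + Markdown body)
 - [ ] 3.2 Quick Start: minimal onboarding example (3 steps or fewer)
 - [ ] 3.3 Instructions: detailed API call flow (register → heartbeat → dequeue → execute → complete)
 - [ ] 3.4 Examples: a curl example for each operation
 - [ ] 3.5 Guidelines: error handling, retry strategy, authentication rules
 - [ ] 3.6 Execution mode selection guide: how an Agent decides between ssh_cli and http_pull
 - [ ] 3.7 Forgejo workflow (branch naming, PR creation, webhook triggers)
 - [ ] 3.8 Verification: Skill content is consistent with the API reference; curl examples are executable
 ## 4. Verification
 - [ ] 4.1 API reference covers all implemented endpoints
 - [ ] 4.2 curl examples are executable against the arm0 instance
 - [ ] 4.3 Skill format conforms to the standard specification (YAML frontmatter + Markdown body)
 - [ ] 4.4 Skill contains no platform-specific syntax
 - [ ] 4.5 Onboarding guide is consistent with the current code implementation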
--- a/skill/SKILL.md
+++ b/skill/SKILL.md
@@ -0,0 +1,281 @@
 ---
 name: agent-fleet-integration
 description: |
  Interact with the Agent Fleet Orchestrator. Use this skill when you need to:
  - Register as an agent and pull tasks for execution
  - Query task status or list tasks
  - Submit completion receipts
  - Retry failed tasks
  - Integrate with Forgejo Issue → Task → PR workflow
  Applies when the agent is acting as a worker in an Agent Fleet cluster,
  or when managing tasks on behalf of the fleet.
 ---
 # Agent Fleet Integration Skill
 ## Quick Start (http_pull mode)
 **Step 1.** Register your agent:
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{"agent_id":"my-agent","agent_type":"openclaw","hostname":"myhost","capabilities":["code:rust"],"max_concurrency":2}'
 ```
 **Step 2.** Pull and execute a task:
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"agent_id":"my-agent","capabilities":["code:rust"]}'
 ```
 **Step 3.** Submit your result:
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/<task_id>/complete \
  -H 'Content-Type: application/json' \
  -d '{"task_id":"<task_id>","agent_id":"my-agent","status":"completed","duration_seconds":60,"summary":"done","artifacts":[],"error":null}'
 ```
 ---
 ## Choosing Your Execution Mode
 | If you... | Use this mode |
 |-----------|---------------|
 | Have a CLI binary installed on a configured host | `ssh_cli` — Orchestrator calls you |
 | Have your own scheduler or run outside configured hosts | `http_pull` — You call the API |
 - `ssh_cli` agents do **not** need to call any API. The Orchestrator handles everything via SSH or local subprocess.
 - `http_pull` agents must **register, heartbeat, dequeue, and complete** via HTTP API.
 ---
 ## Instructions
 ### http_pull Agent Lifecycle
 ```
 Register → Heartbeat (loop) → Dequeue → Execute → Complete → Deregister
 ```
 1. **Register** once at startup via `POST /api/v1/agents/register`.
 2. **Heartbeat** periodically (every 60s recommended) via `POST /api/v1/agents/heartbeat`. Without heartbeats, you will be marked offline and your tasks requeued.
 3. **Dequeue** when ready for work via `POST /api/v1/tasks/dequeue`. Returns a Task or 204 No Content.
 4. **Update status** to `running` via `POST /api/v1/tasks/{task_id}/status`.
 5. **Complete** the task via `POST /api/v1/tasks/{task_id}/complete` with a Receipt.
 6. **Deregister** when shutting down via `POST /api/v1/agents/deregister`.
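Steps 1 and 6 pair naturally with a shell `trap`, so that deregistration still runs when the worker exits unexpectedly. A sketch under assumptions: `FLEET_URL`, `AGENT_ID`, `post_json`, and `run_worker` are hypothetical names, the registration values are placeholders, and the task execution itself is left out.

```bash
FLEET_URL=${FLEET_URL:-http://localhost:9090}
AGENT_ID=${AGENT_ID:-worker-03}

# POST a JSON body to an Orchestrator path. $1 = path, $2 = JSON body.
post_json() {
  curl -sS -X POST "$FLEET_URL$1" -H 'Content-Type: application/json' -d "$2"
}

register() {
  post_json /api/v1/agents/register \
    "{\"agent_id\": \"$AGENT_ID\", \"agent_type\": \"shell\", \"hostname\": \"${HOSTNAME:-localhost}\", \"capabilities\": [\"code:rust\"], \"max_concurrency\": 1}"
}

deregister() {
  post_json /api/v1/agents/deregister "{\"agent_id\": \"$AGENT_ID\"}"
}

run_worker() {
  register
  trap deregister EXIT        # runs on normal exit and after the signals below
  trap 'exit 130' INT TERM    # turn Ctrl-C / kill into an exit so the EXIT trap fires
  # ... heartbeat, dequeue, update status, execute, complete ...
}
```

Calling `run_worker` starts the lifecycle; nothing is sent until then.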
 ### ssh_cli Agent Notes
 No API interaction required. Ensure:
 - Your CLI binary is in `$PATH` on the configured host.
 - Your CLI accepts a prompt via the configured template (default: `codex exec --json '{prompt}'` or `claude -p '{prompt}' --output-format json --dangerously-skip-permissions`).
 - Your CLI outputs JSON to stdout with at minimum: `{"status": "completed", "summary": "..."}`.
 ---
 ## Examples
 ### Register
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{
    "agent_id": "worker-03",
    "agent_type": "openclaw",
    "hostname": "arm0",
    "capabilities": ["code:rust", "review"],
    "max_concurrency": 2,
    "metadata": {"version": "1.0"}
  }'
 ```
 ### Heartbeat
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
 ```
 ### List Available Tasks
 ```bash
 curl 'http://localhost:9090/api/v1/tasks?status=created'
 ```
 ### Dequeue
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
 ```
 Returns 200 with Task JSON, or 204 if no matching task.
 ### Get Task Detail
 ```bash
 curl 'http://localhost:9090/api/v1/tasks/org%2Frepo%2342'
 ```
 ### Update Task Status
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"status": "running"}'
 ```
 ### Complete Task with Receipt
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Implemented the feature as described",
    "artifacts": [
      {"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}
    ],
    "error": null
  }'
 ```
 ### Submit Receipt
 ```bash
 curl -X POST http://localhost:9090/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Done",
    "artifacts": [],
    "error": null
  }'
 ```
 ### Retry a Failed Task
 ```bash
 curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
 ```
 Only works for tasks in `failed` or `agent_lost` status.
 ### List Agents
 ```bash
 curl 'http://localhost:9090/api/v1/agents?status=online&capability=code:rust'
 ```
 ### Deregister
 ```bash
 curl -X POST http://localhost:9090/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
 ```
 ### Health Check
 ```bash
 curl http://localhost:9090/healthz
 ```
 ---
 ## Guidelines
 ### Authentication
 - **http_pull endpoints** (`POST /api/v1/tasks/dequeue`, `POST /api/v1/tasks/{task_id}/status`): require `Authorization: Bearer <token>` if `http_pull_token` is configured. If it is not configured, no auth is needed.
 - **Webhook endpoint** (`POST /api/v1/webhooks/forgejo`): requires an HMAC-SHA256 signature header.
 - **All other endpoints**: no authentication required.
 ### Error Handling
 | Code | Meaning | Action |
 |------|---------|--------|
 | 401 | Unauthorized | Check that your Bearer token matches `orchestrator.http_pull_token` in `config.toml`. |
 | 404 | Not Found | Task or agent does not exist. Check the ID and its URL encoding, then move on. |
 | 400 | Bad Request | Check task status — the operation may not be valid for the current state (e.g. retrying a `running` task). |
 | 204 | No Content (dequeue) | No matching tasks available. Wait and retry. |
 | 500 | Server Error | Retry with exponential backoff. Report if persistent. |
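The table above can be turned into a small dispatch helper. This is a sketch only: `next_action` and the action labels it prints (`ok`, `wait`, `fix_request`, and so on) are hypothetical names chosen for illustration.

```bash
# Map an HTTP status code to the action suggested in the table above.
# The function name and the printed labels are illustrative, not part of the API.
next_action() {
  case "$1" in
    200) echo ok ;;           # request succeeded
    204) echo wait ;;         # dequeue: no matching task, poll again later
    400) echo fix_request ;;  # invalid for the current state; do not retry as-is
    401) echo check_token ;;  # Bearer token missing or wrong
    404) echo skip ;;         # task or agent not found
    500) echo backoff ;;      # transient server error; retry with backoff
    *)   echo unknown ;;
  esac
}
```

One way to feed it is `code=$(curl -s -o /dev/null -w '%{http_code}' <url>)` followed by `next_action "$code"`.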
 ### Retry Strategy
 - Use exponential backoff for transient errors (network, 500s): 1s, 2s, 4s, 8s, max 30s.
 - Do not retry 400 errors — fix your request.
 - For 204 on dequeue: poll again after a reasonable interval (e.g. 10–30 seconds).
 - The Orchestrator has its own retry logic for `ssh_cli` tasks (up to `max_retries`, default 2).
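A minimal sketch of the backoff schedule for client-side retries follows. `backoff_delay` and `with_backoff` are hypothetical helpers; the 16s step is inferred from doubling, since the list above names only 1s, 2s, 4s, 8s, and the 30s cap.

```bash
# Delay in seconds before retry <attempt> (0-based): 1, 2, 4, 8, 16, then 30.
# The 16s step is an assumption inferred from doubling up to the 30s cap.
backoff_delay() {
  local attempt="$1" delay=1 i=0
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt 30 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt 30 ]; then delay=30; fi
  echo "$delay"
}

# Hypothetical wrapper: run a command until it succeeds or attempts run out.
# with_backoff <max_attempts> <command...>
with_backoff() {
  local max="$1" n=0
  shift
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$max" ]; then return 1; fi
    sleep "$(backoff_delay $((n - 1)))"
  done
}
```

For example, `with_backoff 5 curl -fsS http://localhost:9090/healthz` retries the health check up to five times. Because `curl -f` fails on any HTTP status of 400 or above, the wrapper suits only calls where every failure is transient; 400 responses should be fixed rather than retried.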
 ### Task Status Flow
 ```
 created → assigned → running → review_pending → completed
                               ↘ failed
                  ↘ agent_lost
         ↘ cancelled
 ```
 - `failed` and `agent_lost` tasks can be retried via the retry endpoint.
 - `review_pending` means a PR was opened and is awaiting merge/review.
 - `completed` and `cancelled` are terminal states.
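A client deciding what to do with a task it has looked up could classify the status as follows. `status_kind` and its three labels are hypothetical; the groupings come from the bullets above.

```bash
# Classify a task status using the rules above.
# Prints: retryable, terminal, active, or unknown (labels are illustrative).
status_kind() {
  case "$1" in
    failed|agent_lost) echo retryable ;;                      # retry endpoint applies
    completed|cancelled) echo terminal ;;                     # nothing more to do
    created|assigned|running|review_pending) echo active ;;   # still in flight
    *) echo unknown ;;
  esac
}
```

A caller might invoke the retry endpoint only when `status_kind` prints `retryable`, which avoids the 400 returned for other states.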
 ### Heartbeat Requirements
 - Send heartbeats at least every `heartbeat_interval_secs` (default: 60s).
 - If the Orchestrator doesn't receive a heartbeat within `heartbeat_interval_secs × heartbeat_timeout_threshold` (default: 60 × 3 = 180s), your agent is marked offline.
 - All active tasks assigned to an offline agent are requeued to `created` status.
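The timeout arithmetic can be checked with a short sketch. `offline_after_secs` and `heartbeat_state` are hypothetical helpers, and treating exactly 180s of silence as still online is an assumption about the boundary, which the text above does not specify.

```bash
# Seconds of silence after which an agent is marked offline.
# offline_after_secs <heartbeat_interval_secs> <heartbeat_timeout_threshold>
offline_after_secs() { echo $(( $1 * $2 )); }

# heartbeat_state <last_heartbeat_epoch> <now_epoch> [interval] [threshold]
# Defaults follow the documented 60s interval and threshold of 3.
# Assumption: an agent goes offline only once elapsed time exceeds the limit.
heartbeat_state() {
  local elapsed=$(( $2 - $1 ))
  if [ "$elapsed" -gt "$(offline_after_secs "${3:-60}" "${4:-3}")" ]; then
    echo offline
  else
    echo online
  fi
}
```

With the defaults, `offline_after_secs 60 3` prints `180`, so an agent that misses three consecutive 60s heartbeats is at the limit.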
 ---
 ## Forgejo Workflow
 ### Task Creation (Issue → Task)
 1. Open a Forgejo Issue with a label `agent:<type>` (e.g. `agent:code`).
 2. The webhook creates a task with `task_id = {repo}#{issue_number}`.
 3. Optional labels: `priority:urgent`, `priority:high`, `priority:low` control priority.
 ### Branch Naming
 - Branch: `task/{url_encoded_task_id}`
 - Example: `org/repo#42` → branch `task/org%2Frepo%2342`
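The encoding can be reproduced in plain shell. This is a sketch for ASCII task IDs only; `urlencode` and `task_branch` are hypothetical helper names. The same encoded form appears in the task endpoint paths, such as `/api/v1/tasks/org%2Frepo%2342`.

```bash
# Percent-encode everything except RFC 3986 unreserved characters.
# ASCII input only: multi-byte characters are not handled in this sketch.
urlencode() {
  local s="$1" out="" rest c
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    case "$c" in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
}

# task_branch <task_id> prints the branch name for that task.
task_branch() { printf 'task/%s\n' "$(urlencode "$1")"; }
```

Running `task_branch 'org/repo#42'` prints `task/org%2Frepo%2342`, matching the example above.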
 ### PR Workflow
 1. Work on the `task/*` branch.
 2. Open a PR from that branch.
 3. Orchestrator receives `pull_request.opened` webhook → task goes to `review_pending`.
 4. Pushes to the branch update `last_activity_at`.
 5. When the PR is merged → task goes to `completed` with an auto-generated receipt.
 ### For http_pull Agents
 After dequeuing a task, create the branch and PR yourself:
 ```bash
 git checkout -b task/org%2Frepo%2342
 # ... do the work ...
 git push origin task/org%2Frepo%2342
 # Create PR via Forgejo API
 # The webhook will update the task automatically
 ```
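The "Create PR via Forgejo API" step could look like the sketch below, which uses Forgejo's Gitea-compatible `POST /api/v1/repos/{owner}/{repo}/pulls` endpoint. `FORGEJO_URL`, `FORGEJO_TOKEN`, `BASE_BRANCH`, `pr_payload`, and `create_pr` are hypothetical names, and `main` as the default base branch is an assumption. This call goes to the Forgejo server, not to the Orchestrator on port 9090.

```bash
# Build the JSON body for a pull request. Values are inserted verbatim,
# so this sketch assumes they contain no double quotes or backslashes.
# pr_payload <head_branch> <base_branch> <title>
pr_payload() {
  printf '{"head": "%s", "base": "%s", "title": "%s"}\n' "$1" "$2" "$3"
}

# create_pr <owner> <repo> <head_branch> <title>
# FORGEJO_URL, FORGEJO_TOKEN and BASE_BRANCH are assumed environment variables.
create_pr() {
  curl -sS -X POST "$FORGEJO_URL/api/v1/repos/$1/$2/pulls" \
    -H 'Content-Type: application/json' \
    -H "Authorization: token $FORGEJO_TOKEN" \
    -d "$(pr_payload "$3" "${BASE_BRANCH:-main}" "$4")"
}
```

For the running example this would be `create_pr org repo 'task/org%2Frepo%2342' 'Resolve issue 42'`.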
 ### For ssh_cli Agents
 The Orchestrator passes the branch name in the structured prompt. Create the branch, push, and open the PR as part of your CLI execution. The webhooks handle status updates.