docs: add agent API reference, onboarding guide, and universal skill

- docs/agent-api-reference.md (473 lines): complete HTTP API reference for all 12 endpoints
- docs/agent-onboarding-guide.md (272 lines): ssh_cli and http_pull workflows, Forgejo integration
- skill/SKILL.md (281 lines): universal agent skill, platform-agnostic, curl-based examples

All content in English. No code changes.
This commit is contained in:
Zer4tul 2026-05-12 14:57:05 +08:00
parent e39a16498c
commit d1a746a8cb
9 changed files with 1250 additions and 0 deletions

473
docs/agent-api-reference.md Normal file
View file

@ -0,0 +1,473 @@
# Agent Fleet — HTTP API Reference
Base URL: `http://<host>:9090`
Content-Type: `application/json` for all request/response bodies unless noted.
All timestamps are ISO 8601 (RFC 3339).
---
## Authentication
### http_pull Bearer Token
Endpoints that are specific to `http_pull` agents require a Bearer token in the `Authorization` header. The token is configured in `config.toml` as `orchestrator.http_pull_token`. If no token is configured in the config, authentication is skipped (open mode).
```
Authorization: Bearer <http_pull_token>
```
Affected endpoints: `POST /api/v1/tasks/dequeue`, `POST /api/v1/tasks/{task_id}/status`.
### Webhook HMAC-SHA256
The `POST /api/v1/webhooks/forgejo` endpoint requires an `X-Hub-Signature-256` (or `X-Gitea-Signature` / `X-Forgejo-Signature`) header containing `sha256=<hex_hmac>` of the request body using the configured `webhook_secret`.
```
X-Hub-Signature-256: sha256=abcdef...
```
---
## Error Responses
All errors return JSON:
```json
{ "error": "<human-readable message>" }
```
| Status | Meaning | Trigger |
|--------|---------|---------|
| 400 | Bad Request | Invalid state transition, wrong execution_mode, malformed input |
| 401 | Unauthorized | Missing or invalid Bearer token for http_pull endpoints |
| 404 | Not Found | Task or agent does not exist |
| 500 | Internal Server Error | Database failure, lock poisoning, unexpected errors |
---
## Endpoints
### Health Check
```
GET /healthz
```
**Response:** `200 OK` — body: `ok`
```bash
curl http://localhost:9090/healthz
```
---
### Register Agent
```
POST /api/v1/agents/register
```
Register a new agent or update an existing one (upsert by `agent_id`).
**Request:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Unique identifier |
| agent_type | string | yes | `openclaw`, `claude-code`, `codex-cli`, `hermes`, `acp`, `shell`, or custom |
| hostname | string | yes | Machine hostname |
| capabilities | string[] | yes | e.g. `["code:rust", "review"]` |
| max_concurrency | u32 | yes | Max parallel tasks |
| metadata | object | no | Arbitrary key-value pairs |
**Response:** `200 OK`
```json
{
"agent_id": "worker-01",
"registry_token": "registry_a1b2c3d4..."
}
```
```bash
curl -X POST http://localhost:9090/api/v1/agents/register \
-H 'Content-Type: application/json' \
-d '{
"agent_id": "worker-01",
"agent_type": "codex-cli",
"hostname": "host-worker-01",
"capabilities": ["code:rust"],
"max_concurrency": 2
}'
```
---
### Heartbeat
```
POST /api/v1/agents/heartbeat
```
**Request:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Agent to update |
**Response:** `200 OK`
```json
{
"agent_id": "worker-01",
"status": "online",
"last_heartbeat_at": "2025-01-15T10:30:00Z"
}
```
**Errors:** `404` if agent not found.
```bash
curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-01"}'
```
---
### Deregister Agent
```
POST /api/v1/agents/deregister
```
Sets agent offline and requeues all its active tasks back to `created`.
**Request:**
| Field | Type | Required |
|-------|------|----------|
| agent_id | string | yes |
**Response:** `200 OK`
```json
{
"agent_id": "worker-01",
"status": "offline",
"requeued_tasks": 3
}
```
```bash
curl -X POST http://localhost:9090/api/v1/agents/deregister \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-01"}'
```
---
### List Agents
```
GET /api/v1/agents
```
**Query Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| capability | string | Filter by capability (e.g. `code:rust`) |
| status | string | Filter: `online`, `offline`, `draining` |
**Response:** `200 OK` — JSON array of [Agent](#agent-object) objects.
```bash
curl 'http://localhost:9090/api/v1/agents?status=online'
```
---
### List Tasks
```
GET /api/v1/tasks
```
**Query Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| status | string | Filter by status (e.g. `created`, `running`, `failed`) |
| agent_id | string | Filter by assigned agent |
**Response:** `200 OK` — JSON array of [Task](#task-object) objects. Ordered by `created_at` descending.
```bash
curl 'http://localhost:9090/api/v1/tasks?status=running'
```
---
### Get Task
```
GET /api/v1/tasks/{task_id}
```
**Response:** `200 OK` — single [Task](#task-object) object.
**Errors:** `404` if task not found.
```bash
curl http://localhost:9090/api/v1/tasks/org%2Frepo%2342
```
---
### Dequeue Task (http_pull only)
```
POST /api/v1/tasks/dequeue
```
Requires Bearer token if `http_pull_token` is configured. Only returns tasks with `execution_mode = http_pull`.
**Request:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Agent claiming the task |
| capabilities | string[] | no | Capabilities to match against task labels |
**Response:** `200 OK` with [Task](#task-object) object, or `204 No Content` if no matching task.
**Errors:** `401` if token required and missing/invalid.
```bash
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer my-token' \
-d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
```
---
### Update Task Status (http_pull only)
```
POST /api/v1/tasks/{task_id}/status
```
Requires Bearer token. Only works for tasks with `execution_mode = http_pull`.
**Request:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| status | string | yes | Target status: `running`, `review_pending`, etc. |
**Response:** `200 OK` — updated [Task](#task-object).
**Errors:** `400` if task is not `http_pull` mode or transition is invalid. `404` if task not found.
```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer my-token' \
-d '{"status": "running"}'
```
---
### Complete Task
```
POST /api/v1/tasks/{task_id}/complete
```
Works for both `ssh_cli` and `http_pull` tasks. Submit a receipt to mark the task done.
**Request:** A [Receipt](#receipt-object) object.
**Response:** `200 OK`
```json
{
"task_id": "org/repo#42",
"status": "completed"
}
```
**Errors:** `404` if task not found. `400` if task is not in a completable state.
```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
-H 'Content-Type: application/json' \
-d '{
"task_id": "org/repo#42",
"agent_id": "worker-01",
"status": "completed",
"duration_seconds": 120,
"summary": "Implemented feature X",
"artifacts": [
{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7"}
],
"error": null
}'
```
---
### Retry Task
```
POST /api/v1/tasks/{task_id}/retry
```
Retry a `failed` or `agent_lost` task. Transitions it back to `assigned`.
**Response:** `200 OK` — updated [Task](#task-object).
**Errors:** `400` if task status is not `failed` or `agent_lost`. `404` if task not found.
```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
```
---
### Submit Receipt
```
POST /api/v1/receipts
```
Submit a receipt for a task. Validates artifacts (e.g. checks PR exists via Forgejo API).
**Request:** A [Receipt](#receipt-object) object.
**Response:** `200 OK`
**Errors:** `404` if task not found. `400` if validation fails.
```bash
curl -X POST http://localhost:9090/api/v1/receipts \
-H 'Content-Type: application/json' \
-d '{
"task_id": "org/repo#42",
"agent_id": "worker-01",
"status": "completed",
"duration_seconds": 95,
"summary": "Fixed the bug",
"artifacts": [],
"error": null
}'
```
---
### Forgejo Webhook
```
POST /api/v1/webhooks/forgejo
```
Receives Forgejo webhook events. Requires HMAC-SHA256 signature header.
**Headers:** `X-Forgejo-Event` or `X-Gitea-Event` determines the event type.
**Supported events:**
| Event | Action |
|-------|--------|
| `issues` (opened) | Creates a task from the Issue (requires `agent:*` label) |
| `pull_request` (opened) | Sets task to `review_pending` (branch name → task_id) |
| `pull_request` (merged/closed with `merged: true`) | Sets task to `completed`, auto-generates receipt |
| `push` (to `task/*` branch) | Updates `last_activity_at` on the task |
**Response:** `200 OK`
```json
{
"accepted": true,
"task_id": "org/repo#42"
}
```
**Errors:** `401` if signature invalid. `400` if payload unparseable.
---
## Object Schemas
### Agent Object
```json
{
"agent_id": "worker-01",
"agent_type": "codex-cli",
"hostname": "host-worker-01",
"capabilities": ["code:rust"],
"max_concurrency": 2,
"current_tasks": 1,
"status": "online",
"last_heartbeat_at": "2025-01-15T10:30:00Z",
"registered_at": "2025-01-15T09:00:00Z",
"metadata": {}
}
```
### Task Object
```json
{
"task_id": "org/repo#42",
"source": "forgejo:org/repo#42",
"task_type": "code",
"priority": "normal",
"status": "created",
"execution_mode": "ssh_cli",
"assigned_agent_id": null,
"assigned_host": null,
"requirements": "Implement the feature described in the issue body",
"labels": ["agent:code", "code:rust"],
"branch_name": "task/org%2Frepo%2342",
"pr_title": "feat: Implement feature (#42)",
"created_at": "2025-01-15T10:00:00Z",
"assigned_at": null,
"started_at": null,
"completed_at": null,
"last_activity_at": null,
"retry_count": 0,
"max_retries": 2,
"review_count": 0,
"timeout_seconds": 1800
}
```
**Status values:** `created`, `assigned`, `running`, `review_pending`, `completed`, `failed`, `agent_lost`, `cancelled`
**Priority values:** `low`, `normal`, `high`, `urgent`
**Execution mode values:** `ssh_cli`, `http_pull`
### Receipt Object
```json
{
"task_id": "org/repo#42",
"agent_id": "worker-01",
"status": "completed",
"duration_seconds": 120,
"summary": "Implemented the feature",
"artifacts": [
{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7", "path": null, "description": null}
],
"error": null
}
```
**Receipt status values:** `completed`, `failed`, `partial`
**Artifact type values:** `pr`, `commit`, `file`, `comment`, `url`

View file

@ -0,0 +1,272 @@
# Agent Fleet — Agent Onboarding Guide
This guide explains how to integrate an agent with the Agent Fleet Orchestrator.
---
## Execution Modes
Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to `ssh_cli`).
| Aspect | `ssh_cli` | `http_pull` |
|--------|-----------|-------------|
| Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
| Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
| Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
| Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
| Best for | Codex CLI, Claude Code, OpenCode — agents with CLIs | OpenClaw/Jeeves, Hermes — agents with their own schedulers |
| Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |
---
## ssh_cli Workflow
### 1. Configure a Host
Add a `[[hosts]]` section to `config.toml` on the Orchestrator:
```toml
[[hosts]]
host_id = "host-worker-01"
hostname = "192.168.1.100"
ssh_user = "deploy"
ssh_port = 22
ssh_key_path = "/home/deploy/.ssh/id_ed25519"
work_dir = "/opt/agent-workspace"
agents = [
{ agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
]
```
For local execution (same machine as Orchestrator), use `hostname = "localhost"` — the Orchestrator uses a local subprocess instead of SSH.
### 2. Install the Agent CLI
The CLI binary must be available on the target host in `$PATH`. The Orchestrator checks availability with `which <binary>`.
Built-in CLI templates:
| Agent Type | CLI Command |
|------------|-------------|
| `codex-cli` | `codex exec --json '{prompt}'` |
| `claude-code` | `claude -p '{prompt}' --output-format json --dangerously-skip-permissions` |
Custom templates can be defined in `config.toml` under `[adapters]`.
### 3. Orchestrator Handles Everything
When a Forgejo Issue with an `agent:*` label arrives:
1. Orchestrator creates a task (`execution_mode = ssh_cli`)
2. Dispatch loop picks the task, selects a host by capability + load
3. SSH (or local subprocess) executes the CLI with a structured prompt
4. Output is parsed (Codex JSON or Claude JSON format)
5. Task status updates: `created``assigned``running``completed` (or `failed`)
### 4. What the Agent Receives (Structured Prompt)
The Orchestrator constructs this prompt and passes it as the `{prompt}` variable:
```
Task ID: org/repo#42
Type: code
Goal:
Implement the feature described in the issue body
Constraints:
- Execution mode: ssh_cli
- Labels: code:rust
- Branch: task/org%2Frepo%2342
- Expected output: JSON receipt
Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts
```
### 5. Expected CLI Output
The CLI must output JSON to stdout. The format depends on the parser:
**Codex JSON:**
```json
{"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}
```
**Claude JSON:**
```json
{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}
```
If output is not valid JSON, the task is marked `failed`.
---
## http_pull Workflow
### 1. Register
```bash
curl -X POST http://localhost:9090/api/v1/agents/register \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'
```
Response contains a `registry_token`. Keep it for subsequent API calls (if `http_pull_token` is configured, use that shared token instead).
### 2. Heartbeat (periodic)
Send a heartbeat every N seconds (default interval: 60s). If the Orchestrator doesn't receive one within `heartbeat_interval_secs × heartbeat_timeout_threshold`, the agent is marked offline and its tasks are requeued.
```bash
curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-03"}'
```
### 3. Dequeue a Task
```bash
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <token>' \
-d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
```
Returns `200 OK` with a Task object, or `204 No Content` if nothing available.
Only tasks with `execution_mode = http_pull` are returned.
### 4. Update Status While Working
```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <token>' \
-d '{"status": "running"}'
```
### 5. Complete the Task
```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
-H 'Content-Type: application/json' \
-d '{
"task_id": "org/repo#42",
"agent_id": "worker-03",
"status": "completed",
"duration_seconds": 180,
"summary": "Fixed the issue",
"artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
"error": null
}'
```
Or use the receipts endpoint:
```bash
curl -X POST http://localhost:9090/api/v1/receipts \
-H 'Content-Type: application/json' \
-d '<same receipt body>'
```
### 6. Deregister When Done
```bash
curl -X POST http://localhost:9090/api/v1/agents/deregister \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-03"}'
```
---
## Forgejo Integration
### How Issues Become Tasks
1. A Forgejo Issue is opened with a label matching `agent:*` (e.g. `agent:code`)
2. Forgejo sends an `issues` webhook to `POST /api/v1/webhooks/forgejo`
3. The `agent:*` label value becomes `task_type` (e.g. `code`)
4. Priority is inferred from labels: `priority:urgent`, `priority:high`, `priority:low` (default: `normal`)
5. A task is created with:
- `task_id` = `{repo_full_name}#{issue_number}` (e.g. `org/repo#42`)
- `execution_mode` = `ssh_cli` (default for Forgejo-originated tasks)
- `branch_name` = `task/{url_encoded_task_id}` (e.g. `task/org%2Frepo%2342`)
- `pr_title` = `feat: {issue_title} (#{issue_number})`
### Branch Naming Convention
- Branch: `task/{url_encoded_task_id}`
- Example: task `org/repo#42` → branch `task/org%2Frepo%2342`
### PR Lifecycle
| Event | Effect |
|-------|--------|
| PR opened (branch = `task/*`) | Task → `review_pending` |
| PR merged | Task → `completed`, auto receipt generated |
| Push to `task/*` branch | Task `last_activity_at` updated |
### Task Status Flow
```
created → assigned → running → review_pending → completed
↘ failed
↘ agent_lost
↘ cancelled
```
Any `failed` or `agent_lost` task can be retried via `POST /api/v1/tasks/{task_id}/retry` (transitions to `assigned`). Retry is limited by `max_retries` (default: 2).
---
## Structured Prompt Format (ssh_cli)
When the Orchestrator executes an agent via SSH, it constructs a structured prompt:
```
Task ID: <task_id>
Type: <task_type>
Goal:
<requirements>
Constraints:
- Execution mode: ssh_cli
- Labels: <comma-separated labels or <none>>
- Branch: <branch_name>
- Expected output: JSON receipt
Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts
```
The prompt is injected into the CLI template as the `{prompt}` variable. Other available variables: `{work_dir}`, `{task_id}`, `{branch}`.
---
## FAQ
**Q: How do I know which execution mode to use?**
A: If you have a CLI binary and run on a configured host → `ssh_cli`. If you have your own scheduler or run outside configured hosts → `http_pull`.
**Q: Do I need to register for ssh_cli mode?**
A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for `http_pull` agents.
**Q: What happens if my agent crashes during ssh_cli execution?**
A: The task is marked `failed`. If `retry_count < max_retries`, the dispatch loop will retry automatically.
**Q: What happens if my http_pull agent stops sending heartbeats?**
A: After `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and all its tasks are requeued with status `created`.
**Q: Can a task switch between execution modes?**
A: No. The `execution_mode` is set at creation time and cannot be changed.
**Q: How do I create a task manually?**
A: Use the Forgejo webhook flow (open an Issue with `agent:*` label), or directly insert into the database. There is no public "create task" API endpoint.
**Q: What label format triggers task creation?**
A: Issues must have a label starting with `agent:` (e.g. `agent:code`, `agent:review`). The value after `agent:` becomes the task type. Issues without such a label are ignored.
**Q: How does the review loop work?**
A: When a PR is opened (not merged), the task goes to `review_pending`. If the PR is not merged and the review cycle count exceeds `max_retries`, the task is marked `failed`. For `ssh_cli`, the Orchestrator re-dispatches automatically.