docs: add agent API reference, onboarding guide, and universal skill
- docs/agent-api-reference.md (473 lines): complete HTTP API reference for all 12 endpoints
- docs/agent-onboarding-guide.md (272 lines): ssh_cli and http_pull workflows, Forgejo integration
- skill/SKILL.md (281 lines): universal agent skill, platform-agnostic, curl-based examples

All content in English. No code changes.
parent
e39a16498c
commit
d1a746a8cb
9 changed files with 1250 additions and 0 deletions
473
docs/agent-api-reference.md
Normal file
|
|
@ -0,0 +1,473 @@
|
|||
# Agent Fleet — HTTP API Reference
|
||||
|
||||
Base URL: `http://<host>:9090`
|
||||
Content-Type: `application/json` for all request/response bodies unless noted.
|
||||
All timestamps are ISO 8601 (RFC 3339).
|
||||
|
||||
---
|
||||
|
||||
## Authentication
|
||||
|
||||
### http_pull Bearer Token
|
||||
|
||||
Endpoints specific to `http_pull` agents require a Bearer token in the `Authorization` header. The token is set in `config.toml` as `orchestrator.http_pull_token`. If no token is configured, authentication is skipped (open mode).
|
||||
|
||||
```
|
||||
Authorization: Bearer <http_pull_token>
|
||||
```
|
||||
|
||||
Affected endpoints: `POST /api/v1/tasks/dequeue`, `POST /api/v1/tasks/{task_id}/status`.
|
||||
|
||||
### Webhook HMAC-SHA256
|
||||
|
||||
The `POST /api/v1/webhooks/forgejo` endpoint requires an `X-Hub-Signature-256` header (or `X-Gitea-Signature` / `X-Forgejo-Signature`) whose value is `sha256=<hex_hmac>`, where `<hex_hmac>` is the hex-encoded HMAC-SHA256 of the request body, keyed with the configured `webhook_secret`.
|
||||
|
||||
```
|
||||
X-Hub-Signature-256: sha256=abcdef...
|
||||
```
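
The signature scheme described above can be sketched in Python as follows. This is a minimal illustration, not the Orchestrator's implementation: the function names `sign_body` and `verify_signature` are hypothetical, and it assumes the HMAC is computed over the raw request bytes exactly as sent.

```python
import hashlib
import hmac


def sign_body(webhook_secret: str, body: bytes) -> str:
    """Return the header value `sha256=<hex_hmac>` for a raw request body."""
    digest = hmac.new(webhook_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(webhook_secret: str, body: bytes, header_value: str) -> bool:
    """Compare a received header value against the expected one in constant time."""
    expected = sign_body(webhook_secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

A sender would place the result of `sign_body` in the signature header; a receiver would call `verify_signature` on the unparsed body before decoding the JSON.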
|
||||
|
||||
---
|
||||
|
||||
## Error Responses
|
||||
|
||||
All errors return JSON:
|
||||
|
||||
```json
|
||||
{ "error": "<human-readable message>" }
|
||||
```
|
||||
|
||||
| Status | Meaning | Trigger |
|--------|---------|---------|
| 400 | Bad Request | Invalid state transition, wrong execution_mode, malformed input |
| 401 | Unauthorized | Missing or invalid Bearer token for http_pull endpoints, or invalid webhook signature |
| 404 | Not Found | Task or agent does not exist |
| 500 | Internal Server Error | Database failure, lock poisoning, unexpected errors |
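
A client might read the error body defensively, along the lines of the sketch below. The helper name `error_message` and the fallback text are assumptions made for illustration; only the `{ "error": "..." }` shape comes from this reference.

```python
import json


def error_message(status: int, body: str) -> str:
    """Extract the `error` field from an error response, with a fallback."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    # Only trust the field if the body is a JSON object with a string `error`.
    if isinstance(payload, dict) and isinstance(payload.get("error"), str):
        return f"HTTP {status}: {payload['error']}"
    return f"HTTP {status}: (no error message in body)"
```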
|
||||
|
||||
---
|
||||
|
||||
## Endpoints
|
||||
|
||||
### Health Check
|
||||
|
||||
```
|
||||
GET /healthz
|
||||
```
|
||||
|
||||
**Response:** `200 OK` — body: `ok`
|
||||
|
||||
```bash
|
||||
curl http://localhost:9090/healthz
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Register Agent
|
||||
|
||||
```
|
||||
POST /api/v1/agents/register
|
||||
```
|
||||
|
||||
Register a new agent or update an existing one (upsert by `agent_id`).
|
||||
|
||||
**Request:**
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Unique identifier |
| agent_type | string | yes | `openclaw`, `claude-code`, `codex-cli`, `hermes`, `acp`, `shell`, or custom |
| hostname | string | yes | Machine hostname |
| capabilities | string[] | yes | e.g. `["code:rust", "review"]` |
| max_concurrency | u32 | yes | Max parallel tasks |
| metadata | object | no | Arbitrary key-value pairs |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_id": "worker-01",
|
||||
"registry_token": "registry_a1b2c3d4..."
|
||||
}
|
||||
```
|
||||
|
||||
```bash
curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{
    "agent_id": "worker-01",
    "agent_type": "codex-cli",
    "hostname": "host-worker-01",
    "capabilities": ["code:rust"],
    "max_concurrency": 2
  }'
```
|
||||
|
||||
---
|
||||
|
||||
### Heartbeat
|
||||
|
||||
```
|
||||
POST /api/v1/agents/heartbeat
|
||||
```
|
||||
|
||||
**Request:**
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Agent to update |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_id": "worker-01",
|
||||
"status": "online",
|
||||
"last_heartbeat_at": "2025-01-15T10:30:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
**Errors:** `404` if agent not found.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"agent_id": "worker-01"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Deregister Agent
|
||||
|
||||
```
|
||||
POST /api/v1/agents/deregister
|
||||
```
|
||||
|
||||
Sets agent offline and requeues all its active tasks back to `created`.
|
||||
|
||||
**Request:**
|
||||
|
||||
| Field | Type | Required |
|-------|------|----------|
| agent_id | string | yes |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_id": "worker-01",
|
||||
"status": "offline",
|
||||
"requeued_tasks": 3
|
||||
}
|
||||
```
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/agents/deregister \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"agent_id": "worker-01"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### List Agents
|
||||
|
||||
```
|
||||
GET /api/v1/agents
|
||||
```
|
||||
|
||||
**Query Parameters:**
|
||||
|
||||
| Param | Type | Description |
|-------|------|-------------|
| capability | string | Filter by capability (e.g. `code:rust`) |
| status | string | Filter: `online`, `offline`, `draining` |
|
||||
|
||||
**Response:** `200 OK` — JSON array of [Agent](#agent-object) objects.
|
||||
|
||||
```bash
|
||||
curl 'http://localhost:9090/api/v1/agents?status=online'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### List Tasks
|
||||
|
||||
```
|
||||
GET /api/v1/tasks
|
||||
```
|
||||
|
||||
**Query Parameters:**
|
||||
|
||||
| Param | Type | Description |
|-------|------|-------------|
| status | string | Filter by status (e.g. `created`, `running`, `failed`) |
| agent_id | string | Filter by assigned agent |
|
||||
|
||||
**Response:** `200 OK` — JSON array of [Task](#task-object) objects. Ordered by `created_at` descending.
|
||||
|
||||
```bash
|
||||
curl 'http://localhost:9090/api/v1/tasks?status=running'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Get Task
|
||||
|
||||
```
|
||||
GET /api/v1/tasks/{task_id}
|
||||
```
|
||||
|
||||
**Response:** `200 OK` — single [Task](#task-object) object.
|
||||
|
||||
**Errors:** `404` if task not found.
|
||||
|
||||
```bash
|
||||
curl http://localhost:9090/api/v1/tasks/org%2Frepo%2342
|
||||
```
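
Task IDs such as `org/repo#42` contain `/` and `#`, so they have to be percent-encoded before they go into the URL path, as in the curl example above. A minimal Python sketch (the helper name `task_path` is hypothetical):

```python
from urllib.parse import quote, unquote


def task_path(task_id: str) -> str:
    """Build the path for GET /api/v1/tasks/{task_id} with the ID percent-encoded."""
    # safe="" makes quote() encode "/" too, which it leaves untouched by default.
    return "/api/v1/tasks/" + quote(task_id, safe="")


print(task_path("org/repo#42"))  # /api/v1/tasks/org%2Frepo%2342
```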
|
||||
|
||||
---
|
||||
|
||||
### Dequeue Task (http_pull only)
|
||||
|
||||
```
|
||||
POST /api/v1/tasks/dequeue
|
||||
```
|
||||
|
||||
Requires Bearer token if `http_pull_token` is configured. Only returns tasks with `execution_mode = http_pull`.
|
||||
|
||||
**Request:**
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| agent_id | string | yes | Agent claiming the task |
| capabilities | string[] | no | Capabilities to match against task labels |
|
||||
|
||||
**Response:** `200 OK` with [Task](#task-object) object, or `204 No Content` if no matching task.
|
||||
|
||||
**Errors:** `401` if token required and missing/invalid.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer my-token' \
|
||||
-d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
|
||||
```
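
For agents that do not shell out to curl, the same call can be sketched with Python's standard library. The function names and the 10-second timeout are illustrative assumptions; the URL, body fields, and the `200`/`204` behaviour are taken from this section.

```python
import json
import urllib.request


def build_dequeue_request(base_url, agent_id, capabilities, token=None):
    """Build the POST /api/v1/tasks/dequeue request without sending it."""
    body = {"agent_id": agent_id, "capabilities": capabilities}
    headers = {"Content-Type": "application/json"}
    if token:  # open mode: no Authorization header when no token is configured
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/api/v1/tasks/dequeue",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def dequeue(base_url, agent_id, capabilities, token=None):
    """Return the claimed Task as a dict, or None on 204 No Content."""
    req = build_dequeue_request(base_url, agent_id, capabilities, token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as 401.
    with urllib.request.urlopen(req, timeout=10) as resp:
        if resp.status == 204:
            return None
        return json.loads(resp.read())
```

Splitting request construction from sending keeps the first half checkable without a running Orchestrator.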
|
||||
|
||||
---
|
||||
|
||||
### Update Task Status (http_pull only)
|
||||
|
||||
```
|
||||
POST /api/v1/tasks/{task_id}/status
|
||||
```
|
||||
|
||||
Requires a Bearer token if `http_pull_token` is configured. Only works for tasks with `execution_mode = http_pull`.
|
||||
|
||||
**Request:**
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| status | string | yes | Target status: `running`, `review_pending`, etc. |
|
||||
|
||||
**Response:** `200 OK` — updated [Task](#task-object).
|
||||
|
||||
**Errors:** `400` if the task is not in `http_pull` mode or the transition is invalid. `401` if a token is required and missing or invalid. `404` if the task is not found.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer my-token' \
|
||||
-d '{"status": "running"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Complete Task
|
||||
|
||||
```
|
||||
POST /api/v1/tasks/{task_id}/complete
|
||||
```
|
||||
|
||||
Works for both `ssh_cli` and `http_pull` tasks. Submit a receipt to mark the task done.
|
||||
|
||||
**Request:** A [Receipt](#receipt-object) object.
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
```json
|
||||
{
|
||||
"task_id": "org/repo#42",
|
||||
"status": "completed"
|
||||
}
|
||||
```
|
||||
|
||||
**Errors:** `404` if task not found. `400` if task is not in a completable state.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{
|
||||
"task_id": "org/repo#42",
|
||||
"agent_id": "worker-01",
|
||||
"status": "completed",
|
||||
"duration_seconds": 120,
|
||||
"summary": "Implemented feature X",
|
||||
"artifacts": [
|
||||
{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7"}
|
||||
],
|
||||
"error": null
|
||||
}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Retry Task
|
||||
|
||||
```
|
||||
POST /api/v1/tasks/{task_id}/retry
|
||||
```
|
||||
|
||||
Retry a `failed` or `agent_lost` task. Transitions it back to `assigned`.
|
||||
|
||||
**Response:** `200 OK` — updated [Task](#task-object).
|
||||
|
||||
**Errors:** `400` if task status is not `failed` or `agent_lost`. `404` if task not found.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Submit Receipt
|
||||
|
||||
```
|
||||
POST /api/v1/receipts
|
||||
```
|
||||
|
||||
Submit a receipt for a task. Validates artifacts (e.g. checks PR exists via Forgejo API).
|
||||
|
||||
**Request:** A [Receipt](#receipt-object) object.
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
**Errors:** `404` if task not found. `400` if validation fails.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/receipts \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{
|
||||
"task_id": "org/repo#42",
|
||||
"agent_id": "worker-01",
|
||||
"status": "completed",
|
||||
"duration_seconds": 95,
|
||||
"summary": "Fixed the bug",
|
||||
"artifacts": [],
|
||||
"error": null
|
||||
}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Forgejo Webhook
|
||||
|
||||
```
|
||||
POST /api/v1/webhooks/forgejo
|
||||
```
|
||||
|
||||
Receives Forgejo webhook events. Requires HMAC-SHA256 signature header.
|
||||
|
||||
**Headers:** `X-Forgejo-Event` or `X-Gitea-Event` determines the event type.
|
||||
|
||||
**Supported events:**
|
||||
|
||||
| Event | Action |
|-------|--------|
| `issues` (opened) | Creates a task from the Issue (requires `agent:*` label) |
| `pull_request` (opened) | Sets task to `review_pending` (branch name → task_id) |
| `pull_request` (merged/closed with `merged: true`) | Sets task to `completed`, auto-generates receipt |
| `push` (to `task/*` branch) | Updates `last_activity_at` on the task |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
```json
|
||||
{
|
||||
"accepted": true,
|
||||
"task_id": "org/repo#42"
|
||||
}
|
||||
```
|
||||
|
||||
**Errors:** `401` if signature invalid. `400` if payload unparseable.
|
||||
|
||||
---
|
||||
|
||||
## Object Schemas
|
||||
|
||||
### Agent Object
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_id": "worker-01",
|
||||
"agent_type": "codex-cli",
|
||||
"hostname": "host-worker-01",
|
||||
"capabilities": ["code:rust"],
|
||||
"max_concurrency": 2,
|
||||
"current_tasks": 1,
|
||||
"status": "online",
|
||||
"last_heartbeat_at": "2025-01-15T10:30:00Z",
|
||||
"registered_at": "2025-01-15T09:00:00Z",
|
||||
"metadata": {}
|
||||
}
|
||||
```
|
||||
|
||||
### Task Object
|
||||
|
||||
```json
|
||||
{
|
||||
"task_id": "org/repo#42",
|
||||
"source": "forgejo:org/repo#42",
|
||||
"task_type": "code",
|
||||
"priority": "normal",
|
||||
"status": "created",
|
||||
"execution_mode": "ssh_cli",
|
||||
"assigned_agent_id": null,
|
||||
"assigned_host": null,
|
||||
"requirements": "Implement the feature described in the issue body",
|
||||
"labels": ["agent:code", "code:rust"],
|
||||
"branch_name": "task/org%2Frepo%2342",
|
||||
"pr_title": "feat: Implement feature (#42)",
|
||||
"created_at": "2025-01-15T10:00:00Z",
|
||||
"assigned_at": null,
|
||||
"started_at": null,
|
||||
"completed_at": null,
|
||||
"last_activity_at": null,
|
||||
"retry_count": 0,
|
||||
"max_retries": 2,
|
||||
"review_count": 0,
|
||||
"timeout_seconds": 1800
|
||||
}
|
||||
```
|
||||
|
||||
**Status values:** `created`, `assigned`, `running`, `review_pending`, `completed`, `failed`, `agent_lost`, `cancelled`
|
||||
|
||||
**Priority values:** `low`, `normal`, `high`, `urgent`
|
||||
|
||||
**Execution mode values:** `ssh_cli`, `http_pull`
|
||||
|
||||
### Receipt Object
|
||||
|
||||
```json
|
||||
{
|
||||
"task_id": "org/repo#42",
|
||||
"agent_id": "worker-01",
|
||||
"status": "completed",
|
||||
"duration_seconds": 120,
|
||||
"summary": "Implemented the feature",
|
||||
"artifacts": [
|
||||
{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/7", "path": null, "description": null}
|
||||
],
|
||||
"error": null
|
||||
}
|
||||
```
|
||||
|
||||
**Receipt status values:** `completed`, `failed`, `partial`
|
||||
|
||||
**Artifact type values:** `pr`, `commit`, `file`, `comment`, `url`
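
An agent can check a receipt against the enumerated values above before submitting it. The sketch below is an assumption-laden illustration: the name `receipt_problems` is hypothetical, and treating `task_id`, `agent_id`, and `status` as required strings is a guess, since this reference does not state which Receipt fields are mandatory.

```python
RECEIPT_STATUSES = {"completed", "failed", "partial"}
ARTIFACT_TYPES = {"pr", "commit", "file", "comment", "url"}


def receipt_problems(receipt: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Assumption: these three fields are required and must be strings.
    for field in ("task_id", "agent_id", "status"):
        if not isinstance(receipt.get(field), str):
            problems.append(f"missing or non-string field: {field}")
    status = receipt.get("status")
    if isinstance(status, str) and status not in RECEIPT_STATUSES:
        problems.append(f"unknown receipt status: {status}")
    for i, artifact in enumerate(receipt.get("artifacts") or []):
        if not isinstance(artifact, dict) or artifact.get("artifact_type") not in ARTIFACT_TYPES:
            problems.append(f"artifacts[{i}]: unknown artifact_type")
    return problems
```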
|
||||
272
docs/agent-onboarding-guide.md
Normal file
|
|
@ -0,0 +1,272 @@
|
|||
# Agent Fleet — Agent Onboarding Guide
|
||||
|
||||
This guide explains how to integrate an agent with the Agent Fleet Orchestrator.
|
||||
|
||||
---
|
||||
|
||||
## Execution Modes
|
||||
|
||||
Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to `ssh_cli`).
|
||||
|
||||
| Aspect | `ssh_cli` | `http_pull` |
|--------|-----------|-------------|
| Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
| Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
| Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
| Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
| Best for | Codex CLI, Claude Code, OpenCode (agents with CLIs) | OpenClaw/Jeeves, Hermes (agents with their own schedulers) |
| Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |
|
||||
|
||||
---
|
||||
|
||||
## ssh_cli Workflow
|
||||
|
||||
### 1. Configure a Host
|
||||
|
||||
Add a `[[hosts]]` section to `config.toml` on the Orchestrator:
|
||||
|
||||
```toml
[[hosts]]
host_id = "host-worker-01"
hostname = "192.168.1.100"
ssh_user = "deploy"
ssh_port = 22
ssh_key_path = "/home/deploy/.ssh/id_ed25519"
work_dir = "/opt/agent-workspace"
agents = [
  { agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
]
```
|
||||
|
||||
For local execution (same machine as Orchestrator), use `hostname = "localhost"` — the Orchestrator uses a local subprocess instead of SSH.
|
||||
|
||||
### 2. Install the Agent CLI
|
||||
|
||||
The CLI binary must be available on the target host in `$PATH`. The Orchestrator checks availability with `which <binary>`.
|
||||
|
||||
Built-in CLI templates:
|
||||
|
||||
| Agent Type | CLI Command |
|------------|-------------|
| `codex-cli` | `codex exec --json '{prompt}'` |
| `claude-code` | `claude -p '{prompt}' --output-format json --dangerously-skip-permissions` |
|
||||
|
||||
Custom templates can be defined in `config.toml` under `[adapters]`.
|
||||
|
||||
### 3. Orchestrator Handles Everything
|
||||
|
||||
When a Forgejo Issue with an `agent:*` label arrives:
|
||||
|
||||
1. Orchestrator creates a task (`execution_mode = ssh_cli`)
|
||||
2. Dispatch loop picks the task, selects a host by capability + load
|
||||
3. SSH (or local subprocess) executes the CLI with a structured prompt
|
||||
4. Output is parsed (Codex JSON or Claude JSON format)
|
||||
5. Task status updates: `created` → `assigned` → `running` → `completed` (or `failed`)
|
||||
|
||||
### 4. What the Agent Receives (Structured Prompt)
|
||||
|
||||
The Orchestrator constructs this prompt and passes it as the `{prompt}` variable:
|
||||
|
||||
```
|
||||
Task ID: org/repo#42
|
||||
Type: code
|
||||
Goal:
|
||||
Implement the feature described in the issue body
|
||||
|
||||
Constraints:
|
||||
- Execution mode: ssh_cli
|
||||
- Labels: code:rust
|
||||
- Branch: task/org%2Frepo%2342
|
||||
- Expected output: JSON receipt
|
||||
|
||||
Validation:
|
||||
- Run relevant tests if code changed
|
||||
- Summarize changes and artifacts
|
||||
```
|
||||
|
||||
### 5. Expected CLI Output
|
||||
|
||||
The CLI must output JSON to stdout. The format depends on the parser:
|
||||
|
||||
**Codex JSON:**
|
||||
```json
|
||||
{"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}
|
||||
```
|
||||
|
||||
**Claude JSON:**
|
||||
```json
|
||||
{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}
|
||||
```
|
||||
|
||||
If output is not valid JSON, the task is marked `failed`.
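
The parsing rule above can be pictured roughly as follows. This is a sketch under assumptions, not the Orchestrator's parser: the name `parse_cli_output`, the shape of the fallback dict, and the rejection of non-object JSON are all illustrative choices.

```python
import json


def parse_cli_output(stdout: str) -> dict:
    """Parse CLI stdout as a JSON receipt; fall back to a `failed` result."""
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError as exc:
        return {"status": "failed", "summary": "", "error": f"invalid JSON output: {exc.msg}"}
    # Assumption: a receipt must be a JSON object, so arrays and scalars are rejected.
    if not isinstance(parsed, dict):
        return {"status": "failed", "summary": "", "error": "output is not a JSON object"}
    return parsed
```

In practice this means a CLI wrapper should print nothing but the JSON document on stdout and send logs to stderr.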
|
||||
|
||||
---
|
||||
|
||||
## http_pull Workflow
|
||||
|
||||
### 1. Register
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/agents/register \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'
|
||||
```
|
||||
|
||||
The response contains a `registry_token`. Keep it for subsequent API calls; if `http_pull_token` is configured, use that shared token instead.
|
||||
|
||||
### 2. Heartbeat (periodic)
|
||||
|
||||
Send a heartbeat at a fixed interval (default: 60s). If the Orchestrator receives none within `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and its tasks are requeued.
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"agent_id": "worker-03"}'
|
||||
```
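
The offline rule is plain arithmetic on the two config values. The sketch below shows the check under stated assumptions: the function name `is_stale` is hypothetical, and the threshold of 3 used in the usage lines is an example value, not a documented default.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_heartbeat_at, now, heartbeat_interval_secs, heartbeat_timeout_threshold):
    """True once more than interval × threshold seconds have passed since the last heartbeat."""
    allowed = timedelta(seconds=heartbeat_interval_secs * heartbeat_timeout_threshold)
    return now - last_heartbeat_at > allowed


last = datetime(2025, 1, 15, 10, 30, 0, tzinfo=timezone.utc)
# With a 60s interval and an example threshold of 3, the window is 180 seconds.
print(is_stale(last, last + timedelta(seconds=179), 60, 3))  # False
print(is_stale(last, last + timedelta(seconds=181), 60, 3))  # True
```

An agent that sends heartbeats comfortably inside one interval leaves room for a dropped request or two before it is marked offline.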
|
||||
|
||||
### 3. Dequeue a Task
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer <token>' \
|
||||
-d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
|
||||
```
|
||||
|
||||
Returns `200 OK` with a Task object, or `204 No Content` if nothing available.
|
||||
|
||||
Only tasks with `execution_mode = http_pull` are returned.
|
||||
|
||||
### 4. Update Status While Working
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer <token>' \
|
||||
-d '{"status": "running"}'
|
||||
```
|
||||
|
||||
### 5. Complete the Task
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{
|
||||
"task_id": "org/repo#42",
|
||||
"agent_id": "worker-03",
|
||||
"status": "completed",
|
||||
"duration_seconds": 180,
|
||||
"summary": "Fixed the issue",
|
||||
"artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
|
||||
"error": null
|
||||
}'
|
||||
```
|
||||
|
||||
Or use the receipts endpoint:
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/receipts \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '<same receipt body>'
|
||||
```
|
||||
|
||||
### 6. Deregister When Done
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:9090/api/v1/agents/deregister \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"agent_id": "worker-03"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Forgejo Integration
|
||||
|
||||
### How Issues Become Tasks
|
||||
|
||||
1. A Forgejo Issue is opened with a label matching `agent:*` (e.g. `agent:code`)
|
||||
2. Forgejo sends an `issues` webhook to `POST /api/v1/webhooks/forgejo`
|
||||
3. The `agent:*` label value becomes `task_type` (e.g. `code`)
|
||||
4. Priority is inferred from labels: `priority:urgent`, `priority:high`, `priority:low` (default: `normal`)
|
||||
5. A task is created with:
|
||||
- `task_id` = `{repo_full_name}#{issue_number}` (e.g. `org/repo#42`)
|
||||
- `execution_mode` = `ssh_cli` (default for Forgejo-originated tasks)
|
||||
- `branch_name` = `task/{url_encoded_task_id}` (e.g. `task/org%2Frepo%2342`)
|
||||
- `pr_title` = `feat: {issue_title} (#{issue_number})`
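
The mapping in the steps above can be sketched as a pure function. The field formulas come from the list; everything else is an assumption made for illustration, including the name `task_from_issue`, taking the first `agent:*` label when several are present, and the precedence order of priority labels.

```python
from urllib.parse import quote


def task_from_issue(repo_full_name, issue_number, issue_title, labels):
    """Derive task fields from a Forgejo Issue, or return None if it has no agent:* label."""
    agent_labels = [label for label in labels if label.startswith("agent:")]
    if not agent_labels:
        return None  # Issues without an agent:* label are ignored
    task_id = f"{repo_full_name}#{issue_number}"
    priority = "normal"
    # Assumption: when several priority labels are present, the most urgent wins.
    for level in ("urgent", "high", "low"):
        if f"priority:{level}" in labels:
            priority = level
            break
    return {
        "task_id": task_id,
        "task_type": agent_labels[0].split(":", 1)[1],
        "priority": priority,
        "execution_mode": "ssh_cli",
        "branch_name": "task/" + quote(task_id, safe=""),
        "pr_title": f"feat: {issue_title} (#{issue_number})",
    }
```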
|
||||
|
||||
### Branch Naming Convention
|
||||
|
||||
- Branch: `task/{url_encoded_task_id}`
|
||||
- Example: task `org/repo#42` → branch `task/org%2Frepo%2342`
|
||||
|
||||
### PR Lifecycle
|
||||
|
||||
| Event | Effect |
|-------|--------|
| PR opened (branch = `task/*`) | Task → `review_pending` |
| PR merged | Task → `completed`, auto receipt generated |
| Push to `task/*` branch | Task `last_activity_at` updated |
|
||||
|
||||
### Task Status Flow
|
||||
|
||||
```
|
||||
created → assigned → running → review_pending → completed
|
||||
↘ failed
|
||||
↘ agent_lost
|
||||
↘ cancelled
|
||||
```
|
||||
|
||||
Any `failed` or `agent_lost` task can be retried via `POST /api/v1/tasks/{task_id}/retry` (transitions to `assigned`). Retry is limited by `max_retries` (default: 2).
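
A client can check retry eligibility locally before calling the endpoint. This sketch assumes the limit is enforced as `retry_count < max_retries`, which matches the wording in the FAQ but is not spelled out for the retry endpoint itself; the name `can_retry` is hypothetical.

```python
RETRYABLE_STATUSES = {"failed", "agent_lost"}


def can_retry(task: dict) -> bool:
    """True if the task is in a retryable status and has retries left."""
    return task["status"] in RETRYABLE_STATUSES and task["retry_count"] < task["max_retries"]
```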
|
||||
|
||||
---
|
||||
|
||||
## Structured Prompt Format (ssh_cli)
|
||||
|
||||
When the Orchestrator executes an agent via SSH, it constructs a structured prompt:
|
||||
|
||||
```
|
||||
Task ID: <task_id>
|
||||
Type: <task_type>
|
||||
Goal:
|
||||
<requirements>
|
||||
|
||||
Constraints:
|
||||
- Execution mode: ssh_cli
|
||||
- Labels: <comma-separated labels or <none>>
|
||||
- Branch: <branch_name>
|
||||
- Expected output: JSON receipt
|
||||
|
||||
Validation:
|
||||
- Run relevant tests if code changed
|
||||
- Summarize changes and artifacts
|
||||
```
|
||||
|
||||
The prompt is injected into the CLI template as the `{prompt}` variable. Other available variables: `{work_dir}`, `{task_id}`, `{branch}`.
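
Prompt construction and template injection could look roughly like the sketch below. It is an illustration only: `build_prompt` and `render_template` are hypothetical names, the `", "` label separator is a guess, and the substitution does no shell escaping, which a real implementation would need for prompts containing quotes.

```python
def build_prompt(task: dict) -> str:
    """Assemble the structured prompt from Task fields."""
    labels = ", ".join(task["labels"]) if task["labels"] else "<none>"
    return "\n".join([
        f"Task ID: {task['task_id']}",
        f"Type: {task['task_type']}",
        "Goal:",
        task["requirements"],
        "",
        "Constraints:",
        "- Execution mode: ssh_cli",
        f"- Labels: {labels}",
        f"- Branch: {task['branch_name']}",
        "- Expected output: JSON receipt",
        "",
        "Validation:",
        "- Run relevant tests if code changed",
        "- Summarize changes and artifacts",
    ])


def render_template(template: str, variables: dict) -> str:
    """Replace each {name} placeholder; no shell escaping is applied here."""
    for name, value in variables.items():
        template = template.replace("{" + name + "}", value)
    return template
```

For example, `render_template("codex exec --json '{prompt}'", {"prompt": build_prompt(task)})` would yield the command line for a `codex-cli` agent under these assumptions.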
|
||||
|
||||
---
|
||||
|
||||
## FAQ
|
||||
|
||||
**Q: How do I know which execution mode to use?**
|
||||
A: If you have a CLI binary and run on a configured host → `ssh_cli`. If you have your own scheduler or run outside configured hosts → `http_pull`.
|
||||
|
||||
**Q: Do I need to register for ssh_cli mode?**
|
||||
A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for `http_pull` agents.
|
||||
|
||||
**Q: What happens if my agent crashes during ssh_cli execution?**
|
||||
A: The task is marked `failed`. If `retry_count < max_retries`, the dispatch loop will retry automatically.
|
||||
|
||||
**Q: What happens if my http_pull agent stops sending heartbeats?**
|
||||
A: After `heartbeat_interval_secs × heartbeat_timeout_threshold` seconds, the agent is marked offline and all its tasks are requeued with status `created`.
|
||||
|
||||
**Q: Can a task switch between execution modes?**
|
||||
A: No. The `execution_mode` is set at creation time and cannot be changed.
|
||||
|
||||
**Q: How do I create a task manually?**
|
||||
A: Use the Forgejo webhook flow (open an Issue with `agent:*` label), or directly insert into the database. There is no public "create task" API endpoint.
|
||||
|
||||
**Q: What label format triggers task creation?**
|
||||
A: Issues must have a label starting with `agent:` (e.g. `agent:code`, `agent:review`). The value after `agent:` becomes the task type. Issues without such a label are ignored.
|
||||
|
||||
**Q: How does the review loop work?**
|
||||
A: When a PR is opened but not yet merged, the task moves to `review_pending`. If the PR remains unmerged and the review cycle count exceeds `max_retries`, the task is marked `failed`. For `ssh_cli` tasks, the Orchestrator re-dispatches automatically.
|
||||
2
openspec/changes/agent-onboarding-docs/.openspec.yaml
Normal file
|
|
@ -0,0 +1,2 @@
|
|||
schema: spec-driven
|
||||
created: 2026-05-12
|
||||
65
openspec/changes/agent-onboarding-docs/design.md
Normal file
|
|
@ -0,0 +1,65 @@
|
|||
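## Context

The core functionality of agent-fleet is implemented and running on arm0, but no Agent knows how to use it. The project's usability depends entirely on whether Agents can integrate with it correctly.

Two deliverables are needed:

1. **API reference**: an HTTP API manual written for Agents
2. **Universal Skill**: a capability description that follows the standard skill specification and is not tied to a specific platform

Key constraint: the Skill must be platform-agnostic. The Team Leader role is not necessarily filled by OpenClaw; Codex, Claude Code, OpenCode, or Hermes Agent may each act as the scheduler.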
|
||||
|
||||
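## Goals / Non-Goals

**Goals:**

- Provide a complete, accurate, directly usable API reference
- Provide a universal Skill that tells any Agent how to interact with agent-fleet once loaded
- Cover the full workflow of both execution modes (ssh_cli + http_pull)
- Cover the Git workflow of the Forgejo integration

**Non-Goals:**

- No operations documentation for humans (deployment, configuration, troubleshooting) → that is a separate change
- No platform-specific integration scripts (such as an install script for an OpenClaw skill)
- No SDK or client library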
|
||||
|
||||
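## Decisions

### Decision 1: Universal Skill specification, not tied to a platform

**Choice**: The Skill uses the standard YAML frontmatter + Markdown body format

**Rationale**:

- All mainstream Agent platforms support this format (OpenClaw, Claude Code, Codex CLI, OpenCode)
- It contains no platform-specific syntax; each Agent converts it as needed
- curl is a common language that every Agent can understand

**Alternatives**:

- An OpenClaw-specific skill: limits the scope of use
- A separate skill per platform: duplicated effort and prone to inconsistency

### Decision 2: Keep the documentation inside the repo

**Choice**: The `docs/` directory holds the API reference and onboarding guide; the `skill/` directory holds SKILL.md

**Rationale**:

- Same repository as the code, so versions stay aligned
- Agents can read the documentation directly through Forgejo
- Each platform can fork or symlink the Skill

### Decision 3: Documentation auto-generated from code + manual additions

**Choice**: The API endpoint list is maintained by hand (Phase 1); generating it from code comments may be considered later

**Rationale**:

- Phase 1 has a limited number of endpoints (~12), so manual maintenance is cheap
- Auto-generation needs an extra toolchain (such as `utoipa`), which is not worth the investment in Phase 1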
|
||||
|
||||
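## Risks / Trade-offs

- **[Stale documentation] Documentation may drift from the code after changes** → Documentation lives in the same repository as the code and is checked during PR review
- **[Limits of universality] Being universal means platform features cannot be exploited** → Universality is the right choice; platform-specific optimization is left to each Agent

## Open Questions

_(resolved)_

- ~~Does the Skill need multilingual versions (Chinese/English)?~~ → Everything is in English. Reason: LLM training corpora are predominantly English, and English is more token-efficient with less semantic ambiguity. The Skill's audience is Agents, not humans.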
|
||||
37
openspec/changes/agent-onboarding-docs/proposal.md
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
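## Why

All core agent-fleet features (dual execution model, Forgejo integration, Receipt validation) are implemented and working on arm0, but no Agent knows how to use them.

Current state:

- The API endpoints are implemented (register, heartbeat, dequeue, status, receipt, webhook, and so on)
- Both execution modes (ssh_cli + http_pull) are implemented
- No documentation tells an Agent how to connect, how to call the API, or how to fit into the workflow

The project's usability depends entirely on whether Agents can integrate correctly. Without documentation and a skill, agent-fleet is an API that nobody knows how to use.

What is needed is also a **universal skill** (not tied to OpenClaw), because:

- The Team Leader role is not necessarily filled by OpenClaw
- Codex, Claude Code, OpenCode, Hermes Agent, and others all need to understand and use agent-fleet
- A Skill is a general description of Agent capability and follows a general specification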
|
||||
|
||||
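## What Changes

- Add `docs/agent-api-reference.md`: a complete HTTP API reference that any Agent can read
- Add `docs/agent-onboarding-guide.md`: an Agent onboarding guide covering the full workflow of both modes
- Add the `skill/` directory: a universal Agent Skill definition (SKILL.md) that follows the general skill specification
- Skill contents: how to call the API, authentication, task lifecycle, Forgejo workflow, error handling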
|
||||
|
||||
## Capabilities

### New Capabilities

- `agent-api-reference`: complete HTTP API reference (endpoints, request/response formats, error codes, examples)
- `agent-skill`: universal Agent Skill definition describing how an Agent interacts with agent-fleet

### Modified Capabilities

_(none)_
|
||||
|
||||
## Impact

- **Documentation**: 2 new Markdown documents + 1 Skill definition
- **Code**: no code changes
- **Project**: the Skill directory is a new structure; where it belongs in the repo may need further thought
|
||||
|
|
@ -0,0 +1,43 @@
|
|||
## ADDED Requirements

### Requirement: Complete HTTP API reference documentation

The project SHALL provide a complete HTTP API reference (`docs/agent-api-reference.md`) that any Agent can read. The document SHALL cover all public endpoints, including request/response formats, error codes, and examples.

#### Scenario: Agent reads API reference to understand available endpoints

- **WHEN** an Agent reads `docs/agent-api-reference.md`
- **THEN** the document SHALL list all endpoints: healthz, agents/register, agents/heartbeat, agents/deregister, agents (GET), tasks (GET), tasks/{id} (GET), tasks/dequeue, tasks/{id}/status, tasks/{id}/retry, tasks/{id}/complete, receipts, webhooks/forgejo
- **AND** each endpoint SHALL include: HTTP method, URL, request body format, response format, error codes, curl example

#### Scenario: Agent checks authentication requirements

- **WHEN** an Agent reads the authentication section of the API reference
- **THEN** the document SHALL state: http_pull mode requires a Bearer token (obtained at registration), ssh_cli mode requires no Agent authentication, and the webhook endpoint requires an HMAC-SHA256 signature

#### Scenario: Agent understands error responses

- **WHEN** an Agent receives an error response
- **THEN** the document SHALL list all error codes: 401 Unauthorized, 403 Forbidden, 404 Not Found, 400 Bad Request, 500 Internal Server Error
- **AND** each error code SHALL include a description of the triggering scenario
|
||||
|
||||
### Requirement: Agent onboarding guide

The project SHALL provide an Agent onboarding guide (`docs/agent-onboarding-guide.md`) that describes the full workflow of both execution modes.

#### Scenario: New agent team leader reads onboarding guide

- **WHEN** a new Team Leader Agent (such as Jeeves) reads the onboarding guide
- **THEN** the document SHALL describe the differences between the two execution modes and when to use each:
  - ssh_cli: the Orchestrator schedules actively; suited to Agents with a CLI, such as Codex, Claude Code, and OpenCode
  - http_pull: the Agent pulls on its own; suited to Agents with their own scheduler, such as OpenClaw/Jeeves and Hermes

#### Scenario: Agent follows ssh_cli workflow

- **WHEN** an Agent integrates in ssh_cli mode
- **THEN** the document SHALL describe the full flow: configure host → Agent installs CLI → Orchestrator discovers it automatically → tasks are assigned and executed automatically → PR created → webhook callback

#### Scenario: Agent follows http_pull workflow

- **WHEN** an Agent integrates in http_pull mode
- **THEN** the document SHALL describe the full flow: call the register API → obtain token → send periodic heartbeats → call dequeue to pull a task → execute → call the complete/receipt API

#### Scenario: Agent understands Forgejo integration

- **WHEN** an Agent reads the Forgejo integration section
- **THEN** the document SHALL describe: how an Issue becomes a task (webhook → label parsing), how a task maps to a Git branch (`task/{task_id}`), and how the PR lifecycle drives status updates (opened → review_pending, merged → completed)

#### Scenario: Agent understands structured prompt format

- **WHEN** an Agent in ssh_cli mode needs to understand the incoming prompt
- **THEN** the document SHALL describe the structured prompt format: Task ID, Type, Goal, Constraints, Branch, Expected output, Validation
|
||||
|
|
@ -0,0 +1,41 @@
|
|||
## ADDED Requirements

### Requirement: Universal Agent Skill definition

The project SHALL provide a universal Agent Skill (`skill/SKILL.md`) that follows the standard skill specification (YAML frontmatter + Markdown body). The Skill SHALL NOT be tied to any specific Agent platform (OpenClaw, Claude Code, Codex, OpenCode, Hermes, and others can all use it).

#### Scenario: Any agent discovers and loads the skill

- **WHEN** any Agent (Codex, Claude Code, OpenCode, Hermes, and so on) loads skill/SKILL.md
- **THEN** the Skill SHALL contain YAML frontmatter: `name: agent-fleet-integration`, and a `description` stating its purpose and trigger conditions
- **AND** the Skill body SHALL use standard Markdown (headings, code blocks, examples)

#### Scenario: Skill teaches agent how to interact with agent-fleet

- **WHEN** an Agent reads the Skill
- **THEN** the Skill SHALL contain a Quick Start section (the simplest integration example, 3 steps or fewer)
- **AND** an Instructions section (the detailed API call flow)
- **AND** an Examples section (a curl example for each operation)
- **AND** a Guidelines section (error handling, retry strategy, authentication rules)

#### Scenario: Skill covers both execution modes

- **WHEN** an Agent needs to choose an execution mode
- **THEN** the Skill SHALL clearly explain the difference between ssh_cli and http_pull
- **AND** guide the Agent in deciding which mode to use:
  - Has a CLI and runs on a configured host → ssh_cli (scheduled by the Orchestrator)
  - Has its own scheduler or runs outside the configured hosts → http_pull (pulls on its own)
|
||||
|
||||
#### Scenario: Skill includes Forgejo workflow
|
||||
- **WHEN** Agent 需要理解 Git 工作流
|
||||
- **THEN** Skill SHALL 描述分支命名约定(`task/{task_id}`)、PR 创建流程、webhook 触发机制
|
||||
|
||||
#### Scenario: Skill includes error recovery guidance
|
||||
- **WHEN** Agent 遇到 API 错误
|
||||
- **THEN** Skill SHALL 提供常见错误的处理方式:
|
||||
- 401 → 检查 token,必要时重新注册
|
||||
- 404 → 任务可能已完成或不存在
|
||||
- 409/400 → 检查任务状态是否允许该操作
|
||||
- 网络错误 → 重试(指数退避)
|
||||
|
||||
#### Scenario: Skill is portable across agent platforms
|
||||
- **WHEN** Skill 被不同平台的 Agent 使用
|
||||
- **THEN** Skill SHALL 不包含任何平台特定的语法或指令(如 OpenClaw 的 `sessions_send`、Claude Code 的 `hooks` 等)
|
||||
- **AND** 所有交互通过标准 HTTP 请求描述(curl 格式)
|
||||
- **AND** Agent 可根据自身能力将 curl 转换为对应的 HTTP 调用方式
|
||||
36
openspec/changes/agent-onboarding-docs/tasks.md
Normal file
@ -0,0 +1,36 @@
## 1. API Reference

- [ ] 1.1 Create `docs/agent-api-reference.md`
- [ ] 1.2 List all public endpoints (~12), each with: HTTP method, URL, request body, response body, error codes, curl example
- [ ] 1.3 Authentication section: http_pull token, webhook HMAC-SHA256 signature
- [ ] 1.4 Error code summary: 401/403/404/400/500, each with its trigger scenario
- [ ] 1.5 General notes: base_url, Content-Type, character encoding, pagination (if any)

## 2. Agent Onboarding Guide

- [ ] 2.1 Create `docs/agent-onboarding-guide.md`
- [ ] 2.2 Comparison table of the two execution modes (ssh_cli vs http_pull)
- [ ] 2.3 Complete ssh_cli workflow: configure host → install CLI → automatic scheduling → PR workflow
- [ ] 2.4 Complete http_pull workflow: register → heartbeat → dequeue → execute → complete/receipt
- [ ] 2.5 Forgejo integration notes: Issue → Task, branch naming, PR lifecycle
- [ ] 2.6 Structured prompt format (the structure of the prompt an Agent receives in ssh_cli mode)
- [ ] 2.7 FAQ

## 3. Universal Agent Skill

- [ ] 3.1 Create `skill/SKILL.md` (YAML frontmatter + Markdown body)
- [ ] 3.2 Quick Start: the simplest onboarding example (3 steps or fewer)
- [ ] 3.3 Instructions: detailed API call flow (register → heartbeat → dequeue → execute → complete)
- [ ] 3.4 Examples: a curl example for each operation
- [ ] 3.5 Guidelines: error handling, retry strategy, authentication rules
- [ ] 3.6 Execution mode selection guide: how an Agent decides between ssh_cli and http_pull
- [ ] 3.7 Forgejo workflow notes (branch naming, PR creation, webhook triggers)
- [ ] 3.8 Verification: Skill content is consistent with the API reference, and the curl examples are executable

## 4. Verification

- [ ] 4.1 The API reference covers all implemented endpoints
- [ ] 4.2 The curl examples are executable against the arm0 instance
- [ ] 4.3 The Skill format conforms to the standard specification (YAML frontmatter + Markdown body)
- [ ] 4.4 The Skill contains no platform-specific syntax
- [ ] 4.5 The onboarding guide is consistent with the current code implementation

281
skill/SKILL.md
Normal file
@ -0,0 +1,281 @@
---
name: agent-fleet-integration
description: |
  Interact with the Agent Fleet Orchestrator. Use this skill when you need to:
  - Register as an agent and pull tasks for execution
  - Query task status or list tasks
  - Submit completion receipts
  - Retry failed tasks
  - Integrate with Forgejo Issue → Task → PR workflow

  Applies when the agent is acting as a worker in an Agent Fleet cluster,
  or when managing tasks on behalf of the fleet.
---

# Agent Fleet Integration Skill

## Quick Start (http_pull mode)

**Step 1.** Register your agent:

```bash
curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{"agent_id":"my-agent","agent_type":"openclaw","hostname":"myhost","capabilities":["code:rust"],"max_concurrency":2}'
```

**Step 2.** Pull and execute a task:

```bash
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"agent_id":"my-agent","capabilities":["code:rust"]}'
```

**Step 3.** Submit your result:

```bash
curl -X POST http://localhost:9090/api/v1/tasks/<task_id>/complete \
  -H 'Content-Type: application/json' \
  -d '{"task_id":"<task_id>","agent_id":"my-agent","status":"completed","duration_seconds":60,"summary":"done","artifacts":[],"error":null}'
```

---

## Choosing Your Execution Mode

| If you... | Use this mode |
|-----------|---------------|
| Have a CLI binary installed on a configured host | `ssh_cli`: the Orchestrator calls you |
| Have your own scheduler or run outside configured hosts | `http_pull`: you call the API |

- `ssh_cli` agents do **not** need to call any API. The Orchestrator handles everything via SSH or a local subprocess.
- `http_pull` agents must **register, heartbeat, dequeue, and complete** via the HTTP API.

---

## Instructions

### http_pull Agent Lifecycle

```
Register → Heartbeat (loop) → Dequeue → Execute → Complete → Deregister
```

1. **Register** once at startup via `POST /api/v1/agents/register`.
2. **Heartbeat** periodically (every 60s recommended) via `POST /api/v1/agents/heartbeat`. Without heartbeats, you will be marked offline and your tasks requeued.
3. **Dequeue** when ready for work via `POST /api/v1/tasks/dequeue`. Returns a Task or 204 No Content.
4. **Update status** to `running` via `POST /api/v1/tasks/{task_id}/status`.
5. **Complete** the task via `POST /api/v1/tasks/{task_id}/complete` with a Receipt.
6. **Deregister** when shutting down via `POST /api/v1/agents/deregister`.

### ssh_cli Agent Notes

No API interaction is required. Ensure that:
- Your CLI binary is in `$PATH` on the configured host.
- Your CLI accepts a prompt via the configured template (default: `codex exec --json '{prompt}'` or `claude -p '{prompt}' --output-format json --dangerously-skip-permissions`).
- Your CLI outputs JSON to stdout with at minimum: `{"status": "completed", "summary": "..."}`.
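
As an illustration of the stdout contract above, a wrapper script might print the minimal result object like this. This is only a sketch: the helper name `emit_result` is hypothetical, and it performs no JSON escaping, so it assumes the summary contains no double quotes, backslashes, or newlines.

```bash
# Hypothetical helper: print the minimal result object on stdout.
# $1 = status, $2 = summary (assumed free of quotes, backslashes, newlines).
emit_result() {
  printf '{"status": "%s", "summary": "%s"}\n' "$1" "$2"
}

# Example call at the end of a CLI run:
emit_result completed "Implemented the feature"
```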

---

## Examples

### Register

```bash
curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{
    "agent_id": "worker-03",
    "agent_type": "openclaw",
    "hostname": "arm0",
    "capabilities": ["code:rust", "review"],
    "max_concurrency": 2,
    "metadata": {"version": "1.0"}
  }'
```

### Heartbeat

```bash
curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
```

### List Available Tasks

```bash
curl 'http://localhost:9090/api/v1/tasks?status=created'
```

### Dequeue

```bash
curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
```

Returns 200 with Task JSON, or 204 if no matching task.

### Get Task Detail

```bash
curl 'http://localhost:9090/api/v1/tasks/org%2Frepo%2342'
```

### Update Task Status

```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-token' \
  -d '{"status": "running"}'
```

### Complete Task with Receipt

```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Implemented the feature as described",
    "artifacts": [
      {"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}
    ],
    "error": null
  }'
```

### Submit Receipt

```bash
curl -X POST http://localhost:9090/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Done",
    "artifacts": [],
    "error": null
  }'
```

### Retry a Failed Task

```bash
curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/retry
```

Only works for tasks in `failed` or `agent_lost` status.

### List Agents

```bash
curl 'http://localhost:9090/api/v1/agents?status=online&capability=code:rust'
```

### Deregister

```bash
curl -X POST http://localhost:9090/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
```

### Health Check

```bash
curl http://localhost:9090/healthz
```

---

## Guidelines

### Authentication

- **http_pull endpoints** (`dequeue`, status update): require `Authorization: Bearer <token>` if `http_pull_token` is configured. If it is not configured, no authentication is needed.
- **Webhook endpoint**: requires an HMAC-SHA256 signature header.
- **All other endpoints**: no authentication required.

### Error Handling

| Code | Meaning | Action |
|------|---------|--------|
| 401 | Unauthorized | Check your Bearer token. If expired, re-register to get a new one. |
| 404 | Not Found | Task may have been completed or never existed. Move on. |
| 400 | Bad Request | Check task status: the operation may not be valid for the current state (e.g. retrying a `running` task). |
| 204 | No Content (dequeue) | No matching tasks available. Wait and retry. |
| 500 | Server Error | Retry with exponential backoff. Report if persistent. |
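
The table above could be folded into a small dispatcher inside an agent's polling script. The following is only a sketch: the function name `action_for_status` and the action keywords it prints are hypothetical and not part of the Agent Fleet API, and treating every 5xx code like 500 is an assumption.

```bash
# Hypothetical helper: map an HTTP status code returned by the Orchestrator
# to the action suggested in the error-handling table.
action_for_status() {
  case "$1" in
    200) echo "proceed" ;;      # success: continue with the task
    204) echo "wait" ;;         # dequeue only: no matching task, poll again later
    400) echo "fix-request" ;;  # not valid for the current task state; do not retry
    401) echo "check-token" ;;  # missing or wrong Bearer token
    404) echo "skip" ;;         # task completed or never existed
    5??) echo "backoff" ;;      # server error (assumed: any 5xx), retry with backoff
    *)   echo "unknown" ;;
  esac
}
```

With curl, the status code can be captured separately from the body using `-w '%{http_code}'` together with `-o <file>`, and then passed to the helper.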

### Retry Strategy

- Use exponential backoff for transient errors (network, 500s): 1s, 2s, 4s, 8s, max 30s.
- Do not retry 400 errors; fix your request instead.
- For 204 on dequeue (no matching task): poll again after a reasonable interval (e.g. 10–30 seconds).
- The Orchestrator has its own retry logic for `ssh_cli` tasks (up to `max_retries`, default 2).
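
The backoff schedule listed above (1s, 2s, 4s, 8s, capped at 30s) can be computed with a few lines of POSIX shell. This is a minimal sketch; the function name `backoff_delay` and the 0-based attempt counter are assumptions made for illustration.

```bash
# Hypothetical helper: seconds to wait before retry number $1 (0-based).
# The delay doubles from 1s and is capped at 30s.
# Note: POSIX sh has no local variables, so these names are global.
backoff_delay() {
  attempt=$1
  delay=1
  i=0
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt 30 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt 30 ]; then
    delay=30
  fi
  echo "$delay"
}
```

A retry loop would then call `sleep "$(backoff_delay "$n")"` between attempts and give up after a fixed number of tries.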

### Task Status Flow

```
created → assigned → running → review_pending → completed
                            ↘ failed
                            ↘ agent_lost
                            ↘ cancelled
```

- `failed` and `agent_lost` tasks can be retried via the retry endpoint.
- `review_pending` means a PR was opened and is awaiting merge/review.
- `completed` and `cancelled` are terminal states.
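
An agent can check these rules locally before calling the retry endpoint, which avoids a predictable 400. The helpers below are a sketch based only on the status names listed above; the function names `is_retryable` and `is_terminal` are hypothetical.

```bash
# Hypothetical helper: succeed (exit 0) if a task in status $1 may be retried.
is_retryable() {
  case "$1" in
    failed|agent_lost) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeed (exit 0) if status $1 is a terminal state.
is_terminal() {
  case "$1" in
    completed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_retryable "$status" && curl -X POST .../retry` would only issue the request for `failed` or `agent_lost` tasks.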

### Heartbeat Requirements

- Send heartbeats at least every `heartbeat_interval_secs` (default: 60s).
- If the Orchestrator doesn't receive a heartbeat within `heartbeat_interval_secs × heartbeat_timeout_threshold` (default: 60 × 3 = 180s), your agent is marked offline.
- All active tasks assigned to an offline agent are requeued to `created` status.
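
The offline window is a simple product of the two settings. The sketch below reproduces that arithmetic; the helper names are hypothetical, and treating exactly 180s of silence as still online is an assumption about the boundary, not something stated above.

```bash
# Hypothetical helper: seconds of silence after which an agent is marked offline.
# $1 = heartbeat_interval_secs, $2 = heartbeat_timeout_threshold
offline_after_secs() {
  echo $(( $1 * $2 ))
}

# Hypothetical helper: succeed if the last heartbeat is older than the window.
# $1 = seconds since last heartbeat, $2 = interval, $3 = threshold
is_offline() {
  [ "$1" -gt "$(offline_after_secs "$2" "$3")" ]
}
```

With the defaults, `offline_after_secs 60 3` gives the 180-second window, so an agent that heartbeats every 60s can miss two beats in a row without being marked offline.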

---

## Forgejo Workflow

### Task Creation (Issue → Task)

1. Open a Forgejo Issue with a label `agent:<type>` (e.g. `agent:code`).
2. The webhook creates a task with `task_id = {repo}#{issue_number}`.
3. Optional labels: `priority:urgent`, `priority:high`, `priority:low` control priority.
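
To make the label convention concrete, here is a rough sketch of how the labels and the task ID fit together. It is not the Orchestrator's actual parser: the function names are hypothetical, and the `normal` priority used when no `priority:*` label is present is an assumed placeholder, since the real default is not stated here.

```bash
# Hypothetical helper: build the task ID from a repository name and issue number.
task_id_for() {
  printf '%s#%s\n' "$1" "$2"
}

# Hypothetical helper: print "<agent_type> <priority>" for a list of labels.
parse_labels() {
  agent_type=""
  priority="normal"   # assumed placeholder; the real default is not documented here
  for label in "$@"; do
    case "$label" in
      agent:*)    agent_type=${label#agent:} ;;
      priority:*) priority=${label#priority:} ;;
    esac
  done
  echo "$agent_type $priority"
}
```

For instance, an issue number 42 in `org/repo` labeled `agent:code` and `priority:high` would map to the task ID `org/repo#42` with type `code` and priority `high`.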

### Branch Naming

- Branch: `task/{url_encoded_task_id}`
- Example: `org/repo#42` → branch `task/org%2Frepo%2342`
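
The encoding in the example above can be reproduced with `sed`. This is a deliberately narrow sketch: the helper names are hypothetical, and only `%`, `/`, and `#` are percent-encoded, which covers task IDs of the `{repo}#{issue_number}` shape but not arbitrary strings.

```bash
# Hypothetical helper: percent-encode the characters that occur in
# task IDs like "org/repo#42". "%" is encoded first so that the escapes
# produced for "/" and "#" are not encoded a second time.
encode_task_id() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's|/|%2F|g' -e 's/#/%23/g'
}

# Hypothetical helper: derive the branch name for a task ID.
branch_for_task() {
  printf 'task/%s\n' "$(encode_task_id "$1")"
}
```

The same encoded ID is what appears in the URL paths of the curl examples, such as `/api/v1/tasks/org%2Frepo%2342`.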

### PR Workflow

1. Work on the `task/*` branch.
2. Open a PR from that branch.
3. Orchestrator receives `pull_request.opened` webhook → task goes to `review_pending`.
4. Pushes to the branch update `last_activity_at`.
5. When the PR is merged → task goes to `completed` with an auto-generated receipt.

### For http_pull Agents

After dequeuing a task, create the branch and PR yourself:

```bash
git checkout -b task/org%2Frepo%2342
# ... do the work ...
git push origin task/org%2Frepo%2342
# Create PR via Forgejo API
# The webhook will update the task automatically
```

### For ssh_cli Agents

The Orchestrator passes the branch name in the structured prompt. Create the branch, push, and open the PR as part of your CLI execution. The webhooks handle status updates.