agent-fleet/skill/SKILL.md
Zer4tul 1f351a1734 docs: replace localhost with FLEET_API_URL, add persistence and heartbeat guidance
- Replace all localhost:9090 with FLEET_API_URL:PORT placeholder
- Add Step 0: persist Fleet API URL to agent memory
- Clarify heartbeat must be periodic loop (60s interval)
- Add execution mode self-selection decision flow
- Add persisting configuration section (URL, agent_id, token)
2026-05-12 18:34:23 +08:00

---
name: agent-fleet-integration
description: |
  Interact with the Agent Fleet Orchestrator. Use this skill when you need to:
  - Register as an agent and pull tasks for execution
  - Query task status or list tasks
  - Submit completion receipts
  - Retry failed tasks
  - Integrate with Forgejo Issue → Task → PR workflow
  Applies when the agent is acting as a worker in an Agent Fleet cluster,
  or when managing tasks on behalf of the fleet.
---
# Agent Fleet Integration Skill
`FLEET_API_URL:PORT` is a placeholder for the host and port of your Agent Fleet Orchestrator (for example, `100.102.101.43:9090`); the examples below prepend `http://` to it. If you do not know the address, ask your user for it before using these examples.
## Quick Start (http_pull mode)
**Step 0.** Get the Fleet API URL from your user and **save it to your persistent memory**:
- Ask: "What is the Agent Fleet Orchestrator URL?"
- Example answer: `http://100.102.101.43:9090`
- **IMPORTANT**: Write this URL to your persistent storage (memory file, config, etc.) so you remember it across sessions. If you lose this URL, you cannot reconnect to the fleet.
**Step 1.** Register your agent:
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/register \
-H 'Content-Type: application/json' \
-d '{"agent_id":"my-agent","agent_type":"openclaw","hostname":"myhost","capabilities":["code:rust"],"max_concurrency":2}'
```
After registration, save your `agent_id` and `registry_token` to persistent memory too, then start a heartbeat loop immediately. The heartbeat must run for the entire lifetime of the agent.
**Step 2.** Pull and execute a task:
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/dequeue \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <token>' \
-d '{"agent_id":"my-agent","capabilities":["code:rust"]}'
```
**Step 3.** Submit your result:
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/<task_id>/complete \
-H 'Content-Type: application/json' \
-d '{"task_id":"<task_id>","agent_id":"my-agent","status":"completed","duration_seconds":60,"summary":"done","artifacts":[],"error":null}'
```
---
## Choosing Your Execution Mode
If you do not already know the orchestrator address, ask your user: **"What is the address of the Agent Fleet Orchestrator?"**
Then determine your mode:
1. If you have a CLI binary **and** the Fleet orchestrator can SSH into your machine:
→ Use `ssh_cli` (Orchestrator calls you via SSH)
2. If the Fleet orchestrator cannot reach your machine via SSH (for example, you are behind NAT, a firewall, or on a different network):
→ Use `http_pull` (you call the Fleet API)
3. If you have your own scheduler or task loop (for example, OpenClaw or Hermes Agent):
→ Use `http_pull` (you control when to pull tasks)
4. If unsure, try this test:
- Ask your user: **"Can the Fleet orchestrator SSH into this machine?"**
- If yes → `ssh_cli`
- If no → `http_pull`
Common scenarios:
- Codex / Claude Code / OpenCode on a server → `ssh_cli` (Fleet SSHs in and runs your CLI)
- OpenClaw / Hermes Agent / any bot framework → `http_pull` (you pull tasks via HTTP API)
- Agent on a laptop behind NAT → `http_pull` (Fleet cannot reach you)

What each mode requires:
- `ssh_cli` agents do **not** need to call any API. The Orchestrator handles everything via SSH or local subprocess.
- `http_pull` agents must **register, heartbeat, dequeue, and complete** via HTTP API.
---
### Self-Determination Decision Flow
As an agent, you can self-determine your mode by answering these questions:
```
1. Do I have a CLI binary that can execute tasks non-interactively?
   ├─ NO  → Use http_pull
   └─ YES → Continue
2. Can the Fleet orchestrator SSH into my machine?
   ├─ NO  → Use http_pull
   └─ YES → Continue
3. Do I have my own scheduler, worker loop, or bot runtime?
   ├─ YES → Use http_pull (you prefer to control your own task flow)
   └─ NO  → Continue
4. Do I want the Orchestrator to initiate execution remotely?
   ├─ YES → Use ssh_cli
   └─ NO  → Use http_pull
```
**Quick reference:** If you can be reached via SSH and don't have a custom scheduler, use `ssh_cli`. Otherwise, use `http_pull`.
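
The decision flow above can be sketched as a small shell function. This is only an illustration: the name `choose_mode` and its four `yes`/`no` arguments are hypothetical and not part of the Fleet tooling.

```bash
# Hypothetical helper: maps the four decision-flow answers to a mode.
# Each argument is "yes" or "no", in question order:
#   $1  has a CLI binary that runs tasks non-interactively
#   $2  the orchestrator can SSH into this machine
#   $3  has its own scheduler, worker loop, or bot runtime
#   $4  wants the orchestrator to initiate execution remotely
choose_mode() {
  local has_cli="$1" ssh_reachable="$2" own_scheduler="$3" wants_remote="$4"
  # ssh_cli is chosen only when every question leads to it;
  # any other combination (or an unknown answer) falls back to http_pull.
  if [ "$has_cli" = yes ] && [ "$ssh_reachable" = yes ] \
     && [ "$own_scheduler" = no ] && [ "$wants_remote" = yes ]; then
    echo ssh_cli
  else
    echo http_pull
  fi
}
```

For example, `choose_mode yes yes no yes` prints `ssh_cli`, while `choose_mode yes no no yes` prints `http_pull`.
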
---
## Instructions
### http_pull Agent Lifecycle
```
Register → Save credentials to memory → Start heartbeat loop (every 60s, runs forever)
→ Dequeue (poll when idle) → Execute → Complete → Dequeue again
→ On shutdown: Deregister
```
1. **Register** once at startup via `POST /api/v1/agents/register`.
2. **Save credentials to memory** immediately after registration: persist `FLEET_API_URL`, your `agent_id`, and your `registry_token` in long-term memory or a config file.
3. **Start a periodic heartbeat loop** via `POST /api/v1/agents/heartbeat`.
- Default interval: every 60 seconds
- Maximum allowed gap: `heartbeat_interval_secs × heartbeat_timeout_threshold` (default: 180 seconds)
- If you exceed that gap, you will be marked offline and your tasks will be requeued
- This must run for the entire lifetime of the agent, not just once
4. **Dequeue** when ready for work via `POST /api/v1/tasks/dequeue`. Returns a Task or 204 No Content.
5. **Update status** to `running` via `POST /api/v1/tasks/{task_id}/status`.
6. **Complete** the task via `POST /api/v1/tasks/{task_id}/complete` with a Receipt.
7. **Deregister** when shutting down via `POST /api/v1/agents/deregister`.
### ssh_cli Agent Notes
No API interaction required. Ensure:
- Your CLI binary is in `$PATH` on the configured host.
- Your CLI accepts a prompt via the configured template (default: `codex exec --json '{prompt}'` or `claude -p '{prompt}' --output-format json --dangerously-skip-permissions`).
- Your CLI outputs JSON to stdout with at minimum: `{"status": "completed", "summary": "..."}`.
---
## Examples
### Register
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/register \
-H 'Content-Type: application/json' \
-d '{
"agent_id": "worker-03",
"agent_type": "openclaw",
"hostname": "arm0",
"capabilities": ["code:rust", "review"],
"max_concurrency": 2,
"metadata": {"version": "1.0"}
}'
```
### Heartbeat
```bash
# Heartbeat must be sent periodically. Example using a shell loop:
# while true; do curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/heartbeat -H 'Content-Type: application/json' -d '{"agent_id": "worker-03"}'; sleep 60; done
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/heartbeat \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-03"}'
```
### List Available Tasks
```bash
curl 'http://FLEET_API_URL:PORT/api/v1/tasks?status=created'
```
### Dequeue
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/dequeue \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer my-token' \
-d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'
```
Returns 200 with Task JSON, or 204 if no matching task.
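
To tell the 200 and 204 responses apart in a script, one option is to capture the HTTP status code with curl's `-w '%{http_code}'` and write the body to a temporary file. The sketch below assumes this approach; the function name `fleet_dequeue` and its return codes are hypothetical, and it always sends the `Authorization` header even though the orchestrator may not require it.

```bash
# Hypothetical wrapper around the dequeue endpoint.
# Prints the Task JSON and returns 0 on HTTP 200,
# returns 1 on HTTP 204 (no matching task), and 2 on anything else.
# Assumes agent_id and token need no JSON or header escaping.
fleet_dequeue() {
  local base_url="$1" agent_id="$2" token="$3" capabilities_json="$4"
  local body_file http_code rc
  body_file=$(mktemp) || return 2
  http_code=$(curl -sS -o "$body_file" -w '%{http_code}' -X POST \
    "$base_url/api/v1/tasks/dequeue" \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $token" \
    -d "{\"agent_id\":\"$agent_id\",\"capabilities\":$capabilities_json}")
  case "$http_code" in
    200) cat "$body_file"; rc=0 ;;
    204) rc=1 ;;
    *)   echo "dequeue failed: HTTP $http_code" >&2; rc=2 ;;
  esac
  rm -f "$body_file"
  return "$rc"
}
```

A caller might run `task_json=$(fleet_dequeue "http://FLEET_API_URL:PORT" worker-03 my-token '["code:rust"]')` and branch on the exit status.
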
### Get Task Detail
```bash
curl 'http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342'
```
### Update Task Status
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342/status \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer my-token' \
-d '{"status": "running"}'
```
### Complete Task with Receipt
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342/complete \
-H 'Content-Type: application/json' \
-d '{
"task_id": "org/repo#42",
"agent_id": "worker-03",
"status": "completed",
"duration_seconds": 180,
"summary": "Implemented the feature as described",
"artifacts": [
{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}
],
"error": null
}'
```
### Submit Receipt
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/receipts \
-H 'Content-Type: application/json' \
-d '{
"task_id": "org/repo#42",
"agent_id": "worker-03",
"status": "completed",
"duration_seconds": 180,
"summary": "Done",
"artifacts": [],
"error": null
}'
```
### Retry a Failed Task
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/tasks/org%2Frepo%2342/retry
```
Only works for tasks in `failed` or `agent_lost` status.
### List Agents
```bash
curl 'http://FLEET_API_URL:PORT/api/v1/agents?status=online&capability=code:rust'
```
### Deregister
```bash
curl -X POST http://FLEET_API_URL:PORT/api/v1/agents/deregister \
-H 'Content-Type: application/json' \
-d '{"agent_id": "worker-03"}'
```
### Health Check
```bash
curl http://FLEET_API_URL:PORT/healthz
```
---
## Guidelines
### Authentication
- **http_pull endpoints** (`dequeue`, `status update`): require `Authorization: Bearer <token>` if `http_pull_token` is configured. If not configured, no auth is needed.
- **All other endpoints**: no authentication required.
- **Webhook endpoint**: requires HMAC-SHA256 signature header.
### Error Handling
| Code | Meaning | Action |
|------|---------|--------|
| 401 | Unauthorized | Check your Bearer token. If expired, re-register to get a new one. |
| 404 | Not Found | Task may have been completed or never existed. Move on. |
| 400 | Bad Request | Check task status — the operation may not be valid for the current state (e.g. retrying a `running` task). |
| 204 | No Content (dequeue) | No matching tasks available. Wait and retry. |
| 500 | Server Error | Retry with exponential backoff. Report if persistent. |
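
The table above can be turned into a dispatch helper so that a worker script reacts consistently to each status code. This is a sketch under assumptions: the function name and the action labels are hypothetical, and treating every 5xx code like 500 is an extrapolation from the table.

```bash
# Hypothetical helper: prints a suggested action for an HTTP status code.
fleet_action_for_http_code() {
  case "$1" in
    200) echo proceed ;;             # success, use the response body
    204) echo wait_and_poll ;;       # dequeue: no matching task yet
    400) echo fix_request ;;         # invalid for the current task state
    401) echo check_token ;;         # bad or expired Bearer token
    404) echo skip_task ;;           # task completed or never existed
    5??) echo retry_with_backoff ;;  # assumption: all 5xx handled like 500
    *)   echo unknown ;;
  esac
}
```

For example, `fleet_action_for_http_code 401` prints `check_token`.
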
### Retry Strategy
- Use exponential backoff for transient errors (network, 500s): 1s, 2s, 4s, 8s, max 30s.
- Do not retry 400 errors — fix your request.
- For 204 on dequeue: poll again after a reasonable interval (e.g. 10-30 seconds).
- The Orchestrator has its own retry logic for `ssh_cli` tasks (up to `max_retries`, default 2).
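
The backoff schedule above (1s, 2s, 4s, 8s, capped at 30s) can be computed with a small helper. This is a minimal sketch; the function name `backoff_delay` and its zero-based attempt argument are hypothetical.

```bash
# Hypothetical helper: prints the delay in seconds before retry number
# $1 (0-based), doubling from 1s and capped at $2 seconds (default 30).
backoff_delay() {
  local attempt="$1" max="${2:-30}" delay=1 i=0
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$max" ]; then
    delay=$max
  fi
  echo "$delay"
}
```

For example, `backoff_delay 3` prints `8` and `backoff_delay 5` prints `30`. A retry loop could call `sleep "$(backoff_delay "$attempt")"` between attempts.
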
### Task Status Flow
```
created → assigned → running → review_pending → completed
                             ↘ failed
                             ↘ agent_lost
                             ↘ cancelled
```
- `failed` and `agent_lost` tasks can be retried via the retry endpoint.
- `review_pending` means a PR was opened and is awaiting merge/review.
- `completed` and `cancelled` are terminal states.
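
The retry and terminal-state rules above can be encoded as two predicates, which is handy before calling the retry endpoint or when deciding whether to keep polling a task. The function names are hypothetical.

```bash
# Hypothetical predicates over the task status strings listed above.
# Each returns 0 (true) or 1 (false), so they work directly in `if`.
task_can_retry() {
  case "$1" in
    failed|agent_lost) return 0 ;;
    *) return 1 ;;
  esac
}

task_is_terminal() {
  case "$1" in
    completed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `if task_can_retry "$status"; then ...; fi` guards a call to the retry endpoint.
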
### Heartbeat Requirements
- Send heartbeats in a background loop at least every `heartbeat_interval_secs` (default: 60s).
- A one-time heartbeat is not enough; the loop must run for the full lifetime of the agent.
- If the Orchestrator doesn't receive a heartbeat within `heartbeat_interval_secs × heartbeat_timeout_threshold` (default: 60 × 3 = 180s), your agent is marked offline.
- All active tasks assigned to an offline agent are requeued to `created` status.
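
One way to satisfy these requirements in a shell-based agent is to run the heartbeat as a background job and stop it on exit. The sketch below assumes that approach; the function names are hypothetical, and `heartbeat_max_gap` simply reproduces the interval × threshold arithmetic.

```bash
# Maximum gap before the agent is marked offline:
# heartbeat_interval_secs × heartbeat_timeout_threshold.
heartbeat_max_gap() {
  echo $(( $1 * $2 ))
}

# Sends one heartbeat every $3 seconds (default 60) until killed.
# Assumes agent_id contains no characters that need JSON escaping.
heartbeat_loop() {
  local base_url="$1" agent_id="$2" interval="${3:-60}"
  while true; do
    curl -fsS -X POST "$base_url/api/v1/agents/heartbeat" \
      -H 'Content-Type: application/json' \
      -d "{\"agent_id\":\"$agent_id\"}" >/dev/null \
      || echo "heartbeat failed; retrying in ${interval}s" >&2
    sleep "$interval"
  done
}

# Usage (commented out so that sourcing this file has no side effects):
#   heartbeat_loop "http://FLEET_API_URL:PORT" worker-03 60 &
#   heartbeat_pid=$!
#   trap 'kill "$heartbeat_pid" 2>/dev/null' EXIT
```

With the defaults, `heartbeat_max_gap 60 3` prints `180`. A failed heartbeat is logged and retried on the next tick instead of ending the loop, so a brief network outage does not stop it.
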
### Persisting Configuration
After registration, you MUST persist these values to your long-term memory or config file:
1. **FLEET_API_URL** — the orchestrator address
2. **Your agent_id** — your registered agent ID
3. **Your registry_token** — the token returned by registration (needed for http_pull auth)
Without these, you cannot reconnect after a session restart.
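
If your persistent storage is a plain file, the three values can be saved and restored with a pair of shell functions. This is a sketch under assumptions: the `KEY=value` file format, the function names, and the variable names `FLEET_AGENT_ID` and `FLEET_REGISTRY_TOKEN` are hypothetical, not Fleet conventions.

```bash
# Hypothetical helper: writes the three values to a file readable
# only by the owner, since the file holds a token.
fleet_save_config() {
  local file="$1" url="$2" agent_id="$3" token="$4"
  mkdir -p "$(dirname "$file")" || return 1
  ( umask 077
    printf 'FLEET_API_URL=%s\nFLEET_AGENT_ID=%s\nFLEET_REGISTRY_TOKEN=%s\n' \
      "$url" "$agent_id" "$token" > "$file" ) || return 1
  chmod 600 "$file"
}

# Hypothetical helper: reads the file back into shell variables.
# Lines are parsed, not sourced, so the file is never executed.
fleet_load_config() {
  local file="$1" line key value
  [ -r "$file" ] || return 1
  while IFS= read -r line; do
    key=${line%%=*}     # text before the first '='
    value=${line#*=}    # text after the first '=' (may contain '=')
    case "$key" in
      FLEET_API_URL)        FLEET_API_URL=$value ;;
      FLEET_AGENT_ID)       FLEET_AGENT_ID=$value ;;
      FLEET_REGISTRY_TOKEN) FLEET_REGISTRY_TOKEN=$value ;;
    esac
  done < "$file"
}
```

At startup, an agent could call `fleet_load_config` with its config path and register again only if that call fails.
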
---
## Forgejo Workflow
### Task Creation (Issue → Task)
1. Open a Forgejo Issue with a label `agent:<type>` (e.g. `agent:code`).
2. The webhook creates a task with `task_id = {repo}#{issue_number}`.
3. Optional labels: `priority:urgent`, `priority:high`, `priority:low` control priority.
### Branch Naming
- Branch: `task/{url_encoded_task_id}`
- Example: `org/repo#42` → branch `task/org%2Frepo%2342`
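
The same percent-encoding is used for task IDs in API paths, so a helper that produces it can serve both branch names and URLs. The sketch below assumes ASCII task IDs and encodes every character outside the RFC 3986 unreserved set; the function name `encode_task_id` is hypothetical.

```bash
# Hypothetical helper: percent-encodes a task ID one character at a time.
# Assumes ASCII input; multi-byte characters are not handled.
encode_task_id() {
  local s="$1" out="" rest c
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case "$c" in
      [A-Za-z0-9._~-]) out=$out$c ;;
      # printf with a leading quote yields the character's numeric code
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  echo "$out"
}
```

For example, `encode_task_id 'org/repo#42'` prints `org%2Frepo%2342`, so the branch name is `task/$(encode_task_id 'org/repo#42')`.
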
### PR Workflow
1. Work on the `task/*` branch.
2. Open a PR from that branch.
3. Orchestrator receives `pull_request.opened` webhook → task goes to `review_pending`.
4. Pushes to the branch update `last_activity_at`.
5. When the PR is merged → task goes to `completed` with an auto-generated receipt.
### For http_pull Agents
After dequeuing a task, create the branch and PR yourself:
```bash
git checkout -b task/org%2Frepo%2342
# ... do the work ...
git push origin task/org%2Frepo%2342
# Create PR via Forgejo API
# The webhook will update the task automatically
```
### For ssh_cli Agents
The Orchestrator passes the branch name in the structured prompt. Create the branch, push, and open the PR as part of your CLI execution. The webhooks handle status updates.