agent-fleet/docs/agent-onboarding-guide.md
Zer4tul d1a746a8cb docs: add agent API reference, onboarding guide, and universal skill
- docs/agent-api-reference.md (473 lines): complete HTTP API reference for all 12 endpoints
- docs/agent-onboarding-guide.md (272 lines): ssh_cli and http_pull workflows, Forgejo integration
- skill/SKILL.md (281 lines): universal agent skill, platform-agnostic, curl-based examples

All content in English. No code changes.
2026-05-12 14:57:05 +08:00

Agent Fleet — Agent Onboarding Guide

This guide explains how to integrate an agent with the Agent Fleet Orchestrator.


Execution Modes

Agent Fleet supports two execution modes. The mode is set per-task at creation time (defaults to ssh_cli).

| Aspect | ssh_cli | http_pull |
|---|---|---|
| Who initiates? | Orchestrator (via SSH or local subprocess) | Agent (via HTTP API) |
| Control flow | Orchestrator builds prompt, runs CLI, collects output | Agent decides when to dequeue and execute |
| Agent requirements | CLI binary on a configured host | HTTP client, can call REST API |
| Auth needed? | No (Orchestrator manages) | Yes (Bearer token) |
| Best for | Codex CLI, Claude Code, OpenCode (agents with CLIs) | OpenClaw/Jeeves, Hermes (agents with their own schedulers) |
| Task creation trigger | Forgejo Issue webhook (default) | Same, or API call |

ssh_cli Workflow

1. Configure a Host

Add a [[hosts]] section to config.toml on the Orchestrator:

[[hosts]]
host_id = "host-worker-01"
hostname = "192.168.1.100"
ssh_user = "deploy"
ssh_port = 22
ssh_key_path = "/home/deploy/.ssh/id_ed25519"
work_dir = "/opt/agent-workspace"
agents = [
  { agent_type = "codex-cli", max_concurrency = 2, capabilities = ["code:rust", "code:python"] },
]

For local execution (same machine as the Orchestrator), set hostname = "localhost". The Orchestrator then runs the CLI as a local subprocess instead of over SSH.

2. Install the Agent CLI

The CLI binary must be available on the target host in $PATH. The Orchestrator checks availability with which <binary>.

Built-in CLI templates:

| Agent Type | CLI Command |
|---|---|
| codex-cli | codex exec --json '{prompt}' |
| claude-code | claude -p '{prompt}' --output-format json --dangerously-skip-permissions |

Custom templates can be defined in config.toml under [adapters].

3. Orchestrator Handles Everything

When a Forgejo Issue with an agent:* label arrives:

  1. Orchestrator creates a task (execution_mode = ssh_cli)
  2. Dispatch loop picks the task, selects a host by capability + load
  3. SSH (or local subprocess) executes the CLI with a structured prompt
  4. Output is parsed (Codex JSON or Claude JSON format)
  5. Task status updates: created → assigned → running → completed (or failed)
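Step 2 selects a host "by capability + load", but this guide does not spell out the rule. The sketch below shows one plausible reading, stated as an assumption: keep only slots that cover every required capability and still have free concurrency, then pick the least-loaded one. AgentSlot, running, and select_slot are hypothetical names, not part of Agent Fleet.

```python
from dataclasses import dataclass


@dataclass
class AgentSlot:
    host_id: str
    agent_type: str
    capabilities: set
    max_concurrency: int
    running: int = 0  # tasks currently executing on this slot (assumed bookkeeping)


def select_slot(slots, required):
    # Keep slots that offer all required capabilities and have spare capacity.
    candidates = [
        s for s in slots
        if required <= s.capabilities and s.running < s.max_concurrency
    ]
    if not candidates:
        return None
    # Assumed tie-breaker: lowest fraction of capacity in use.
    return min(candidates, key=lambda s: s.running / s.max_concurrency)


slots = [
    AgentSlot("host-a", "codex-cli", {"code:rust"}, 2, running=1),
    AgentSlot("host-b", "codex-cli", {"code:rust", "code:python"}, 2, running=0),
    AgentSlot("host-c", "claude-code", {"code:python"}, 1, running=0),
]
chosen = select_slot(slots, {"code:rust"})
print(chosen.host_id if chosen else "no host available")
```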

4. What the Agent Receives (Structured Prompt)

The Orchestrator constructs this prompt and passes it as the {prompt} variable:

Task ID: org/repo#42
Type: code
Goal:
Implement the feature described in the issue body

Constraints:
- Execution mode: ssh_cli
- Labels: code:rust
- Branch: task/org%2Frepo%2342
- Expected output: JSON receipt

Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts
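The prompt above follows a fixed layout, so it can be reproduced for local testing of a CLI adapter. A minimal sketch, assuming the field order shown in this guide; build_prompt is a hypothetical helper and not the Orchestrator's own function.

```python
def build_prompt(task_id, task_type, requirements, labels, branch_name):
    # An empty label list is rendered as the literal "<none>".
    label_text = ", ".join(labels) if labels else "<none>"
    return "\n".join([
        f"Task ID: {task_id}",
        f"Type: {task_type}",
        "Goal:",
        requirements,
        "",
        "Constraints:",
        "- Execution mode: ssh_cli",
        f"- Labels: {label_text}",
        f"- Branch: {branch_name}",
        "- Expected output: JSON receipt",
        "",
        "Validation:",
        "- Run relevant tests if code changed",
        "- Summarize changes and artifacts",
    ])


prompt = build_prompt(
    "org/repo#42",
    "code",
    "Implement the feature described in the issue body",
    ["code:rust"],
    "task/org%2Frepo%2342",
)
print(prompt)
```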

5. Expected CLI Output

The CLI must output JSON to stdout. The format depends on the parser:

Codex JSON:

{"status": "completed", "summary": "done", "duration_seconds": 120, "artifacts": [{"artifact_type": "pr", "url": "https://..."}]}

Claude JSON:

{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}

If output is not valid JSON, the task is marked failed.
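The "invalid JSON means failed" rule can be sketched as a small parser. The shape of the failure record below (status, summary, error) borrows field names from the receipts in this guide, but how the Orchestrator actually records the failure is an assumption; parse_cli_output is a hypothetical name.

```python
import json


def parse_cli_output(stdout: str) -> dict:
    try:
        receipt = json.loads(stdout)
    except json.JSONDecodeError as exc:
        # Not valid JSON: treat the task as failed.
        return {"status": "failed", "summary": "", "error": f"invalid JSON output: {exc.msg}"}
    if not isinstance(receipt, dict):
        # Valid JSON but not an object; assumed to be a failure as well.
        return {"status": "failed", "summary": "", "error": "output is not a JSON object"}
    return receipt


ok = parse_cli_output('{"status": "completed", "summary": "done", "duration_seconds": 95, "error": null}')
bad = parse_cli_output("Done! All tests pass.")
print(ok["status"], bad["status"])
```

In practice this means a CLI wrapper should print nothing but the JSON receipt to stdout and send logs to stderr.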


http_pull Workflow

1. Register

curl -X POST http://localhost:9090/api/v1/agents/register \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03", "agent_type": "openclaw", "hostname": "arm0", "capabilities": ["code:rust"], "max_concurrency": 2}'

The response contains a registry_token. Keep it and send it as <token> in the Authorization: Bearer header on subsequent API calls. If http_pull_token is configured, use that shared token instead.

2. Heartbeat (periodic)

Send a heartbeat every heartbeat_interval_secs seconds (default: 60). If the Orchestrator does not receive one within heartbeat_interval_secs × heartbeat_timeout_threshold seconds, the agent is marked offline and its tasks are requeued.

curl -X POST http://localhost:9090/api/v1/agents/heartbeat \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'
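The offline rule is a single multiplication, which makes it easy to work out how much slack an agent has between heartbeats. A minimal sketch: the threshold of 3 is an assumed value for illustration (this guide states only the 60-second interval default), and is_offline is a hypothetical helper.

```python
def offline_after_secs(heartbeat_interval_secs: int, heartbeat_timeout_threshold: int) -> int:
    # Grace window before an agent is considered offline.
    return heartbeat_interval_secs * heartbeat_timeout_threshold


def is_offline(last_heartbeat: float, now: float,
               interval_secs: int = 60, threshold: int = 3) -> bool:
    # threshold=3 is an assumption; read the real value from your config.
    return now - last_heartbeat > offline_after_secs(interval_secs, threshold)


print(offline_after_secs(60, 3))  # 180
```

With these example values an agent could miss two consecutive heartbeats and still be considered online.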

3. Dequeue a Task

curl -X POST http://localhost:9090/api/v1/tasks/dequeue \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"agent_id": "worker-03", "capabilities": ["code:rust"]}'

Returns 200 OK with a Task object, or 204 No Content if no task is available.

Only tasks with execution_mode = http_pull are returned.
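An agent's poll loop has to distinguish the two documented outcomes, 200 and 204. The sketch below builds the same request as the curl call using only the standard library and interprets the response. The function names are hypothetical, and treating every other status code as an error is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9090/api/v1"


def build_dequeue_request(agent_id, capabilities, token):
    body = json.dumps({"agent_id": agent_id, "capabilities": capabilities}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/tasks/dequeue",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def parse_dequeue_response(status, body):
    if status == 204:
        return None  # nothing queued for this agent right now
    if status == 200:
        return json.loads(body)  # the Task object
    raise RuntimeError(f"unexpected dequeue status {status}")


request = build_dequeue_request("worker-03", ["code:rust"], "example-token")
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen and feed resp.status and resp.read() into parse_dequeue_response. When the result is None, sleep before polling again rather than retrying immediately.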

4. Update Status While Working

curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/status \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{"status": "running"}'

5. Complete the Task

curl -X POST http://localhost:9090/api/v1/tasks/org%2Frepo%2342/complete \
  -H 'Content-Type: application/json' \
  -d '{
    "task_id": "org/repo#42",
    "agent_id": "worker-03",
    "status": "completed",
    "duration_seconds": 180,
    "summary": "Fixed the issue",
    "artifacts": [{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
    "error": null
  }'

Or use the receipts endpoint:

curl -X POST http://localhost:9090/api/v1/receipts \
  -H 'Content-Type: application/json' \
  -d '<same receipt body>'
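Steps 4 and 5 put the task ID in the URL path, so the / and # characters must be percent-encoded, while the receipt body carries the raw ID. A minimal sketch of both pieces using the standard library; complete_url and build_receipt are hypothetical helpers, and the field set simply mirrors the example receipt above.

```python
import json
from urllib.parse import quote


def complete_url(base_url: str, task_id: str) -> str:
    # safe="" forces "/" to be encoded as %2F; "#" becomes %23.
    return f"{base_url}/tasks/{quote(task_id, safe='')}/complete"


def build_receipt(task_id, agent_id, status, duration_seconds, summary,
                  artifacts=None, error=None):
    return {
        "task_id": task_id,  # raw ID, not URL-encoded
        "agent_id": agent_id,
        "status": status,
        "duration_seconds": duration_seconds,
        "summary": summary,
        "artifacts": artifacts or [],
        "error": error,
    }


url = complete_url("http://localhost:9090/api/v1", "org/repo#42")
receipt = build_receipt(
    "org/repo#42", "worker-03", "completed", 180, "Fixed the issue",
    artifacts=[{"artifact_type": "pr", "url": "https://git.example/org/repo/pulls/15"}],
)
print(url)
print(json.dumps(receipt))
```

Forgetting safe="" is a common mistake: the default leaves / unencoded, which changes the path structure of the request.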

6. Deregister When Done

curl -X POST http://localhost:9090/api/v1/agents/deregister \
  -H 'Content-Type: application/json' \
  -d '{"agent_id": "worker-03"}'

Forgejo Integration

How Issues Become Tasks

  1. A Forgejo Issue is opened with a label matching agent:* (e.g. agent:code)
  2. Forgejo sends an issues webhook to POST /api/v1/webhooks/forgejo
  3. The agent:* label value becomes task_type (e.g. code)
  4. Priority is inferred from labels: priority:urgent, priority:high, priority:low (default: normal)
  5. A task is created with:
    • task_id = {repo_full_name}#{issue_number} (e.g. org/repo#42)
    • execution_mode = ssh_cli (default for Forgejo-originated tasks)
    • branch_name = task/{url_encoded_task_id} (e.g. task/org%2Frepo%2342)
    • pr_title = feat: {issue_title} (#{issue_number})
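The mapping in steps 3 to 5 can be sketched as a pure function, which is handy for predicting the task a given Issue will produce. Two details are assumptions because this guide does not specify them: when several agent:* or priority:* labels are present, the first one in label order wins. task_from_issue is a hypothetical name.

```python
from urllib.parse import quote

PRIORITY_LABELS = {
    "priority:urgent": "urgent",
    "priority:high": "high",
    "priority:low": "low",
}


def task_from_issue(repo_full_name, issue_number, issue_title, labels):
    agent_labels = [label for label in labels if label.startswith("agent:")]
    if not agent_labels:
        return None  # issues without an agent:* label are ignored
    task_id = f"{repo_full_name}#{issue_number}"
    priority = next(
        (PRIORITY_LABELS[label] for label in labels if label in PRIORITY_LABELS),
        "normal",
    )
    return {
        "task_id": task_id,
        "task_type": agent_labels[0].split(":", 1)[1],
        "priority": priority,
        "execution_mode": "ssh_cli",
        "branch_name": f"task/{quote(task_id, safe='')}",
        "pr_title": f"feat: {issue_title} (#{issue_number})",
    }


task = task_from_issue("org/repo", 42, "Add parser", ["agent:code", "priority:high"])
print(task["task_id"], task["branch_name"])
```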

Branch Naming Convention

  • Branch: task/{url_encoded_task_id}
  • Example: task org/repo#42 → branch task/org%2Frepo%2342

PR Lifecycle

| Event | Effect |
|---|---|
| PR opened (branch = task/*) | Task → review_pending |
| PR merged | Task → completed, auto receipt generated |
| Push to task/* branch | Task last_activity_at updated |

Task Status Flow

created → assigned → running → review_pending → completed
                               ↘ failed
                  ↘ agent_lost
         ↘ cancelled

Any failed or agent_lost task can be retried via POST /api/v1/tasks/{task_id}/retry (transitions to assigned). Retry is limited by max_retries (default: 2).
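The retry rule can be expressed as a guard on status and retry count. A minimal sketch, assuming a task record with status and retry_count fields (retry_count is the name used in the FAQ). Incrementing the counter on each retry is an assumption, and retry is a hypothetical function rather than the endpoint's implementation.

```python
RETRYABLE = {"failed", "agent_lost"}


def retry(task: dict, max_retries: int = 2) -> dict:
    if task["status"] not in RETRYABLE:
        raise ValueError(f"cannot retry a task in status {task['status']!r}")
    if task["retry_count"] >= max_retries:
        raise ValueError("retry limit reached")
    # A retried task goes back to "assigned", per the guide.
    return {**task, "status": "assigned", "retry_count": task["retry_count"] + 1}


retried = retry({"task_id": "org/repo#42", "status": "failed", "retry_count": 0})
print(retried["status"], retried["retry_count"])
```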


Structured Prompt Format (ssh_cli)

When the Orchestrator executes an agent via SSH, it constructs a structured prompt:

Task ID: <task_id>
Type: <task_type>
Goal:
<requirements>

Constraints:
- Execution mode: ssh_cli
- Labels: <comma-separated labels or <none>>
- Branch: <branch_name>
- Expected output: JSON receipt

Validation:
- Run relevant tests if code changed
- Summarize changes and artifacts

The prompt is injected into the CLI template as the {prompt} variable. Other available variables: {work_dir}, {task_id}, {branch}.
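Prompts routinely contain quotes and newlines, so naive string substitution into a shell template is fragile. The sketch below shows one safe approach, offered as an assumption about how a custom adapter could do it rather than a description of the Orchestrator: split the template into argv tokens first, substitute the variables in a single pass, and quote only when a shell string is needed. render_argv is a hypothetical name.

```python
import re
import shlex

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_argv(template: str, variables: dict) -> list:
    def substitute(match):
        # Unknown placeholders are left untouched.
        return variables.get(match.group(1), match.group(0))

    # shlex.split removes the template's own quotes, so each placeholder
    # lands inside exactly one argv element.
    return [PLACEHOLDER.sub(substitute, token) for token in shlex.split(template)]


argv = render_argv(
    "codex exec --json '{prompt}'",
    {"prompt": "Task ID: org/repo#42\nGoal:\nDon't break the build", "task_id": "org/repo#42"},
)
print(argv[:3])
print(shlex.join(argv))  # safely quoted form for a remote shell
```

Passing argv directly to subprocess.run avoids the shell for local execution. For SSH, shlex.join produces a string the remote shell parses back into the same arguments.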


FAQ

Q: How do I know which execution mode to use?
A: If you have a CLI binary and run on a configured host, use ssh_cli. If you have your own scheduler or run outside configured hosts, use http_pull.

Q: Do I need to register for ssh_cli mode?
A: No. The Orchestrator manages ssh_cli tasks entirely. Registration is only for http_pull agents.

Q: What happens if my agent crashes during ssh_cli execution?
A: The task is marked failed. If retry_count < max_retries, the dispatch loop retries it automatically.

Q: What happens if my http_pull agent stops sending heartbeats?
A: After heartbeat_interval_secs × heartbeat_timeout_threshold seconds, the agent is marked offline and all its tasks are requeued with status created.

Q: Can a task switch between execution modes?
A: No. The execution_mode is set at creation time and cannot be changed.

Q: How do I create a task manually?
A: Use the Forgejo webhook flow (open an Issue with an agent:* label), or insert the task directly into the database. There is no public "create task" API endpoint.

Q: What label format triggers task creation?
A: Issues must have a label starting with agent: (e.g. agent:code, agent:review). The value after agent: becomes the task type. Issues without such a label are ignored.

Q: How does the review loop work?
A: When a PR is opened (not merged), the task goes to review_pending. If the PR is not merged and the review cycle count exceeds max_retries, the task is marked failed. For ssh_cli, the Orchestrator re-dispatches automatically.