refactor: remove Matrix bot, make agent-fleet platform-agnostic API service

- Remove src/integrations/matrix/ (bot connection, command parsing, notification formatting)
- Remove matrix-sdk dependency from Cargo.toml
- Remove MatrixConfig from config.rs and [matrix] from config.example.toml
- Add GET /api/v1/tasks (list with status/agent_id filter)
- Add POST /api/v1/tasks/{task_id}/retry (Failed/AgentLost → Assigned)
- Add EventStore::list_tasks() with parameterized query
- 29/29 tests pass

Platform integration (Telegram, Matrix, Feishu) is Agent-side responsibility.
agent-fleet is now a pure HTTP API orchestration engine.
Zer4tul committed 2026-05-12 10:59:19 +08:00
parent 6efca09018
commit 1bc7580ecc
15 changed files with 435 additions and 2367 deletions

Cargo.lock (generated, 1798 lines) — diff suppressed because it is too large

Cargo.toml

@@ -2,7 +2,7 @@
name = "agent-fleet"
version = "0.1.0"
edition = "2024"
-description = "Agent Fleet Platform - Multi-agent orchestration with Forgejo + Matrix"
+description = "Agent Fleet Platform - Multi-agent orchestration with Forgejo"
license = "MIT"

[dependencies]
@@ -22,13 +22,9 @@ rusqlite = { version = "0.32", features = ["bundled"] }
# Configuration
toml = "0.8"

-# HTTP client (for Forgejo API, Matrix API)
+# HTTP client (for Forgejo API)
reqwest = { version = "0.12", features = ["json"] }

-# Matrix SDK
-matrix-sdk = "0.10"
-ruma = { version = "0.12", features = ["client-api-c", "rand", "unstable-msc3061", "unstable-msc2448"] }
-
# Logging
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }

config.example.toml

@@ -7,12 +7,6 @@ url = "https://git.0x08.org"
token = "" # Forgejo API token
webhook_secret = "" # Webhook shared secret

-[matrix]
-homeserver_url = "https://matrix.0x08.org"
-user_id = "@jeeves:0x08.org"
-access_token = "" # Matrix bot access token
-room_id = "" # Coordination room ID
-
[orchestrator]
db_path = "data/agent-fleet.db"
heartbeat_interval_secs = 60


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-05-11


@@ -0,0 +1,73 @@
## Context
agent-fleet Phase 1 implemented a standalone Matrix bot (via matrix-sdk) that handled ChatOps commands and notification pushes directly. This violates the separation-of-concerns principle: agent-fleet's role is a pure orchestration engine, and it should not know chat platforms exist.
The correct responsibility boundary:
- **agent-fleet**: a pure HTTP API service — task queue, state machine, receipt verification. Returns structured JSON.
- **Agent layer**: each agent handles its own platform integration. OpenClaw/Jeeves connects to Telegram/Matrix/Feishu; Hermes connects to Telegram/Matrix; Claude Code / Codex / OpenCode connect via tools such as cc-connect. Agents call agent-fleet over its HTTP API and decide for themselves how to present results to humans.
## Goals / Non-Goals
**Goals:**
- Make agent-fleet a pure HTTP API service that holds no chat-protocol connections
- Add HTTP API endpoints exposing task queries and operations
- Drop the matrix-sdk dependency to cut compile time and binary size
- Delete all code outside the orchestration engine's responsibility (notification formatting, command parsing)
**Non-Goals:**
- No webhook/SSE push mechanism (to be considered in Phase 2)
- No agent-side presentation logic (each Agent handles its own)
- No concern for how Agents connect to chat platforms (that is the Agents' own business)
## Decisions
### Decision 1: agent-fleet is a pure, platform-agnostic API service
**Choice**: agent-fleet connects to no chat platform and exposes only an HTTP API.
**Rationale**:
- An orchestration engine should not be bound to any specific chat platform
- Platform integration is an Agent-side responsibility: OpenClaw and Hermes connect on their own; Claude Code / Codex connect via cc-connect
- Users may use Telegram, Matrix, and Feishu simultaneously; agent-fleet does not need to know
- Removes the matrix-sdk dependency (slow to compile, large binary)
**Alternatives**:
- Built-in multi-platform support (Telegram bot + Matrix bot + Feishu bot): severe over-engineering
- A single-platform bot: restricts platform choice
### Decision 2: delete the notification-formatting functions
**Choice**: keep no notification-formatting code in agent-fleet.
**Rationale**:
- agent-fleet returns structured JSON; formatting it into human-readable text is the Agent's job
- Platforms differ in presentation capabilities: Telegram supports Markdown, Feishu supports cards, Matrix supports threads
- Keeping formatting functions would invite more presentation logic into agent-fleet over time
### Decision 3: add task query and retry APIs
**Choice**: `GET /api/v1/tasks` + `POST /api/v1/tasks/{id}/retry`
**Rationale**:
- Agents need to query task status to present it to humans
- Agents need to trigger retries in response to human commands
- RESTful style, consistent with the existing API
## Risks / Trade-offs
- **[Latency] Agents must poll for state changes** → acceptable for Phase 1; add SSE or webhooks in Phase 2
- **[Duplicated effort] every Agent writes its own notification formatting** → each Agent knows its platform's presentation capabilities best, so this is not really duplication
## Migration Plan
1. Delete the `src/integrations/matrix/` directory
2. Delete the notification-formatting code
3. Add the API endpoints and the EventStore query method
4. Clean up config and dependencies
5. Update the tests
**Rollback**: recoverable from git history.
## Open Questions
- Does Phase 2 need webhook/SSE push, or is Agent heartbeat polling sufficient?
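The Phase-1 polling trade-off above can be sketched from the Agent side. A minimal sketch, assuming the agent keeps a set of previously seen `(task_id, status)` pairs and diffs each poll of `GET /api/v1/tasks` against it; `TaskView` and `detect_changes` are illustrative names, not part of agent-fleet:

```rust
use std::collections::HashSet;

// Minimal shape of a task row from `GET /api/v1/tasks`
// (only the fields this sketch needs).
#[derive(Clone)]
struct TaskView {
    task_id: String,
    status: String,
}

// One polling pass: record every (task_id, status) pair and return the
// tasks whose pair was not seen before. The real agent would format
// these changes for its own chat platform.
fn detect_changes(seen: &mut HashSet<(String, String)>, snapshot: &[TaskView]) -> Vec<TaskView> {
    let mut changed = Vec::new();
    for t in snapshot {
        if seen.insert((t.task_id.clone(), t.status.clone())) {
            changed.push(t.clone());
        }
    }
    changed
}

fn main() {
    let mut seen = HashSet::new();
    let first = vec![TaskView { task_id: "org/repo#42".into(), status: "running".into() }];
    assert_eq!(detect_changes(&mut seen, &first).len(), 1);
    // Re-polling an unchanged snapshot yields nothing to notify.
    assert_eq!(detect_changes(&mut seen, &first).len(), 0);
    // A status flip shows up as exactly one change.
    let second = vec![TaskView { task_id: "org/repo#42".into(), status: "completed".into() }];
    assert_eq!(detect_changes(&mut seen, &second).len(), 1);
}
```

A webhook or SSE push in Phase 2 would replace the outer poll, but the same diffing would still decide what to surface.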


@@ -0,0 +1,38 @@
## Why
agent-fleet currently implements a standalone Matrix bot connection (matrix-sdk) that handles ChatOps commands and notification pushes directly. This violates the separation-of-concerns principle:
1. **Session conflict**: `@jeeves:0x08.org` is already taken by OpenClaw
2. **Architectural redundancy**: agent-fleet reimplements message send/receive, command routing, and notification push
3. **Against the minimalism principle**: agent-fleet's core value is the orchestration engine (task queue, state machine, receipt verification), not a chat interface
4. **Platform lock-in**: the Matrix protocol is hard-coded, yet agent-fleet as a pure API service should be platform-agnostic
agent-fleet should be a pure HTTP API service that neither knows nor cares that chat platforms exist. Platform integration (Telegram, Matrix, Feishu, etc.) is entirely an Agent-side concern:
- OpenClaw/Jeeves connects to multiple platforms itself
- Hermes Agent connects to Telegram or Matrix itself
- Claude Code / Codex / OpenCode connect via extension tools such as cc-connect
## What Changes
- **BREAKING**: remove the standalone Matrix bot implementation in `src/integrations/matrix/`
- Remove the notification-formatting functions (not the orchestration engine's responsibility)
- Add the `GET /api/v1/tasks` endpoint (task list query with filtering)
- Add the `POST /api/v1/tasks/{id}/retry` endpoint (task retry)
- Remove the `[matrix]` section from `config.toml`
- Remove the `matrix-sdk` dependency from `Cargo.toml`
## Capabilities
### New Capabilities
- `task-api-endpoints`: HTTP API endpoints for task queries and operations (`GET /api/v1/tasks`, `POST /api/v1/tasks/{id}/retry`), returning structured JSON
### Modified Capabilities
- `matrix-chatops`: remove the standalone bot connection and the ChatOps features. The Orchestrator only exposes an HTTP API and has no chat-platform integration. After this revision the capability effectively becomes "the Orchestrator API exposes enough information for external Agents to present".
## Impact
- **Code**: delete the `src/integrations/matrix/` directory and its references
- **API**: 2 new endpoints, no breaking API changes
- **Config**: remove the `[matrix]` section from `config.toml`
- **Dependencies**: remove `matrix-sdk` from `Cargo.toml`
- **Responsibility boundary**: agent-fleet only returns structured JSON and does not care how it is presented to humans


@@ -0,0 +1,31 @@
## MODIFIED Requirements
### Requirement: Orchestrator exposes status via API only
The Orchestrator SHALL expose all task and Agent state via the HTTP API and SHALL NOT connect to any external chat or notification platform. Platform integration is handled by each Agent.
#### Scenario: Task state change queryable via API
- **WHEN** task #42 changes state (assigned, completed, failed, etc.)
- **THEN** the change SHALL be queryable via the `GET /api/v1/tasks` API
- **AND** the change SHALL be recorded in the task_events table
#### Scenario: Agent state change queryable via API
- **WHEN** Agent `worker-03` changes state (online, offline, etc.)
- **THEN** the change SHALL be queryable via the `GET /api/v1/agents` API
## REMOVED Requirements
### Requirement: Matrix room as coordination channel
**Reason**: The Orchestrator connects to no chat platform; platform integration is an Agent-side responsibility.
**Migration**: Each Agent (OpenClaw/Jeeves, Hermes, etc.) connects to its chat platform itself and presents state fetched from the Orchestrator HTTP API.
### Requirement: Slash commands for orchestration
**Reason**: The Orchestrator does not handle commands; command parsing and routing is each Agent's job.
**Migration**: Each Agent parses user commands and calls the Orchestrator HTTP API to perform operations.
### Requirement: Matrix notifications for receipts
**Reason**: The Orchestrator pushes no notifications; each Agent notifies according to its platform's capabilities.
**Migration**: Agents poll the API (or use a future webhook) for receipt state changes and choose their own notification channel and format.
### Requirement: Per-agent Matrix thread
**Reason**: The Orchestrator knows nothing about chat-platform concepts; presentation is decided by each Agent based on its platform's capabilities.
**Migration**: Agents fetch task history via `GET /api/v1/tasks?agent_id=xxx` and choose their own presentation.


@@ -0,0 +1,29 @@
## ADDED Requirements
### Requirement: Task list API endpoint
The Orchestrator SHALL provide a `GET /api/v1/tasks` endpoint that returns the task list as structured JSON.
#### Scenario: List all tasks
- **WHEN** `GET /api/v1/tasks` is sent
- **THEN** it SHALL return a JSON array where each item contains: task_id, source, task_type, priority, status, assigned_agent_id, retry_count, max_retries, created_at, assigned_at, started_at, completed_at
#### Scenario: Filter by status
- **WHEN** `GET /api/v1/tasks?status=running` is sent
- **THEN** it SHALL return only tasks whose status is `running`
#### Scenario: Filter by agent
- **WHEN** `GET /api/v1/tasks?agent_id=worker-03` is sent
- **THEN** it SHALL return only tasks whose assigned_agent_id is `worker-03`
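For illustration only, a `GET /api/v1/tasks?status=running` response satisfying these scenarios might look like the following; the field values are made up, and the exact serialization of priority and timestamps is an assumption — only the field set is the one required above:

```json
[
  {
    "task_id": "org/repo#42",
    "source": "forgejo:org/repo#42",
    "task_type": "code",
    "priority": "high",
    "status": "running",
    "assigned_agent_id": "worker-03",
    "retry_count": 0,
    "max_retries": 2,
    "created_at": "2026-05-12T02:00:00Z",
    "assigned_at": "2026-05-12T02:01:00Z",
    "started_at": "2026-05-12T02:02:00Z",
    "completed_at": null
  }
]
```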
### Requirement: Task retry API endpoint
The Orchestrator SHALL provide a `POST /api/v1/tasks/{task_id}/retry` endpoint.
#### Scenario: Retry a failed task
- **WHEN** `POST /api/v1/tasks/org/repo#42/retry` is sent
- **AND** the task's current status is `failed` or `agent_lost`
- **THEN** it SHALL transition the task to `assigned` and return the updated task JSON
#### Scenario: Retry a non-retryable task
- **WHEN** `POST /api/v1/tasks/org/repo#42/retry` is sent
- **AND** the task's current status is neither `failed` nor `agent_lost`
- **THEN** it SHALL return a 400 error stating the task is not in a retryable state
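The retryable-state rule above reduces to a small predicate. A simplified sketch (a standalone enum for illustration; the real handler checks the full Task record):

```rust
// Task states relevant to the retry endpoint (names mirror the
// statuses used in the scenarios above).
#[derive(Clone, Copy, PartialEq, Debug)]
enum TaskStatus {
    Created,
    Assigned,
    Running,
    Completed,
    Failed,
    AgentLost,
}

// A retry is only legal from the two terminal failure states;
// every other state maps to the 400 response described above.
fn is_retryable(status: TaskStatus) -> bool {
    matches!(status, TaskStatus::Failed | TaskStatus::AgentLost)
}

fn main() {
    assert!(is_retryable(TaskStatus::Failed));
    assert!(is_retryable(TaskStatus::AgentLost));
    assert!(!is_retryable(TaskStatus::Running));
    assert!(!is_retryable(TaskStatus::Completed));
    assert!(!is_retryable(TaskStatus::Created));
    assert!(!is_retryable(TaskStatus::Assigned));
}
```

Centralizing the check in one predicate keeps the API handler and any future retry paths in agreement about what "retryable" means.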


@@ -0,0 +1,22 @@
## 1. Remove the Matrix bot and notification-formatting code
- [ ] 1.1 Delete the `src/integrations/matrix/` directory
- [ ] 1.2 Remove `pub mod matrix;` from `src/integrations/mod.rs`
- [ ] 1.3 Remove the Matrix bot startup logic from `src/main.rs`
- [ ] 1.4 Remove the `matrix-sdk` dependency from `Cargo.toml`
- [ ] 1.5 Remove the `[matrix]` section from `config.example.toml`
- [ ] 1.6 Remove the `MatrixConfig` struct and related fields from `src/config.rs`
- [ ] 1.7 Delete the notification-formatting functions and command-parsing code (where exported from the matrix module)
## 2. Add the HTTP API endpoints
- [ ] 2.1 Implement `GET /api/v1/tasks`: return the task list as JSON, with filtering via `status` and `agent_id` query parameters
- [ ] 2.2 Implement `POST /api/v1/tasks/{task_id}/retry`: re-enqueue failed/agent_lost tasks; return 400 for non-retryable states
- [ ] 2.3 Add a `list_tasks(status: Option<&str>, agent_id: Option<&str>)` query method to EventStore
- [ ] 2.4 Register the new routes in `src/main.rs`
## 3. Tests and verification
- [ ] 3.1 `cargo check` passes (no matrix-sdk dependency)
- [ ] 3.2 `cargo test` fully passes (matrix-related tests removed, new API endpoint tests added)
- [ ] 3.3 Add API endpoint tests: task-list filtering, successful retry, and rejected-retry scenarios


@@ -379,6 +379,58 @@ fn parse_task_source(source: &str) -> Option<(String, u64)> {
Some((repo.to_string(), issue_number))
}
#[derive(Debug, Deserialize)]
pub struct ListTasksQuery {
pub status: Option<String>,
pub agent_id: Option<String>,
}
pub async fn list_tasks(
State(state): State<AppState>,
Query(query): Query<ListTasksQuery>,
) -> Result<Json<Vec<Task>>, ApiError> {
let store = state.store.clone();
tokio::task::spawn_blocking(move || -> Result<Json<Vec<Task>>, ApiError> {
let store = store.lock().map_err(|e| ApiError::Poisoned(e.to_string()))?;
let tasks = store.list_tasks(query.status.as_deref(), query.agent_id.as_deref())?;
Ok(Json(tasks))
})
.await?
}
pub async fn retry_task(
State(state): State<AppState>,
axum::extract::Path(task_id): axum::extract::Path<String>,
) -> Result<Json<Task>, ApiError> {
let store = state.store.clone();
let sm = StateMachine::new(store.clone());
let task_id_for_check = task_id.clone();
let current = tokio::task::spawn_blocking(move || -> Result<Option<Task>, ApiError> {
let store = store.lock().map_err(|e| ApiError::Poisoned(e.to_string()))?;
Ok(store.read_task(&task_id_for_check)?)
})
.await??;
let task = current.ok_or_else(|| ApiError::NotFound(format!("task {}", task_id)))?;
if !matches!(task.status, TaskStatus::Failed | TaskStatus::AgentLost) {
return Err(ApiError::BadRequest(format!(
"task {} is not retryable (current status: {})",
task.task_id,
task.status.as_str()
)));
}
let updated = sm
.transition(&task_id, TaskStatus::Assigned, None, "retry")
.await
.map_err(|e| ApiError::BadRequest(e.to_string()))?;
Ok(Json(updated))
}
#[cfg(test)]
mod tests {
use super::*;
@@ -734,4 +786,149 @@ mod tests {
};
assert_eq!(task.status, TaskStatus::Running);
}
// ─── Task API tests ─────────────────────────────────────────
fn sample_task_variant(task_id: &str, status: TaskStatus, agent_id: Option<&str>) -> Task {
Task {
task_id: task_id.to_string(),
source: format!("forgejo:org/repo#{task_id}"),
task_type: "code".into(),
priority: Priority::High,
status,
assigned_agent_id: agent_id.map(String::from),
requirements: "do something".into(),
labels: vec!["agent:code".into(), "priority:high".into()],
created_at: Utc::now(),
assigned_at: None,
started_at: None,
completed_at: None,
retry_count: 0,
max_retries: 2,
timeout_seconds: 1800,
}
}
#[tokio::test]
async fn list_tasks_returns_all_tasks() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-1", TaskStatus::Created, None)).unwrap();
store.insert_task(&sample_task_variant("task-2", TaskStatus::Running, Some("worker-01"))).unwrap();
}
let tasks = list_tasks(
State(state),
Query(ListTasksQuery { status: None, agent_id: None }),
)
.await
.unwrap();
assert_eq!(tasks.0.len(), 2);
}
#[tokio::test]
async fn list_tasks_filters_by_status() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-1", TaskStatus::Created, None)).unwrap();
store.insert_task(&sample_task_variant("task-2", TaskStatus::Running, Some("worker-01"))).unwrap();
}
let tasks = list_tasks(
State(state),
Query(ListTasksQuery { status: Some("running".into()), agent_id: None }),
)
.await
.unwrap();
assert_eq!(tasks.0.len(), 1);
assert_eq!(tasks.0[0].task_id, "task-2");
assert_eq!(tasks.0[0].status, TaskStatus::Running);
}
#[tokio::test]
async fn list_tasks_filters_by_agent() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-1", TaskStatus::Running, Some("worker-01"))).unwrap();
store.insert_task(&sample_task_variant("task-2", TaskStatus::Running, Some("worker-02"))).unwrap();
}
let tasks = list_tasks(
State(state),
Query(ListTasksQuery { status: None, agent_id: Some("worker-01".into()) }),
)
.await
.unwrap();
assert_eq!(tasks.0.len(), 1);
assert_eq!(tasks.0[0].task_id, "task-1");
}
#[tokio::test]
async fn retry_task_succeeds_for_failed_task() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-failed", TaskStatus::Failed, Some("worker-01"))).unwrap();
}
let updated = retry_task(State(state.clone()), axum::extract::Path("task-failed".to_string()))
.await
.unwrap();
assert_eq!(updated.0.status, TaskStatus::Assigned);
// Verify in DB
let task = {
let store = state.store.lock().unwrap();
store.read_task("task-failed").unwrap().unwrap()
};
assert_eq!(task.status, TaskStatus::Assigned);
}
#[tokio::test]
async fn retry_task_succeeds_for_agent_lost_task() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-lost", TaskStatus::AgentLost, Some("worker-01"))).unwrap();
}
let updated = retry_task(State(state.clone()), axum::extract::Path("task-lost".to_string()))
.await
.unwrap();
assert_eq!(updated.0.status, TaskStatus::Assigned);
}
#[tokio::test]
async fn retry_task_rejects_non_retryable_status() {
let (_dir, state) = test_store();
{
let store = state.store.lock().unwrap();
store.insert_task(&sample_task_variant("task-running", TaskStatus::Running, Some("worker-01"))).unwrap();
}
let err = retry_task(State(state.clone()), axum::extract::Path("task-running".to_string()))
.await
.unwrap_err();
assert!(matches!(err, ApiError::BadRequest(_)));
}
#[tokio::test]
async fn retry_task_returns_not_found_for_missing_task() {
let (_dir, state) = test_store();
let err = retry_task(State(state), axum::extract::Path("nonexistent".to_string()))
.await
.unwrap_err();
assert!(matches!(err, ApiError::NotFound(_)));
}
}

src/config.rs

@@ -6,7 +6,6 @@ use crate::adapters::AdapterInstanceConfig;
pub struct Config {
pub server: ServerConfig,
pub forgejo: ForgejoConfig,
-pub matrix: MatrixConfig,
pub orchestrator: OrchestratorConfig,
#[serde(default)]
pub adapters: Vec<AdapterInstanceConfig>,
@@ -25,14 +24,6 @@ pub struct ForgejoConfig {
pub webhook_secret: String,
}

-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct MatrixConfig {
-pub homeserver_url: String,
-pub user_id: String,
-pub access_token: String,
-pub room_id: String,
-}
-
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct OrchestratorConfig {
pub db_path: String,
@@ -54,12 +45,6 @@ impl Default for Config {
token: String::new(),
webhook_secret: String::new(),
},
-matrix: MatrixConfig {
-homeserver_url: "https://matrix.0x08.org".into(),
-user_id: "@jeeves:0x08.org".into(),
-access_token: String::new(),
-room_id: String::new(),
-},
orchestrator: OrchestratorConfig {
db_path: "data/agent-fleet.db".into(),
heartbeat_interval_secs: 60,

src/core/event_store.rs

@@ -275,6 +275,36 @@ impl EventStore {
.collect::<SqlResult<Vec<_>>>()
}
pub fn list_tasks(
&self,
status: Option<&str>,
agent_id: Option<&str>,
) -> SqlResult<Vec<Task>> {
let mut sql = String::from(
"SELECT task_id, source, task_type, priority, status, assigned_agent_id,
requirements, labels, created_at, assigned_at, started_at, completed_at,
retry_count, max_retries, timeout_seconds
FROM tasks WHERE 1=1",
);
let mut param_values: Vec<Box<dyn rusqlite::types::ToSql>> = Vec::new();
if let Some(s) = status {
sql.push_str(" AND status = ?");
param_values.push(Box::new(s.to_string()));
}
if let Some(a) = agent_id {
sql.push_str(" AND assigned_agent_id = ?");
param_values.push(Box::new(a.to_string()));
}
sql.push_str(" ORDER BY created_at DESC");
let params: Vec<&dyn rusqlite::types::ToSql> = param_values.iter().map(|p| p.as_ref()).collect();
let mut stmt = self.conn.prepare(&sql)?;
stmt.query_map(params.as_slice(), Self::row_to_task)?
.collect::<SqlResult<Vec<_>>>()
}
// ─── Task/event write operations ─────────────────────────────
pub fn insert_task(&self, task: &Task) -> SqlResult<()> {

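The dynamic-filter pattern used by `list_tasks` can be isolated as a pure function for clarity. A sketch under the assumption that placeholders and their values are collected side by side and later bound through rusqlite; `build_list_tasks_query` is an illustrative name, not part of the codebase:

```rust
// Build the SQL text and the ordered parameter list for the optional
// status / agent_id filters. Values are never spliced into the SQL
// string itself — each filter appends a `?` placeholder and pushes the
// matching value, keeping the query parameterized.
fn build_list_tasks_query(status: Option<&str>, agent_id: Option<&str>) -> (String, Vec<String>) {
    let mut sql = String::from("SELECT * FROM tasks WHERE 1=1");
    let mut params = Vec::new();
    if let Some(s) = status {
        sql.push_str(" AND status = ?");
        params.push(s.to_string());
    }
    if let Some(a) = agent_id {
        sql.push_str(" AND assigned_agent_id = ?");
        params.push(a.to_string());
    }
    sql.push_str(" ORDER BY created_at DESC");
    (sql, params)
}

fn main() {
    let (sql, params) = build_list_tasks_query(Some("running"), None);
    assert!(sql.contains("status = ?"));
    assert!(!sql.contains("assigned_agent_id"));
    assert_eq!(params, vec!["running".to_string()]);

    let (sql, params) = build_list_tasks_query(None, Some("worker-03"));
    assert!(sql.ends_with("ORDER BY created_at DESC"));
    assert_eq!(params, vec!["worker-03".to_string()]);
}
```

The `WHERE 1=1` prefix is a common trick that lets every filter be appended uniformly with `AND`, regardless of whether it is the first one.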
src/integrations/matrix/ (removed)

@@ -1,531 +0,0 @@
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use matrix_sdk::authentication::matrix::MatrixSession;
use matrix_sdk::config::SyncSettings;
use matrix_sdk::ruma::events::room::message::{
MessageType, OriginalSyncRoomMessageEvent, RoomMessageEventContent,
};
use matrix_sdk::ruma::events::relation::Thread;
use matrix_sdk::ruma::OwnedEventId;
use matrix_sdk::ruma::{OwnedDeviceId, OwnedRoomId, OwnedUserId, UserId};
use matrix_sdk::{Client, Room};
use tokio::sync::RwLock;
use crate::config::MatrixConfig;
use crate::core::event_store::EventStore;
use crate::core::models::{Agent, TaskStatus};
use crate::core::state_machine::StateMachine;
/// The bot context — shared state for all handlers.
#[derive(Clone)]
pub struct BotContext {
pub store: Arc<Mutex<EventStore>>,
pub sm: Arc<StateMachine>,
pub config: MatrixConfig,
/// Maps agent_id → root event_id for per-agent threads.
pub agent_threads: Arc<RwLock<HashMap<String, OwnedEventId>>>,
}
// ─── Command parsing ────────────────────────────────────────────
/// A parsed Matrix command.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BotCommand {
FleetStatus,
Assign { agent_id: String, issue_ref: String },
Retry { issue_ref: String },
Unknown(String),
}
/// Parse a Matrix message body into a BotCommand.
/// Only returns Some for known command prefixes: fleet, assign, retry.
pub fn parse_command(body: &str) -> Option<BotCommand> {
let body = body.trim();
// Accept both "/fleet status" and "fleet status" (in case the / is stripped)
let body = body.strip_prefix('/').unwrap_or(body);
let mut parts = body.splitn(4, ' ');
let first = parts.next()?.to_lowercase();
match first.as_str() {
"fleet" => {
let sub = parts.next()?.to_lowercase();
match sub.as_str() {
"status" => Some(BotCommand::FleetStatus),
_ => Some(BotCommand::Unknown(format!("fleet {sub}"))),
}
}
"assign" => {
let agent_id = parts.next()?.to_string();
let issue_ref = parts.next()?.to_string();
Some(BotCommand::Assign { agent_id, issue_ref })
}
"retry" => {
let issue_ref = parts.next()?.to_string();
Some(BotCommand::Retry { issue_ref })
}
_ => None,
}
}
// ─── Notification formatting ────────────────────────────────────
/// Format a task-assigned notification.
///
/// Spec: `📋 #42 → worker-03 [code:typescript]`
pub fn format_task_assigned(task_id: &str, agent_id: &str, task_type: &str) -> String {
format!("📋 {task_id}{agent_id} [{task_type}]")
}
/// Format a task-completed notification.
///
/// Spec: `✅ #42 completed by worker-03 — PR #15 — "修复登录验证 bug"`
pub fn format_task_completed(task_id: &str, agent_id: &str, summary: &str, artifact_hint: Option<&str>) -> String {
match artifact_hint {
Some(hint) => format!("✅ {task_id} completed by {agent_id} — {hint} — \"{summary}\""),
None => format!("✅ {task_id} completed by {agent_id} — \"{summary}\""),
}
}
/// Format a task-failed notification.
///
/// Spec: `❌ #42 failed — worker-03 — "构建超时"`
pub fn format_task_failed(task_id: &str, agent_id: &str, error: &str) -> String {
format!("❌ {task_id} failed — {agent_id} — \"{error}\"")
}
/// Format an agent-offline alert.
///
/// Spec: `⚠️ worker-03 offline — 2 running tasks affected`
pub fn format_agent_offline(agent_id: &str, affected_tasks: usize) -> String {
format!("⚠️ {agent_id} offline — {affected_tasks} running tasks affected")
}
/// Format the fleet status table as plain text.
pub fn format_fleet_status(agents: &[Agent]) -> String {
if agents.is_empty() {
return "No agents registered.".to_string();
}
let mut lines = vec!["Agent ID | Type | Status | Tasks | Capabilities".to_string()];
lines.push("-----------------+--------------+---------+-------+---------------------------".to_string());
for agent in agents {
let caps = agent.capabilities.join(", ");
let caps_display = if caps.len() > 27 {
format!("{}…", &caps[..24])
} else {
caps
};
lines.push(format!(
"{:<16} | {:<12} | {:<7} | {:<5} | {}",
truncate_str(&agent.agent_id, 16),
truncate_str(agent.agent_type.as_str(), 12),
agent.status.as_str(),
agent.current_tasks,
caps_display,
));
}
lines.join("\n")
}
fn truncate_str(s: &str, max_len: usize) -> String {
if s.len() <= max_len {
s.to_string()
} else {
format!("{}…", &s[..max_len - 1])
}
}
// ─── Bot lifecycle ──────────────────────────────────────────────
/// Build a Matrix client from config.
pub async fn build_client(config: &MatrixConfig) -> Result<Client, Box<dyn std::error::Error + Send + Sync>> {
let client = Client::builder()
.homeserver_url(&config.homeserver_url)
.build()
.await?;
let user_id = UserId::parse(&config.user_id)?;
let device_id: OwnedDeviceId = "AGENTFLEET".into();
let session = MatrixSession {
meta: matrix_sdk::SessionMeta {
user_id,
device_id,
},
tokens: matrix_sdk::authentication::matrix::MatrixSessionTokens {
access_token: config.access_token.clone(),
refresh_token: None,
},
};
client.restore_session(session).await?;
Ok(client)
}
/// Start the Matrix bot — spawns a background sync loop.
pub async fn start_bot(
config: MatrixConfig,
store: Arc<Mutex<EventStore>>,
sm: Arc<StateMachine>,
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
let room_id = OwnedRoomId::try_from(config.room_id.as_str())?;
let client = build_client(&config).await?;
// Join the room
let _ = client.join_room_by_id(&room_id).await?;
let ctx = BotContext {
store,
sm,
config,
agent_threads: Arc::new(RwLock::new(HashMap::new())),
};
// Register the message handler
client.add_event_handler(move |ev: OriginalSyncRoomMessageEvent, room: Room| {
let ctx = ctx.clone();
async move {
handle_message(ev, room, ctx).await;
}
});
// Spawn the sync loop in background
tokio::spawn(async move {
loop {
match client.sync(SyncSettings::default()).await {
Ok(_) => {}
Err(e) => {
tracing::error!("Matrix sync error: {e}");
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
}
}
}
});
tracing::info!("Matrix bot started, monitoring room {}", room_id);
Ok(())
}
/// Handle an incoming Matrix room message.
async fn handle_message(
ev: OriginalSyncRoomMessageEvent,
room: Room,
ctx: BotContext,
) {
// Only handle text messages
let text = match &ev.content.msgtype {
MessageType::Text(text) => text.body.clone(),
_ => return,
};
// Skip our own messages
if ev.sender == ctx.config.user_id.parse::<OwnedUserId>().unwrap_or_else(|_| ev.sender.clone()) {
return;
}
let Some(cmd) = parse_command(&text) else {
return;
};
tracing::debug!(?cmd, "received matrix command");
match cmd {
BotCommand::FleetStatus => {
let agents = {
let store = ctx.store.lock().unwrap();
store.list_agents(None, None).unwrap_or_default()
};
let table = format_fleet_status(&agents);
send_plain(&room, &table).await;
}
BotCommand::Assign { agent_id, issue_ref } => {
let result = {
let store = ctx.store.lock().unwrap();
let task = store.read_task(&issue_ref);
let agent = store.find_agent_by_id(&agent_id);
(task, agent)
};
match result {
(Ok(Some(_task)), Ok(Some(_agent))) => {
match ctx.sm.transition(&issue_ref, TaskStatus::Assigned, Some(&agent_id), "manual assign via matrix").await {
Ok(updated) => {
let msg = format_task_assigned(&updated.task_id, &agent_id, &updated.task_type);
send_plain(&room, &msg).await;
}
Err(e) => {
send_plain(&room, &format!("❌ Failed to assign: {e}")).await;
}
}
}
(Ok(None), _) => {
send_plain(&room, &format!("Task not found: {issue_ref}")).await;
}
(_, Ok(None)) => {
send_plain(&room, &format!("Agent not found: {agent_id}")).await;
}
(Err(e), _) | (_, Err(e)) => {
send_plain(&room, &format!("Database error: {e}")).await;
}
}
}
BotCommand::Retry { issue_ref } => {
let task_status = {
let store = ctx.store.lock().unwrap();
store.read_task(&issue_ref).ok().flatten()
};
let Some(task) = task_status else {
send_plain(&room, &format!("Task not found: {issue_ref}")).await;
return;
};
if !matches!(task.status, TaskStatus::Failed | TaskStatus::AgentLost) {
send_plain(&room, &format!("Task {issue_ref} is not in a retryable state (current: {})", task.status.as_str())).await;
return;
}
// Reset to created (re-enqueue)
match ctx.sm.transition(&issue_ref, TaskStatus::Assigned, None, "retry via matrix").await {
Ok(_) => {
send_plain(&room, &format!("🔄 Task {issue_ref} re-queued for assignment")).await;
}
Err(e) => {
send_plain(&room, &format!("❌ Failed to retry: {e}")).await;
}
}
}
BotCommand::Unknown(cmd) => {
send_plain(&room, &format!("Unknown command: `{cmd}`. Available: /fleet status, /assign <agent> <issue>, /retry <issue>")).await;
}
}
}
/// Send a plain text message to a room.
async fn send_plain(room: &Room, body: &str) {
let content = RoomMessageEventContent::text_plain(body);
if let Err(e) = room.send(content).await {
tracing::error!("Failed to send Matrix message: {e}");
}
}
/// Send a message as a thread reply (per-agent thread).
pub async fn send_thread_message(
room: &Room,
thread_root_event_id: &ruma::OwnedEventId,
body: &str,
) {
let mut content = RoomMessageEventContent::text_plain(body);
content.relates_to = Some(matrix_sdk::ruma::events::room::message::Relation::Thread(
Thread::without_fallback(thread_root_event_id.clone()),
));
if let Err(e) = room.send(content).await {
tracing::error!("Failed to send thread message: {e}");
}
}
/// Post a notification for a task event to the Matrix room.
/// This is a helper meant to be called from the API layer.
pub async fn notify_task_event(
room: &Room,
event_type: &str,
task_id: &str,
agent_id: &str,
task_type: &str,
summary: Option<&str>,
error: Option<&str>,
artifact_hint: Option<&str>,
) {
let msg = match event_type {
"task.assigned" => format_task_assigned(task_id, agent_id, task_type),
"task.completed" => format_task_completed(
task_id,
agent_id,
summary.unwrap_or(""),
artifact_hint,
),
"task.failed" => format_task_failed(
task_id,
agent_id,
error.unwrap_or("unknown error"),
),
_ => return,
};
send_plain(room, &msg).await;
}
/// Post an agent-offline alert to the Matrix room.
pub async fn notify_agent_offline(room: &Room, agent_id: &str, affected_tasks: usize) {
let msg = format_agent_offline(agent_id, affected_tasks);
send_plain(room, &msg).await;
}
#[cfg(test)]
mod tests {
use super::*;
use crate::core::models::{Agent, AgentType, AgentStatus};
use chrono::Utc;
use std::collections::HashMap;
// ─── Command parsing tests ──────────────────────────────────
#[test]
fn parse_fleet_status() {
assert_eq!(parse_command("/fleet status"), Some(BotCommand::FleetStatus));
assert_eq!(parse_command("fleet status"), Some(BotCommand::FleetStatus));
assert_eq!(parse_command("/FLEET STATUS"), Some(BotCommand::FleetStatus));
assert_eq!(parse_command(" /fleet status "), Some(BotCommand::FleetStatus));
}
#[test]
fn parse_assign_command() {
assert_eq!(
parse_command("/assign worker-03 org/repo#42"),
Some(BotCommand::Assign {
agent_id: "worker-03".into(),
issue_ref: "org/repo#42".into(),
})
);
assert_eq!(
parse_command("assign worker-03 org/repo#42"),
Some(BotCommand::Assign {
agent_id: "worker-03".into(),
issue_ref: "org/repo#42".into(),
})
);
}
#[test]
fn parse_retry_command() {
assert_eq!(
parse_command("/retry org/repo#42"),
Some(BotCommand::Retry {
issue_ref: "org/repo#42".into(),
})
);
assert_eq!(
parse_command("retry #42"),
Some(BotCommand::Retry {
issue_ref: "#42".into(),
})
);
}
#[test]
fn parse_unknown_command() {
// Non-fleet/assign/retry prefixes return None (not a bot command)
assert_eq!(parse_command("/deploy prod"), None);
}
#[test]
fn parse_non_command_returns_none() {
assert_eq!(parse_command("hello world"), None);
assert_eq!(parse_command(""), None);
assert_eq!(parse_command("just chatting"), None);
}
#[test]
fn parse_fleet_subcommand_unknown() {
assert_eq!(
parse_command("/fleet deploy"),
Some(BotCommand::Unknown("fleet deploy".into()))
);
}
// ─── Notification formatting tests ──────────────────────────
#[test]
fn task_assigned_format() {
let msg = format_task_assigned("org/repo#42", "worker-03", "code:typescript");
assert_eq!(msg, "📋 org/repo#42 → worker-03 [code:typescript]");
}
#[test]
fn task_completed_format_with_artifact() {
let msg = format_task_completed(
"org/repo#42",
"worker-03",
"修复登录验证 bug",
Some("PR #15"),
);
assert_eq!(
msg,
"✅ org/repo#42 completed by worker-03 — PR #15 — \"修复登录验证 bug\""
);
}
#[test]
fn task_completed_format_without_artifact() {
let msg = format_task_completed(
"org/repo#42",
"worker-03",
"fixed the thing",
None,
);
assert_eq!(msg, "✅ org/repo#42 completed by worker-03 — \"fixed the thing\"");
}
#[test]
fn task_failed_format() {
let msg = format_task_failed("org/repo#42", "worker-03", "构建超时");
assert_eq!(msg, "❌ org/repo#42 failed — worker-03 — \"构建超时\"");
}
#[test]
fn agent_offline_format() {
let msg = format_agent_offline("worker-03", 2);
assert_eq!(msg, "⚠️ worker-03 offline — 2 running tasks affected");
}
// ─── Fleet status table tests ───────────────────────────────
fn sample_agent(id: &str, status: AgentStatus, tasks: u32) -> Agent {
Agent {
agent_id: id.to_string(),
agent_type: AgentType::CodexCli,
hostname: "host-01".into(),
capabilities: vec!["code:rust".into(), "review".into()],
max_concurrency: 2,
current_tasks: tasks,
status,
last_heartbeat_at: Utc::now(),
registered_at: Utc::now(),
metadata: HashMap::new(),
}
}
#[test]
fn fleet_status_empty() {
let table = format_fleet_status(&[]);
assert_eq!(table, "No agents registered.");
}
#[test]
fn fleet_status_with_agents() {
let agents = vec![
sample_agent("worker-01", AgentStatus::Online, 1),
sample_agent("worker-02", AgentStatus::Offline, 0),
];
let table = format_fleet_status(&agents);
assert!(table.contains("worker-01"));
assert!(table.contains("worker-02"));
assert!(table.contains("online"));
assert!(table.contains("offline"));
assert!(table.contains("code:rust"));
assert!(table.contains("review"));
}
#[test]
fn fleet_status_table_has_header() {
let agents = vec![sample_agent("w1", AgentStatus::Online, 0)];
let table = format_fleet_status(&agents);
assert!(table.contains("Agent ID"));
assert!(table.contains("Type"));
assert!(table.contains("Status"));
assert!(table.contains("Tasks"));
assert!(table.contains("Capabilities"));
}
}

src/integrations/mod.rs

@@ -1,2 +1 @@
pub mod forgejo;
-pub mod matrix;

src/main.rs

@@ -87,10 +87,15 @@ async fn main() {
let app = axum::Router::new()
.route("/healthz", axum::routing::get(|| async { "ok" }))
+// Agent registry
.route("/api/v1/agents/register", axum::routing::post(api::register_agent))
.route("/api/v1/agents/heartbeat", axum::routing::post(api::heartbeat))
.route("/api/v1/agents/deregister", axum::routing::post(api::deregister))
.route("/api/v1/agents", axum::routing::get(api::list_agents))
+// Task management
+.route("/api/v1/tasks", axum::routing::get(api::list_tasks))
+.route("/api/v1/tasks/{task_id}/retry", axum::routing::post(api::retry_task))
+// Receipts & webhooks
.route("/api/v1/receipts", axum::routing::post(api::submit_receipt))
.route(
"/api/v1/webhooks/forgejo",
@@ -106,21 +111,5 @@ async fn main() {
.expect("failed to bind");
tracing::info!("listening on {}", listener.local_addr().unwrap());

-// Start Matrix bot
-if !config.matrix.access_token.is_empty() && !config.matrix.room_id.is_empty() {
-let matrix_cfg = config.matrix.clone();
-let matrix_store = store.clone();
-let matrix_sm = state_machine.clone();
-tokio::spawn(async move {
-match crate::integrations::matrix::start_bot(matrix_cfg, matrix_store, matrix_sm).await {
-Ok(_) => tracing::info!("Matrix bot stopped"),
-Err(e) => tracing::error!("Matrix bot error: {e}"),
-}
-});
-} else {
-tracing::info!("Matrix bot disabled (no access_token or room_id configured)");
-}
-
axum::serve(listener, app).await.expect("server error");
}