An Agent Message Bus Is Born: Communication Infrastructure for 16 AI Agents
When your AI agent count grows from 3 to 16, communication becomes an urgent problem.
Why a Message Bus?
OpenClaw's built-in agentToAgent communication has a critical limitation: it only works within the same Gateway instance. When the main agent tries to message the learning agent on a different machine, the call fails immediately.
Telegram group relay and Redis Pub/Sub were both considered. The former lacks format control, and the latter is overkill at this scale. Final decision: build a lightweight message bus. Requirements: an HTTP API, multi-agent registration, message persistence, and simplicity.
Tech Stack: Flask + SQLite
Flask because it is familiar and lightweight. SQLite because it needs no additional services, keeps everything in a single file, and is easy to back up. The service is deployed on a T440 (192.168.x.x:8091).
API Design
Six endpoints, keeping it minimal:
- POST /send — Send messages (specific recipient or broadcast)
- GET /inbox — View inbox with unread filtering
- GET /history — View history with time range queries
- POST /ack — Mark messages as read
- GET /agents — List registered agents
- GET /stats — System statistics
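To make the endpoint list more concrete, here is a minimal sketch of the storage logic that /send, /inbox, and /ack could sit on top of. It uses only the standard-library sqlite3 module. The table layout, column names, and function names are assumptions made for illustration, not the actual implementation. A Flask route would be a thin wrapper that parses the request and calls one of these functions. The sketch also assumes that a broadcast skips the sender.

```python
import sqlite3
import time

# Hypothetical schema: the post does not show the real tables or columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS agents (
    name      TEXT PRIMARY KEY,
    last_seen REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    sender     TEXT NOT NULL,
    recipient  TEXT NOT NULL,
    body       TEXT NOT NULL,
    priority   TEXT NOT NULL DEFAULT 'normal',
    reply_to   INTEGER,
    created_at REAL NOT NULL,
    read_at    REAL
);
"""
PRIORITIES = ("normal", "high", "urgent")


def open_bus(path=":memory:"):
    """Open the database and create the tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.executescript(SCHEMA)
    return conn


def register(conn, name):
    """Register an agent, or refresh its last_seen timestamp."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO agents (name, last_seen) VALUES (?, ?)",
            (name, time.time()),
        )


def send(conn, sender, body, to=None, priority="normal", reply_to=None):
    """Store one row per recipient; omitting `to` broadcasts the message."""
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority}")
    if to is None:
        # Assumption: a broadcast goes to every registered agent except the sender.
        rows = conn.execute(
            "SELECT name FROM agents WHERE name != ?", (sender,)
        ).fetchall()
        recipients = [row["name"] for row in rows]
    else:
        recipients = [to]
    now = time.time()
    ids = []
    with conn:  # one transaction for the whole fan-out
        for recipient in recipients:
            cur = conn.execute(
                "INSERT INTO messages "
                "(sender, recipient, body, priority, reply_to, created_at) "
                "VALUES (?, ?, ?, ?, ?, ?)",
                (sender, recipient, body, priority, reply_to, now),
            )
            ids.append(cur.lastrowid)
    return ids


def inbox(conn, agent, unread_only=True):
    """Return the agent's messages, oldest first, optionally unread only."""
    sql = "SELECT * FROM messages WHERE recipient = ?"
    if unread_only:
        sql += " AND read_at IS NULL"
    sql += " ORDER BY id"
    return [dict(row) for row in conn.execute(sql, (agent,))]


def ack(conn, agent, message_ids):
    """Mark the given messages as read; return how many rows changed."""
    if not message_ids:
        return 0
    marks = ",".join("?" * len(message_ids))
    with conn:
        cur = conn.execute(
            f"UPDATE messages SET read_at = ? "
            f"WHERE recipient = ? AND read_at IS NULL AND id IN ({marks})",
            (time.time(), agent, *message_ids),
        )
    return cur.rowcount
```

Storing one row per recipient keeps read state independent for each agent, so one agent acknowledging a broadcast does not hide it from the others.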
Key Features
- Broadcast: omit the to field to send to all registered agents
- Reply chains: the reply_to field tracks conversation context
- Priority: normal, high, and urgent levels
- Read receipts: confirm processing via ack
Pitfalls
- SQLite concurrency: journal_mode=WAL solved occasional lock issues
- Message accumulation: Auto-archive read messages older than 7 days
- Heartbeat: Agents inactive for 5+ minutes marked offline; queued messages auto-pushed on reconnect
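The WAL setting, the 7-day archive rule, and the 5-minute offline threshold described in this post can be sketched with the standard-library sqlite3 module. The table names, column names, and helper functions below are hypothetical. The sketch also assumes that message age is measured from creation time, which the post does not specify. Pushing queued messages on reconnect is not shown.

```python
import sqlite3
import time

ARCHIVE_AFTER_SECONDS = 7 * 24 * 3600  # read messages older than 7 days
OFFLINE_AFTER_SECONDS = 5 * 60         # no heartbeat for 5+ minutes

# Hypothetical minimal schema, just enough to illustrate the rules.
SCHEMA = """
CREATE TABLE IF NOT EXISTS agents (
    name TEXT PRIMARY KEY, last_seen REAL NOT NULL);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY, recipient TEXT NOT NULL, body TEXT NOT NULL,
    created_at REAL NOT NULL, read_at REAL);
CREATE TABLE IF NOT EXISTS messages_archive (
    id INTEGER PRIMARY KEY, recipient TEXT NOT NULL, body TEXT NOT NULL,
    created_at REAL NOT NULL, read_at REAL);
"""


def open_bus_wal(path):
    """Open a file-backed database and switch it to write-ahead logging."""
    # timeout makes a writer wait for a lock instead of failing at once.
    conn = sqlite3.connect(path, timeout=5.0)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.executescript(SCHEMA)
    return conn, mode


def archive_old_read(conn, now=None):
    """Move read messages older than the cutoff into the archive table."""
    now = time.time() if now is None else now
    cutoff = now - ARCHIVE_AFTER_SECONDS
    condition = "read_at IS NOT NULL AND created_at < ?"
    with conn:  # copy and delete in a single transaction
        conn.execute(
            f"INSERT INTO messages_archive SELECT * FROM messages WHERE {condition}",
            (cutoff,),
        )
        cur = conn.execute(f"DELETE FROM messages WHERE {condition}", (cutoff,))
    return cur.rowcount


def offline_agents(conn, now=None):
    """List agents whose last heartbeat is at least 5 minutes old."""
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT name FROM agents WHERE last_seen <= ? ORDER BY name",
        (now - OFFLINE_AFTER_SECONDS,),
    )
    return [row[0] for row in rows]
```

WAL mode lets readers proceed while a single writer is active, which fits a bus where many agents poll their inbox and only a few write at any moment. The journal mode is stored in the database file, so it only needs to be set once, though setting it on every connection is harmless.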
Reflections
An AI Agent's value isn't in individual strength but in efficient collaboration. Flask + SQLite looks "primitive," but for a dozen agents on an internal network, it's simple, stable, and maintainable. The smallest solution that solves the problem is the best solution.