OpenClaw Architecture: Components, Languages, and How They Work Together
OpenClaw is built on a modular component architecture designed to unify multiple messaging platforms while maintaining clear separation of concerns. All of its components run on a single machine. This article breaks down each major component, the programming languages that power them, how they communicate with each other, and how background job scheduling works throughout the system.
The Complete System Architecture
The OpenClaw system consists of five major components working in concert:
1. OpenClaw Gateway Process (Node.js/JavaScript)
2. Channel Adapter Components (Node.js/JavaScript, one per platform)
3. Session Manager and Router (Node.js/JavaScript, runs inside Gateway)
4. Local Agent Executor Service (Any language: Node.js, Python, Go, etc.)
5. Persistence and State Layer (SQLite/File-based, accessed by Gateway)
These components communicate via HTTP, webhooks, and internal function calls. The entire system runs on your local machine or server, with no cloud intermediary required.
Programming Languages Used
Understanding the language stack is crucial to understanding how components interact.
OpenClaw Gateway: Written entirely in Node.js/JavaScript. This is the core orchestrator that listens for messages, routes them, and manages sessions. Node.js was chosen for its event-driven, non-blocking I/O model—essential for handling multiple concurrent connections from different messaging platforms simultaneously without blocking.
Channel Adapters: Also written in Node.js/JavaScript. Each channel (WhatsApp, Telegram, Discord, iMessage) has its own adapter that translates between the platform’s API and OpenClaw’s internal message format. Running them in the same Node.js process as the Gateway reduces inter-process communication overhead.
Session Manager and Router: Part of the Node.js Gateway process. Handles session state, message routing, and timeout management in JavaScript. No separate process needed here—it’s embedded logic.
Local Agent Executor: Language-agnostic. The bundled agent (Pi) is Node.js-based, but you can plug in agents written in any language—Python, Go, Rust, etc. The Gateway communicates with the agent via HTTP RPC calls, so the language doesn’t matter. As long as the agent exposes an HTTP endpoint that accepts messages and returns responses, it works.
Persistence Layer: SQLite (or file-based JSON, depending on configuration). Not a separate process—accessed directly by the Gateway via Node.js libraries.
Component Breakdown: How Each Piece Works
The OpenClaw Gateway Process
The Gateway is the central coordinator. When you start OpenClaw, you’re starting this Node.js process. It does three main jobs simultaneously:
Job 1: Listen to Channels
The Gateway maintains persistent connections (or polling loops) to each enabled messaging platform. For WhatsApp, it listens for webhooks. For Telegram, it can either poll the API or listen to webhooks. For Discord, it maintains a WebSocket connection. For iMessage, it listens to macOS system events.
The Gateway acts like a receptionist—when a message arrives from any platform, it captures it and standardizes it into an internal format. No matter where the message came from, the Gateway now has a consistent object it can work with.
Job 2: Route and Execute
Once a message arrives, the Gateway’s routing logic kicks in. It asks: “Does a session exist for this user and channel combination?” If yes, it reuses the existing session. If no, it creates a new one. The session acts as a container for conversation state, history, and any user-specific configuration.
The Gateway then sends the message to the assigned agent (via HTTP POST request). The agent processes it, runs tools if needed, and returns a response. The Gateway receives this response and prepares to send it back.
Job 3: Send Responses and Manage State
The Gateway routes the agent’s response back to the original channel. It uses the appropriate channel adapter to format the response for that platform’s API and sends it. Meanwhile, it updates the session record with the new message and response, storing them for conversation history.
The Gateway also manages session timeouts. If a session hasn’t had activity for N minutes (configurable), the Gateway marks it for cleanup and eventually destroys it.
Message flow through the Gateway:
[Message arrives from Telegram/WhatsApp/Discord]
↓
[Gateway: Listen, Standardize, Extract user_id, channel]
↓
[Session Manager: Look up session for (user_id, channel)]
↓
[Router: Send to Agent via HTTP POST]
↓
[Agent: Process, Execute Tools, Return Response]
↓
[Gateway: Send Response back to Channel via API]
↓
[Persistence: Save Message and Response to Session History]
Channel Adapter Components
Each messaging platform (WhatsApp, Telegram, Discord, iMessage) has a Channel Adapter. These are thin translation layers—they don’t do business logic; they just handle platform-specific quirks.
The Telegram Channel Adapter
Telegram’s adapter handles several responsibilities. First, it listens for incoming updates from Telegram’s servers. This can be done via webhooks (Telegram calls you) or polling (you ask Telegram every N seconds for new messages). The adapter translates Telegram’s update format into OpenClaw’s internal message format.
Example: Telegram sends a complex JSON object with nested user data, chat data, and message data. The Telegram adapter extracts just what matters—the user ID, the message text, media if present, and the timestamp. It outputs something like:
{
  channel: "telegram",
  user_id: "7123566776",
  text: "Hello OpenClaw",
  media: null,
  timestamp: 1703275200
}
Second, the Telegram adapter handles sending responses. When the Gateway gives it a response to send, the adapter formats it according to Telegram’s API requirements and makes the HTTP call to Telegram’s servers.
Third, it respects Telegram-specific constraints—rate limits, message size limits, supported media types, etc.
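The translation step can be sketched as a single function. The Telegram-side field names (message.from.id, message.text, message.date) follow Telegram's Bot API Update object. The function name normalizeTelegramUpdate, the media shape, and the decision to ignore non-message updates are assumptions made for illustration, not OpenClaw's actual code.

```javascript
// Hypothetical helper: converts a Telegram Bot API update into the internal
// message shape shown above. Not taken from OpenClaw's source.
function normalizeTelegramUpdate(update) {
  const msg = update.message;
  if (!msg || !msg.from) {
    return null; // e.g. edited messages or channel posts; ignored in this sketch
  }
  return {
    channel: "telegram",
    user_id: String(msg.from.id), // Telegram sends numeric IDs
    text: msg.text ?? msg.caption ?? "",
    media: msg.photo ? { type: "photo", sizes: msg.photo } : null, // assumed shape
    timestamp: msg.date, // Unix time in seconds
  };
}

const example = normalizeTelegramUpdate({
  update_id: 1,
  message: {
    message_id: 10,
    from: { id: 7123566776, is_bot: false, first_name: "Alice" },
    chat: { id: 7123566776, type: "private" },
    date: 1703275200,
    text: "Hello OpenClaw",
  },
});
console.log(example);
```

Returning null for updates the sketch does not understand keeps the rest of the pipeline from having to know about Telegram's many update types.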
Other Channel Adapters
WhatsApp’s adapter handles webhooks from the WhatsApp Cloud API, media uploads/downloads, and phone number formatting. Discord’s adapter manages guilds, channels, mentions, and WebSocket connections. iMessage’s adapter is macOS-specific and requires special handling. But all adapters follow the same pattern: listen, translate, execute, send back.
Why This Design?
By isolating each channel into its own adapter, OpenClaw can support multiple platforms without duplicating logic. If you want to add Slack, you write a new adapter. The Gateway doesn’t change. Your agents don’t change. Only the new adapter needs to exist.
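One way to picture this design is a small registry that the Gateway dispatches through. The class name AdapterRegistry, the send(userId, text) signature, and the adapter objects below are hypothetical; they only illustrate why adding a platform does not touch the Gateway's sending code.

```javascript
// Hypothetical adapter registry; names are illustrative, not OpenClaw's API.
class AdapterRegistry {
  constructor() {
    this.adapters = new Map(); // channel name -> adapter object
  }

  register(adapter) {
    // Every adapter is assumed to expose `channel` and `send(userId, text)`.
    if (typeof adapter.send !== "function") {
      throw new TypeError(`adapter "${adapter.channel}" must implement send()`);
    }
    this.adapters.set(adapter.channel, adapter);
  }

  send(channel, userId, text) {
    const adapter = this.adapters.get(channel);
    if (!adapter) throw new Error(`no adapter registered for "${channel}"`);
    return adapter.send(userId, text);
  }
}

// Adding Slack only means registering one more adapter; send() is unchanged.
const registry = new AdapterRegistry();
const sent = [];
registry.register({ channel: "telegram", send: (u, t) => sent.push(`tg:${u}:${t}`) });
registry.register({ channel: "slack", send: (u, t) => sent.push(`slack:${u}:${t}`) });
registry.send("slack", "U123", "hello");
console.log(sent);
```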
Session Manager and Router
The Session Manager is logic within the Gateway that tracks and manages conversation sessions. A session is a stateful container for a single user’s conversation with a single agent.
What a Session Contains
Each session stores:
- Session ID (unique identifier)
- User ID and Channel (who is talking, and from where)
- Assigned Agent (which agent is handling this user)
- Message History (all prior messages and responses in this conversation)
- User Context (preferences, allowlists, any stored metadata)
- Timeout Marker (when this session expires if inactive)
- Session State (active, idle, terminated)
How the Router Works
When a message arrives, the Router looks at the sender’s user ID and channel. It constructs a lookup key: (channel, user_id). It checks the in-memory session store (or persistent storage) for an existing session with that key.
If found: The session is reused. Its timeout is reset. The message is added to the session’s history, and the whole session is sent to the agent.
If not found: A new session is created. An agent is assigned (either the default one or based on routing rules). The session is stored. The message is added to the empty history, and the session is sent to the agent.
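The lookup described above might look like the following minimal sketch. The method names getOrCreate and appendMessage match the calls mentioned later in this article, but their signatures, the session field names, the "pi" default agent label, and the in-memory Map are assumptions.

```javascript
const { randomUUID } = require("node:crypto");

const IDLE_TIMEOUT_MS = 15 * 60 * 1000; // the article's default of 15 minutes

// Hypothetical in-memory session store keyed by (channel, user_id).
class SessionManager {
  constructor(defaultAgent = "pi") {
    this.sessions = new Map(); // "channel:user_id" -> session
    this.defaultAgent = defaultAgent;
  }

  getOrCreate(channel, userId, now = Date.now()) {
    const key = `${channel}:${userId}`;
    let session = this.sessions.get(key);
    if (!session) {
      session = {
        id: `sess_${randomUUID()}`,
        channel,
        user_id: userId,
        agent: this.defaultAgent,
        messages: [],
        state: "active",
        expires_at: 0,
      };
      this.sessions.set(key, session);
    }
    session.expires_at = now + IDLE_TIMEOUT_MS; // reset the idle timeout
    return session;
  }

  appendMessage(session, role, content, now = Date.now()) {
    session.messages.push({ role, content, timestamp: now });
  }
}

const manager = new SessionManager();
const first = manager.getOrCreate("telegram", "7123566776", 1000);
manager.appendMessage(first, "user", "Hello", 1000);
const second = manager.getOrCreate("telegram", "7123566776", 2000);
console.log(first === second, second.messages.length, second.expires_at);
```

The second call returns the same object as the first, with the history intact and the expiry pushed forward.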
Session Isolation
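This is crucial: each user gets their own session, with its own history and context, so Alice’s session doesn’t leak into Bob’s. The isolation applies to session state rather than to the agent process: as the Agent Execution section explains, all sessions are served by the same agent process, so a crash in the agent affects every user until it restarts. Keeping conversation state separate per user is still a core security and reliability feature.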
Timeout and Cleanup
Sessions have a configurable idle timeout (default: 15 minutes). If no messages arrive during that period, the session is marked for cleanup. The cleanup process writes the final session state to persistent storage (for history/logging), then destroys the in-memory session object. Resources are freed.
If a user comes back after their session has timed out, a new session is created. They don’t see old history unless you’ve configured the system to restore sessions from disk.
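A cleanup pass along these lines could be sketched as follows. The function name sweepIdleSessions, the expires_at field, and the archive callback (standing in for the write to persistent storage) are hypothetical.

```javascript
// Hypothetical sweep: `sessions` is a Map of key -> { id, expires_at, state, ... }.
// `archive` stands in for the write to persistent storage.
function sweepIdleSessions(sessions, now, archive) {
  const removed = [];
  for (const [key, session] of sessions) {
    if (session.expires_at <= now) {
      session.state = "terminated";
      archive(session); // persist the final state first
      sessions.delete(key); // then free the in-memory object
      removed.push(session.id);
    }
  }
  return removed;
}

const sessions = new Map([
  ["telegram:1", { id: "sess_a", expires_at: 1000, state: "active" }],
  ["telegram:2", { id: "sess_b", expires_at: 5000, state: "active" }],
]);
const archived = [];
const removed = sweepIdleSessions(sessions, 2000, (s) => archived.push(s));
console.log(removed, sessions.size);
```

Archiving before deleting means a failure in the archive step leaves the session in memory rather than losing it.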
Local Agent Executor Service
The Agent Executor is the component that actually generates responses. It’s language-agnostic because the Gateway talks to it via HTTP.
How the Agent is Called
The Gateway makes an HTTP POST request to the agent with the current session’s state:
POST http://localhost:PORT/rpc/message
Content-Type: application/json

{
  "sessionId": "sess_abc123",
  "messages": [
    {"role": "user", "content": "What's 2+2?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "And 3+3?"}
  ],
  "stream": true
}
The agent receives this, processes it, and returns responses (either as a stream or a single response object, depending on configuration).
What the Agent Does Internally
This depends on the agent implementation. The bundled Pi agent (Node.js-based) does several things: it parses the messages, decides whether to call an LLM (language model) or execute a tool, manages context, and streams back tokens as they’re generated. A custom Python agent might do something entirely different.
The key point: the Gateway doesn’t care. It just sends messages and collects responses.
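To make the contract concrete, here is a minimal sketch of a custom agent built on Node's http module. It accepts the request shape shown above and returns a single JSON object rather than a stream. The reply shape ({ sessionId, message }), the echo behaviour, and the function names are assumptions; a real agent would call an LLM or tools where the sketch builds its reply.

```javascript
const http = require("node:http");

// Pure request handler: takes the parsed JSON body, returns the reply object.
// The reply shape ({ sessionId, message }) is an assumption.
function handleRpcMessage(body) {
  const lastUser = [...body.messages].reverse().find((m) => m.role === "user");
  const text = lastUser ? `You said: ${lastUser.content}` : "No user message found.";
  return { sessionId: body.sessionId, message: { role: "assistant", content: text } };
}

function createAgentServer() {
  return http.createServer((req, res) => {
    if (req.method !== "POST" || req.url !== "/rpc/message") {
      res.writeHead(404).end();
      return;
    }
    let raw = "";
    req.on("data", (chunk) => (raw += chunk));
    req.on("end", () => {
      try {
        const reply = handleRpcMessage(JSON.parse(raw));
        res.writeHead(200, { "Content-Type": "application/json" });
        res.end(JSON.stringify(reply));
      } catch (err) {
        // Malformed JSON or a missing `messages` array ends up here.
        res.writeHead(400, { "Content-Type": "application/json" });
        res.end(JSON.stringify({ error: "invalid request body" }));
      }
    });
  });
}

// To run it: createAgentServer().listen(9000, "127.0.0.1");
```

Keeping the reply logic in a plain function separate from the HTTP plumbing makes the agent easy to test without opening a port.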
Agent Execution
The Gateway sends each message to the same agent process running on your machine. The agent handles one request at a time (or in parallel, depending on the agent’s internal design). All users share the same agent—there’s no per-user agent isolation. If the agent crashes, all users lose access until it restarts.
Persistence and State Layer
OpenClaw needs to remember things even after the Gateway process restarts. The Persistence Layer handles this.
What Gets Persisted
Session data is written to disk (SQLite database or JSON files) at key moments:
- When a session is created
- After each message exchange (to preserve history)
- When a session is terminated (for archival)
The frequency of writes is configurable. You can write after every message (slower, safer) or batch writes (faster, riskier).
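The trade-off between the two modes can be sketched with a small write buffer. The class name BatchedWriter, the immediate option, and the writeBatch callback (standing in for the actual SQLite or file write) are hypothetical. structuredClone is a built-in global in Node.js 17 and later.

```javascript
// Hypothetical write buffer. `writeBatch` stands in for the real disk write.
class BatchedWriter {
  constructor(writeBatch, { immediate = false } = {}) {
    this.writeBatch = writeBatch;
    this.immediate = immediate; // true = write after every message
    this.pending = new Map(); // session id -> latest snapshot
  }

  save(session) {
    // Later snapshots of the same session replace earlier ones.
    this.pending.set(session.id, structuredClone(session));
    if (this.immediate) this.flush();
  }

  flush() {
    if (this.pending.size === 0) return 0;
    const batch = [...this.pending.values()];
    this.pending.clear();
    this.writeBatch(batch);
    return batch.length;
  }
}

const batches = [];
const writer = new BatchedWriter((batch) => batches.push(batch));
writer.save({ id: "sess_abc123", messages: ["a"] });
writer.save({ id: "sess_abc123", messages: ["a", "b"] });
writer.save({ id: "sess_def456", messages: [] });
const written = writer.flush();
console.log(written, batches.length);
```

In batch mode, anything still sitting in pending when the process dies is lost, which is the risk the paragraph above refers to.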
Persistent Storage Format
Sessions are typically stored in SQLite for efficiency and queryability. A session record looks roughly like:
{
  id: "sess_abc123",
  user_id: "7123566776",
  channel: "telegram",
  created_at: 1703275200,
  last_activity: 1703275500,
  messages: [
    {from: "user", content: "Hello", timestamp: 1703275200},
    {from: "assistant", content: "Hi there!", timestamp: 1703275210}
  ],
  metadata: {...}
}
Recovery on Restart
When the Gateway restarts, it can optionally restore sessions from persistent storage. Active sessions are rehydrated from disk. Users don’t lose context; they just experience a brief disconnect while the Gateway rebuilds state.
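For the file-based JSON option, rehydration could be sketched as below. The file layout (one JSON array of session records), the function names, and the rule of restoring only sessions still inside their idle window are assumptions. The record fields follow the example record above, with last_activity in Unix seconds.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helpers for the JSON-file storage mode.
function saveSessions(file, sessions) {
  fs.writeFileSync(file, JSON.stringify([...sessions.values()], null, 2));
}

function restoreSessions(file, nowMs, timeoutMs) {
  const restored = new Map();
  if (!fs.existsSync(file)) return restored; // first start: nothing to restore
  for (const s of JSON.parse(fs.readFileSync(file, "utf8"))) {
    // Only rehydrate sessions that were still within their idle window.
    if (nowMs - s.last_activity * 1000 < timeoutMs) {
      restored.set(`${s.channel}:${s.user_id}`, s);
    }
  }
  return restored;
}

// Demo in a throwaway temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "openclaw-sketch-"));
const file = path.join(dir, "sessions.json");
saveSessions(file, new Map([
  ["a", { id: "sess_abc123", channel: "telegram", user_id: "7123566776", last_activity: 1703275500, messages: [] }],
  ["b", { id: "sess_old", channel: "discord", user_id: "42", last_activity: 1703270000, messages: [] }],
]));
const restored = restoreSessions(file, 1703275800 * 1000, 15 * 60 * 1000);
console.log([...restored.keys()]);
```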
Communication Between Components
Understanding how components talk to each other is essential to understanding the architecture.
Gateway ↔ Channels (Webhook/Polling)
Channels listen for incoming messages from external platforms (Telegram API, WhatsApp API, etc.). When a message arrives, the channel passes it to the Gateway’s message handler. This is either:
- Webhook mode: The messaging platform calls the Gateway’s webhook endpoint. The channel handler processes the request and passes the standardized message to the Gateway’s router.
- Polling mode: The channel adapter periodically asks the messaging platform “any new messages?” and passes them to the Gateway if found.
For sending, the Gateway calls out to the channel: “Send this response to this user on this platform.” The channel formats the response and calls the platform’s API.
Gateway ↔ Agent Executor (HTTP RPC)
The Gateway makes HTTP POST requests to a single agent endpoint (usually localhost). The agent runs on the same machine as the Gateway.
The agent responds with the generated message (and any metadata about tool calls, reasoning, etc.). The Gateway collects the response and sends it back to the originating channel.
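From the Gateway's side, the call could be sketched like this, using the fetch global available in Node.js 18 and later. The port 9000 matches the example later in this article; the function names, the non-streaming mode, the 30-second timeout, and the stripping of extra fields from each message are assumptions.

```javascript
// Assumed endpoint; the article uses port 9000 in its end-to-end example.
const AGENT_URL = "http://localhost:9000/rpc/message";

// Builds the HTTP request for a session without sending it.
function buildAgentRequest(session) {
  return {
    url: AGENT_URL,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        sessionId: session.id,
        // Send only role and content; drop internal fields such as timestamps.
        messages: session.messages.map(({ role, content }) => ({ role, content })),
        stream: false, // non-streaming for simplicity
      }),
    },
  };
}

// Sends the request and returns the parsed JSON reply.
async function callAgent(session, timeoutMs = 30_000) {
  const { url, options } = buildAgentRequest(session);
  const res = await fetch(url, { ...options, signal: AbortSignal.timeout(timeoutMs) });
  if (!res.ok) throw new Error(`agent returned HTTP ${res.status}`);
  return res.json();
}
```

The timeout matters because a hung agent would otherwise stall every session that shares it.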
Gateway ↔ Session Manager (Internal Function Calls)
The Session Manager is part of the Gateway process, so communication is via normal JavaScript function calls. When the Gateway receives a message, it calls SessionManager.getOrCreate(). When it needs to save a message, it calls SessionManager.appendMessage(). These are synchronous or Promise-based async calls, not network requests.
Gateway ↔ Persistence Layer (Direct Reads/Writes)
The Persistence Layer (SQLite or file system) is accessed by the Gateway via Node.js database libraries. When the Gateway needs to save session history, it performs a write query. When it needs to restore a session, it performs a read query. No network involved—it’s direct file system access.
Cron Jobs and Scheduled Tasks
OpenClaw supports background scheduling in two ways.
Agent-Level Cron Jobs
Agents can schedule periodic tasks. For example, Pi (the bundled agent) can be told: “Every morning at 9 AM, send a daily summary to the user.” This is implemented as part of the agent logic. The agent has its own scheduler (or uses a library like node-cron for Node.js agents, or APScheduler for Python agents).
When it’s time for the task to run, the agent generates a message proactively. This message goes to the Gateway’s message handler as if it came from a user, but with a special flag indicating it’s system-generated. The message is routed to the appropriate channel and sent to the user.
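For a Node.js agent that does not pull in a cron library, the core of a "daily at 9 AM" schedule is computing the delay until the next occurrence. The helper below is a sketch under that assumption; the name msUntilNext is hypothetical, and it works in the machine's local time zone.

```javascript
// Milliseconds from `now` until the next occurrence of hour:minute, local time.
function msUntilNext(hour, minute, now = new Date()) {
  const next = new Date(now);
  next.setHours(hour, minute, 0, 0);
  if (next <= now) next.setDate(next.getDate() + 1); // already passed today
  return next.getTime() - now.getTime();
}

// Usage sketch (sendDailySummary is a hypothetical function):
//   setTimeout(sendDailySummary, msUntilNext(9, 0));

const beforeNine = new Date(2023, 11, 22, 8, 30); // 22 Dec 2023, 08:30 local
const afterNine = new Date(2023, 11, 22, 10, 30); // 22 Dec 2023, 10:30 local
console.log(msUntilNext(9, 0, beforeNine) / 60000, "minutes");
console.log(msUntilNext(9, 0, afterNine) / 3600000, "hours");
```

Recomputing the delay after each run, rather than using a fixed 24-hour interval, keeps the task aligned with wall-clock time across daylight-saving changes.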
Gateway-Level Cron Jobs
The Gateway can also schedule tasks. This is useful for:
- Session cleanup (every 5 minutes, check for timed-out sessions and clean them up)
- Persistence writes (batch-write accumulated session updates to disk every 30 seconds)
- Health checks (ping the agent every minute to ensure it’s alive)
These are implemented using Node.js timer functions or a cron library. The Gateway has a scheduler component that runs these tasks on a fixed interval.
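With plain Node.js timers, such a scheduler could be as small as the sketch below. The function name startScheduler and the task shape ({ name, everyMs, run }) are hypothetical; the intervals mirror the list above, and the task bodies are left empty. The sketch only handles synchronous tasks.

```javascript
// Hypothetical scheduler: each task is { name, everyMs, run }.
function startScheduler(tasks, onError = console.error) {
  const timers = tasks.map(({ name, everyMs, run }) => {
    const timer = setInterval(() => {
      try {
        run();
      } catch (err) {
        // One failing task must not take the others down.
        onError(`scheduled task "${name}" failed:`, err);
      }
    }, everyMs);
    timer.unref(); // housekeeping alone should not keep the process alive
    return timer;
  });
  return () => timers.forEach((t) => clearInterval(t)); // call to stop all tasks
}

const stop = startScheduler([
  { name: "session-cleanup", everyMs: 5 * 60 * 1000, run: () => {} },
  { name: "persistence-flush", everyMs: 30 * 1000, run: () => {} },
  { name: "agent-health-check", everyMs: 60 * 1000, run: () => {} },
]);
stop(); // e.g. during graceful shutdown
```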
Systemd/Service Management
In production, the Gateway itself runs as a systemd service (on Linux) or an equivalent service manager (macOS, Windows). The service configuration ensures the Gateway starts on boot, restarts if it crashes, and logs output for debugging.
A typical systemd service file looks like:
[Service]
Type=simple
ExecStart=/usr/bin/node /path/to/gateway/index.js
Restart=always
RestartSec=10
This says: “Start the Gateway process. If it crashes, wait 10 seconds and restart it automatically.”
Example: Daily Summary Cron Job
Here’s how a daily summary cron job might work end-to-end:
- Agent is configured with a cron expression: “every day at 9:00 AM”
- Agent’s internal scheduler watches for this time
- At 9 AM, the agent generates a summary of the day’s activities
- The agent calls the Gateway’s message API: “Send this summary to user 7123566776 on Telegram”
- The Gateway creates a session entry for this message
- The Gateway routes the message to the Telegram Channel
- The Telegram Channel formats it and sends it to Telegram’s API
- Telegram delivers the message to the user
All of this happens automatically, without user input. The cron job is transparent to the user—they just receive the message at the scheduled time.
Data Flow: A Complete Example
Let’s trace a complete message through the system to see how all components interact.
Scenario: It’s 10:30 AM. A user sends a message in Telegram: “Generate a report.”
Step 1: Message Ingestion (Telegram Channel)
Telegram API sends a webhook to the Gateway:
Incoming Telegram Webhook
User ID: 7123566776
Message Text: "Generate a report"
Chat ID: 7123566776
Timestamp: 1703275200
The Telegram Channel Adapter receives this webhook, extracts the essential data, and creates an internal message object:
{
  channel: "telegram",
  user_id: "7123566776",
  text: "Generate a report",
  media: null,
  timestamp: 1703275200
}
The adapter passes this to the Gateway’s message handler.
Step 2: Session Routing (Gateway + Session Manager)
The Gateway asks the Session Manager: “Is there a session for (telegram, 7123566776)?”
The Session Manager checks its in-memory store. In this case, yes—a session exists with ID sess_abc123. The session has history from previous messages. The Session Manager retrieves the session, resets its timeout, and appends the new message to its history:
Session sess_abc123 now contains:
Message history:
[earlier messages...]
User (10:20 AM): "Hi there"
Assistant (10:21 AM): "Hello! How can I help?"
User (10:30 AM): "Generate a report"
Step 3: Agent Execution (Gateway ↔ Agent)
The Gateway makes an HTTP POST request to the agent:
POST http://localhost:9000/rpc/message
{
  "sessionId": "sess_abc123",
  "messages": [...all history...],
  "stream": true
}
The agent receives this, processes it with its LLM, decides it needs to gather data, calls tools (like fetching files, querying databases), and generates a report. It streams back the response token-by-token.
The Gateway collects the streamed response:
"I've generated your daily report. Here are the key metrics..."
[continues streaming]
Step 4: Response Routing (Gateway ↔ Telegram Channel)
Once the agent finishes, the Gateway has the complete response. It calls the Telegram Channel adapter:
Channel.send({
  user_id: "7123566776",
  message: "I've generated your daily report...",
  parse_mode: "markdown"
})
The Telegram Channel adapter formats this for Telegram’s API and makes an HTTP call to Telegram:
POST https://api.telegram.org/botTOKEN/sendMessage
{
  "chat_id": 7123566776,
  "text": "I've generated your daily report...",
  "parse_mode": "markdown"
}
Telegram receives this and delivers the message to the user.
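A long report can exceed Telegram's 4096-character limit for a single sendMessage text, which is one of the platform constraints the adapter has to respect. The helper below is a sketch of how an adapter might split a response before sending; the name splitForTelegram and the break-at-whitespace strategy are assumptions. It does not try to keep Markdown formatting balanced across chunks, and a hard split can land in the middle of a multi-unit character such as an emoji.

```javascript
const TELEGRAM_MAX_LENGTH = 4096; // sendMessage text limit per Telegram's Bot API

// Hypothetical helper: splits text into chunks no longer than `limit`.
function splitForTelegram(text, limit = TELEGRAM_MAX_LENGTH) {
  const chunks = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer to break at the last newline or space inside the limit.
    let cut = Math.max(rest.lastIndexOf("\n", limit), rest.lastIndexOf(" ", limit));
    if (cut <= 0) cut = limit; // no usable break point: hard split
    const head = rest.slice(0, cut).trimEnd();
    if (head) chunks.push(head); // skip chunks that were only whitespace
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

console.log(splitForTelegram("hello world", 5));
console.log(splitForTelegram("a".repeat(10), 4).map((c) => c.length));
```

Each chunk would then go out as its own sendMessage call, in order.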
Step 5: State Persistence (Gateway ↔ Persistence)
The Gateway updates the session in persistent storage:
SQLite write (pseudo-SQL):
UPDATE sessions
SET messages = [..., {from: "assistant", content: "I've generated..."}]
WHERE id = 'sess_abc123'
The message and response are now part of permanent history. If the Gateway crashes and restarts, and the user then asks a follow-up question, the agent can still reference the report they requested.
Step 6: Timeout Reset (Session Manager)
The Session Manager resets the session’s timeout to 15 minutes from now. If another message arrives within 15 minutes, the session is reused. If 15 minutes pass with no activity, the session is cleaned up (but its history remains in persistent storage).
Architecture Strengths and Trade-offs
Why This Architecture Works
The component-based design is simple and clear:
Each component has a single responsibility. The Gateway routes. Channels translate. The Agent responds. This separation makes the system easy to understand and modify.
Everything runs on a single machine. The Gateway and channels run as one process. The agent runs as a separate process on the same computer. This keeps setup simple—no network configuration, no distributed system complexity. You just start the Gateway and the agent, and they talk to each other via localhost HTTP.
The system is language-flexible. You can use Node.js for the Gateway and channels, Python for a custom agent, or any other language—as long as the agent exposes an HTTP endpoint.
Trade-offs
Running everything on your hardware means you bear the cost. The Gateway uses memory for session storage, the agent uses CPU. On a Raspberry Pi, this is tight. On a VPS, it’s fine. On a personal laptop, you’re limited to a handful of concurrent users.
The HTTP-based agent communication introduces latency compared to in-process calls. A round-trip to an agent takes milliseconds. For most use cases, this is fine. For ultra-low-latency scenarios, it’s a constraint.
Persistence is eventually consistent. If the Gateway crashes mid-write, or before a batched write is flushed, you might lose a few messages. For most users, this is acceptable. For compliance-heavy scenarios, you might need more robust persistence (with transaction logs, etc.).
Takeaways
OpenClaw is a single-machine system designed for clarity and simplicity:
- The Gateway is Node.js: manages all channels, routes messages, handles sessions.
- Channels are Node.js adapters: translate between messaging platforms and OpenClaw’s internal format.
- The Agent is language-agnostic: communicates with the Gateway via HTTP, runs on the same machine.
- Sessions are stateful: remember conversation history within a single machine.
- Persistence is optional: save what you need to disk, restore on demand.
OpenClaw runs on one computer—your laptop, a Raspberry Pi, or a VPS. Everything communicates via localhost. There’s no distributed clustering, no load balancing, no failover. You get simplicity and control instead of scale.
The next time you send a message to OpenClaw, you now know exactly what’s happening behind the scenes: a Gateway routing, Channels translating, an Agent thinking, Sessions remembering, and Persistence saving it all—all on the same machine.
Next up: Building a custom agent and deploying it alongside the Gateway.