Technical Deep Dive

The Architecture of Clawdbot: Building Personal AI Infrastructure

A technical deep-dive into the engineering decisions behind an open-source personal AI assistant that runs locally and speaks 29+ messaging protocols.

Casey

The Problem With Cloud AI Assistants

Every major AI assistant follows the same playbook: your messages go to a cloud server, get processed, and responses come back. You're a tenant in someone else's infrastructure, subject to their rate limits, their privacy policies, their uptime.

Clawdbot inverts this model entirely. It's a local-first gateway that runs on your machine, connects to your messaging accounts, and routes conversations to AI agents under your control. The architecture treats messaging platforms as interchangeable protocols and AI models as swappable backends—what remains constant is your infrastructure, your data, your rules.

This isn't a weekend hack. The codebase spans 40,000+ lines of TypeScript, supports 29 messaging channels through a plugin system, and implements patterns you rarely see outside distributed systems: lane-based concurrency, cascading route resolution, cross-channel identity linking, and human-in-the-loop approval gating.

Let's examine what makes it work.


1. Lane-Based Concurrency: Preventing Starvation by Design

Many async systems funnel all work through a single priority queue: high-priority tasks run first and low-priority tasks wait. The problem is that a sustained burst of medium-priority work can starve everything below it indefinitely.

Clawdbot takes a different approach. Work is partitioned into lanes—orthogonal queues that operate independently with separate concurrency limits.

From src/process/lanes.ts:

export const enum CommandLane {
  Main = "main",        // Primary chat workflow
  Cron = "cron",        // Scheduled jobs
  Subagent = "subagent", // Child agent spawning
  Nested = "nested",    // Nested tool calls
}

Each lane maintains its own queue and active count. The implementation in src/process/command-queue.ts is elegant:

type LaneState = {
  lane: string;
  queue: QueueEntry[];
  active: number;
  maxConcurrent: number;
  draining: boolean;
};
 
const lanes = new Map<string, LaneState>();
 
export function enqueueCommandInLane<T>(
  lane: string,
  task: () => Promise<T>,
  opts?: { warnAfterMs?: number; onWait?: (waitMs: number, queuedAhead: number) => void },
): Promise<T> {
  const state = getLaneState(lane);
  return new Promise<T>((resolve, reject) => {
    state.queue.push({
      task: () => task(),
      resolve: (value) => resolve(value as T),
      reject,
      enqueuedAt: Date.now(),
      warnAfterMs: opts?.warnAfterMs ?? 2_000,
      onWait: opts?.onWait,
    });
    drainLane(lane);
  });
}

The drain function starts queued tasks until the lane hits its concurrency limit or the queue is empty, and pumps again each time a task settles:

function drainLane(lane: string) {
  const state = getLaneState(lane);
  if (state.draining) return;
  state.draining = true;
 
  const pump = () => {
    while (state.active < state.maxConcurrent && state.queue.length > 0) {
      const entry = state.queue.shift()!;
      state.active += 1;
 
      void (async () => {
        try {
          const result = await entry.task();
          state.active -= 1;
          pump();  // Recursive drain
          entry.resolve(result);
        } catch (err) {
          state.active -= 1;
          pump();
          entry.reject(err);
        }
      })();
    }
    state.draining = false;
  };
  pump();
}

The gateway configures these limits from user settings in src/gateway/server-lanes.ts:

export function applyGatewayLaneConcurrency(cfg: ReturnType<typeof loadConfig>) {
  setCommandLaneConcurrency(CommandLane.Cron, cfg.cron?.maxConcurrentRuns ?? 1);
  setCommandLaneConcurrency(CommandLane.Main, resolveAgentMaxConcurrent(cfg));
  setCommandLaneConcurrency(CommandLane.Subagent, resolveSubagentMaxConcurrent(cfg));
}

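A scheduled email digest running in the Cron lane cannot block incoming WhatsApp messages in the Main lane, and subagent spawning has its own budget. Lanes don't compete for slots; each one drains against its own limit. Starvation across lanes is ruled out by construction rather than by tuning priorities, although tasks within a single lane still wait their turn in FIFO order.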
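To make the isolation property concrete, here is a minimal, self-contained sketch of lane-partitioned queues. It is not Clawdbot's code: `demoLanes`, `getLane`, `settle`, and `enqueueInLane` are illustrative names, and every lane defaults to a concurrency of 1 for the sake of the demo.

```typescript
// Sketch only: a simplified lane queue, not the project's implementation.
type Task<T> = () => Promise<T>;
type Settled<T> = { ok: true; value: T } | { ok: false; error: unknown };
type Lane = { queue: Array<() => void>; active: number; maxConcurrent: number };

const demoLanes = new Map<string, Lane>();

// Look up a lane, creating it on first use (hypothetical helper).
function getLane(name: string, maxConcurrent = 1): Lane {
  let lane = demoLanes.get(name);
  if (!lane) {
    lane = { queue: [], active: 0, maxConcurrent };
    demoLanes.set(name, lane);
  }
  return lane;
}

// Run a task and capture its outcome without throwing.
async function settle<T>(task: Task<T>): Promise<Settled<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (error) {
    return { ok: false, error };
  }
}

// Start queued work while this lane has free slots.
function pump(lane: Lane): void {
  while (lane.active < lane.maxConcurrent && lane.queue.length > 0) {
    const start = lane.queue.shift()!;
    lane.active += 1;
    start();
  }
}

function enqueueInLane<T>(name: string, task: Task<T>): Promise<T> {
  const lane = getLane(name);
  return new Promise<T>((resolve, reject) => {
    lane.queue.push(() => {
      void settle(task).then((outcome) => {
        lane.active -= 1; // free the slot first
        pump(lane);       // then start the next task in this lane only
        if (outcome.ok) resolve(outcome.value);
        else reject(outcome.error);
      });
    });
    pump(lane);
  });
}

// Demo: a cron job that is stuck until released does not delay the main lane.
const order: string[] = [];
let releaseCron: () => void = () => {};
const cronFirst = enqueueInLane("cron", () =>
  new Promise<void>((resolve) => {
    releaseCron = () => resolve();
  }).then(() => {
    order.push("cron");
  }),
);
const cronSecond = enqueueInLane("cron", async () => {
  order.push("cron-2");
});
const mainDone = enqueueInLane("main", async () => {
  order.push("main");
});
const demo = mainDone
  .then(() => {
    releaseCron();
    return cronFirst;
  })
  .then(() => cronSecond)
  .then(() => order);
```

The main-lane task completes while the first cron task is still blocked, and the second cron task only starts once the first one releases its slot.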


2. The Channel Plugin System: Protocol as Commodity

Clawdbot supports WhatsApp, Telegram, Signal, Discord, Slack, iMessage, Matrix, Microsoft Teams, LINE, Nostr, Google Chat, Twitch, Mattermost, and more. Each platform has radically different APIs, authentication flows, message formats, and capabilities.

Rather than building monolithic handlers, the codebase defines a plugin contract that normalizes all channels. From src/channels/plugins/types.plugin.ts:

export type ChannelPlugin<ResolvedAccount = any> = {
  id: ChannelId;
  meta: ChannelMeta;
  capabilities: ChannelCapabilities;
 
  // Authentication & setup
  config: ChannelConfigAdapter<ResolvedAccount>;
  setup?: ChannelSetupAdapter;
  auth?: ChannelAuthAdapter;
 
  // Security policies
  pairing?: ChannelPairingAdapter;
  security?: ChannelSecurityAdapter<ResolvedAccount>;
 
  // Messaging primitives
  outbound?: ChannelOutboundAdapter;
  messaging?: ChannelMessagingAdapter;
  streaming?: ChannelStreamingAdapter;
  threading?: ChannelThreadingAdapter;
  actions?: ChannelMessageActionAdapter;
 
  // Gateway integration
  gateway?: ChannelGatewayAdapter<ResolvedAccount>;
 
  // Agent tools (channel-specific capabilities)
  agentTools?: ChannelAgentToolFactory | ChannelAgentTool[];
};

Apart from the required id, meta, capabilities, and config fields, every adapter is optional, so a channel implements only what it supports. The capabilities declaration tells the system what's available:

export type ChannelCapabilities = {
  chatTypes: Array<NormalizedChatType | "thread">;
  polls?: boolean;
  reactions?: boolean;
  edit?: boolean;
  unsend?: boolean;
  reply?: boolean;
  threads?: boolean;
  media?: boolean;
  blockStreaming?: boolean;
};
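One way calling code might consume these flags is to pick a delivery strategy per channel. The sketch below is an assumption about usage, not code from the repository: `Capabilities` is a hypothetical subset of the type above, and `planAcknowledgement` is an invented helper.

```typescript
// Hypothetical subset of the capability flags; not the project's full type.
type Capabilities = { reactions?: boolean; edit?: boolean; threads?: boolean };

type Acknowledgement =
  | { kind: "reaction"; emoji: string }
  | { kind: "text"; body: string };

// Use a native reaction where the channel declares support,
// otherwise fall back to sending the emoji as a plain message.
function planAcknowledgement(caps: Capabilities, emoji: string): Acknowledgement {
  return caps.reactions
    ? { kind: "reaction", emoji }
    : { kind: "text", body: emoji };
}
```

Because absent flags read as `undefined`, a channel that declares nothing gets the most conservative behavior by default.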

Adding a new channel requires minimal boilerplate. Here's the complete Matrix extension from extensions/matrix/index.ts:

import type { ClawdbotPluginApi } from "clawdbot/plugin-sdk";
import { emptyPluginConfigSchema } from "clawdbot/plugin-sdk";
import { matrixPlugin } from "./src/channel.js";
import { setMatrixRuntime } from "./src/runtime.js";
 
const plugin = {
  id: "matrix",
  name: "Matrix",
  description: "Matrix channel plugin (matrix-js-sdk)",
  configSchema: emptyPluginConfigSchema(),
  register(api: ClawdbotPluginApi) {
    setMatrixRuntime(api.runtime);
    api.registerChannel({ plugin: matrixPlugin });
  },
};
 
export default plugin;

The plugin SDK exports 100+ types and utilities, including channel-specific helpers for normalizing targets, resolving accounts, and handling onboarding—patterns extracted from production channels that extension authors can reuse.

Messaging protocols become commodities. The gateway doesn't care if a message came from WhatsApp or Matrix or Nostr. It cares about the normalized event, the resolved route, and the agent that should handle it.


3. The Routing Cascade: From Message to Agent

When a message arrives, Clawdbot must decide which agent handles it. The routing system implements a cascade of matching strategies with precise precedence.

From src/routing/resolve-route.ts:

export type ResolvedAgentRoute = {
  agentId: string;
  channel: string;
  accountId: string;
  sessionKey: string;
  mainSessionKey: string;
  matchedBy:
    | "binding.peer"      // Specific sender matched
    | "binding.guild"     // Discord server matched
    | "binding.team"      // MS Teams team matched
    | "binding.account"   // Account-level binding
    | "binding.channel"   // Channel-wide wildcard
    | "default";          // Fallback agent
};

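The resolution function filters bindings by channel and account, then applies matches in order, most specific first. The excerpt is abridged: it omits the definitions of channel, accountId, peer, guildId, teamId, and the choose helper.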

export function resolveAgentRoute(input: ResolveAgentRouteInput): ResolvedAgentRoute {
  const bindings = listBindings(input.cfg).filter((binding) => {
    if (!matchesChannel(binding.match, channel)) return false;
    return matchesAccountId(binding.match?.accountId, accountId);
  });
 
  // 1. Peer match (most specific)
  if (peer) {
    const peerMatch = bindings.find((b) => matchesPeer(b.match, peer));
    if (peerMatch) return choose(peerMatch.agentId, "binding.peer");
  }
 
  // 2. Guild match (Discord servers)
  if (guildId) {
    const guildMatch = bindings.find((b) => matchesGuild(b.match, guildId));
    if (guildMatch) return choose(guildMatch.agentId, "binding.guild");
  }
 
  // 3. Team match (MS Teams)
  if (teamId) {
    const teamMatch = bindings.find((b) => matchesTeam(b.match, teamId));
    if (teamMatch) return choose(teamMatch.agentId, "binding.team");
  }
 
  // 4. Account-level fallback
  const accountMatch = bindings.find((b) =>
    b.match?.accountId?.trim() !== "*" &&
    !b.match?.peer && !b.match?.guildId && !b.match?.teamId
  );
  if (accountMatch) return choose(accountMatch.agentId, "binding.account");
 
  // 5. Channel wildcard
  const anyAccountMatch = bindings.find((b) =>
    b.match?.accountId?.trim() === "*" &&
    !b.match?.peer && !b.match?.guildId && !b.match?.teamId
  );
  if (anyAccountMatch) return choose(anyAccountMatch.agentId, "binding.channel");
 
  // 6. Default
  return choose(resolveDefaultAgentId(input.cfg), "default");
}
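The precedence is easier to see with concrete bindings. The following is a simplified sketch of the same cascade under stated assumptions: peers are plain strings, account matching is exact-or-wildcard, and `Binding`, `Incoming`, and `resolveAgent` are illustrative names rather than the project's types.

```typescript
// Simplified model of the cascade; shapes and matching rules are assumptions.
type Match = { channel: string; accountId?: string; peer?: string; guildId?: string; teamId?: string };
type Binding = { agentId: string; match: Match };
type Incoming = { channel: string; accountId: string; peer?: string; guildId?: string; teamId?: string };

function resolveAgent(
  bindings: Binding[],
  msg: Incoming,
  defaultAgent: string,
): { agentId: string; matchedBy: string } {
  // Pre-filter: same channel, and an account that matches exactly or via "*".
  const scoped = bindings.filter(
    (b) =>
      b.match.channel === msg.channel &&
      (b.match.accountId === undefined ||
        b.match.accountId === "*" ||
        b.match.accountId === msg.accountId),
  );
  // "Broad" bindings name no peer, guild, or team.
  const broad = (m: Match) => !m.peer && !m.guildId && !m.teamId;

  // Ordered from most to least specific; the first hit wins.
  const steps: Array<[string, (m: Match) => boolean]> = [
    ["binding.peer", (m) => msg.peer !== undefined && m.peer === msg.peer],
    ["binding.guild", (m) => msg.guildId !== undefined && m.guildId === msg.guildId],
    ["binding.team", (m) => msg.teamId !== undefined && m.teamId === msg.teamId],
    ["binding.account", (m) => broad(m) && m.accountId !== "*"],
    ["binding.channel", (m) => broad(m) && m.accountId === "*"],
  ];
  for (const [matchedBy, test] of steps) {
    const hit = scoped.find((b) => test(b.match));
    if (hit) return { agentId: hit.agentId, matchedBy };
  }
  return { agentId: defaultAgent, matchedBy: "default" };
}

// Example configuration (all identifiers are made up).
const exampleBindings: Binding[] = [
  { agentId: "family", match: { channel: "whatsapp", accountId: "personal", peer: "+15551234567" } },
  { agentId: "community", match: { channel: "discord", accountId: "*", guildId: "guild-1" } },
  { agentId: "personal", match: { channel: "whatsapp", accountId: "personal" } },
  { agentId: "catchall", match: { channel: "whatsapp", accountId: "*" } },
];
```

With these bindings, a WhatsApp message from +15551234567 on the personal account resolves to `family`, any other sender on that account resolves to `personal`, a sender on a different WhatsApp account falls through to `catchall`, and a Telegram message lands on the default agent.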

Session Keys and Identity Linking

Session continuity is managed through structured keys. From src/routing/session-key.ts:

export function buildAgentPeerSessionKey(params: {
  agentId: string;
  channel: string;
  peerKind?: "dm" | "group" | "channel" | null;
  peerId?: string | null;
  identityLinks?: Record<string, string[]>;
  dmScope?: "main" | "per-peer" | "per-channel-peer";
}): string {
  const peerKind = params.peerKind ?? "dm";
 
  if (peerKind === "dm") {
    const dmScope = params.dmScope ?? "main";
    let peerId = (params.peerId ?? "").trim();
 
    // Resolve cross-channel identity links
    const linkedPeerId = dmScope === "main" ? null : resolveLinkedPeerId({
      identityLinks: params.identityLinks,
      channel: params.channel,
      peerId,
    });
    if (linkedPeerId) peerId = linkedPeerId;
 
    if (dmScope === "per-channel-peer" && peerId) {
      return `agent:${normalizeAgentId(params.agentId)}:${channel}:dm:${peerId}`;
    }
    if (dmScope === "per-peer" && peerId) {
      return `agent:${normalizeAgentId(params.agentId)}:dm:${peerId}`;
    }
    return buildAgentMainSessionKey({ agentId: params.agentId });
  }
 
  return `agent:${normalizeAgentId(params.agentId)}:${channel}:${peerKind}:${peerId}`;
}

The identityLinks configuration allows mapping a single person across channels:

session:
  dmScope: per-peer
  identityLinks:
    alice:
      - "whatsapp:+15551234567"
      - "telegram:alice_smith"
      - "signal:+15551234567"

Now conversations with Alice share context whether she messages via WhatsApp, Telegram, or Signal.

The session key is a canonical address for conversation state. By structuring it as agent:{id}:{scope}:{peer}, the system achieves both isolation (different agents, different contexts) and continuity (same person across channels).
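A small sketch shows how the three DM scopes and identity links interact. Two things here are assumptions: the lookup in `resolveLinked` (the article does not show `resolveLinkedPeerId`) and the `agent:<id>:main` format for the main-scope key (the output of `buildAgentMainSessionKey` is not shown either).

```typescript
type DmScope = "main" | "per-peer" | "per-channel-peer";
type IdentityLinks = Record<string, string[]>;

// Assumed lookup: return the canonical name whose alias list contains "channel:peerId".
function resolveLinked(links: IdentityLinks | undefined, channel: string, peerId: string): string | null {
  if (!links) return null;
  const needle = `${channel}:${peerId}`;
  for (const canonical of Object.keys(links)) {
    if (links[canonical].indexOf(needle) !== -1) return canonical;
  }
  return null;
}

// Simplified DM-only key builder mirroring the branches in the excerpt above.
function dmSessionKey(
  agentId: string,
  channel: string,
  peerId: string,
  scope: DmScope,
  links?: IdentityLinks,
): string {
  if (scope === "main") return `agent:${agentId}:main`; // assumed format
  const id = resolveLinked(links, channel, peerId) ?? peerId;
  return scope === "per-channel-peer"
    ? `agent:${agentId}:${channel}:dm:${id}`
    : `agent:${agentId}:dm:${id}`;
}

const links: IdentityLinks = {
  alice: ["whatsapp:+15551234567", "telegram:alice_smith", "signal:+15551234567"],
};
```

Under `per-peer`, Alice's WhatsApp and Telegram messages both map to `agent:main:dm:alice`, so they share one session. Under `per-channel-peer`, the channel stays in the key and the sessions remain separate.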


4. The Gateway: 84 Methods for Personal Infrastructure

The gateway server exposes 84+ RPC methods over WebSocket. From src/gateway/server-methods-list.ts:

const BASE_METHODS = [
  // Health & status
  "health", "status", "channels.status",
 
  // Configuration
  "config.get", "config.set", "config.apply", "config.patch",
 
  // Execution approval
  "exec.approval.request", "exec.approval.resolve",
 
  // Sessions
  "sessions.list", "sessions.preview", "sessions.reset", "sessions.compact",
 
  // Agents
  "agent", "agents.list", "agent.identity.get", "agent.wait",
 
  // Nodes (mobile devices)
  "node.pair.request", "node.pair.approve", "node.list", "node.invoke",
 
  // Cron
  "cron.list", "cron.add", "cron.run", "cron.runs",
 
  // Models & TTS
  "models.list", "tts.providers", "tts.convert",
 
  // Messaging
  "send", "chat.send", "chat.history", "chat.abort",
  // ...
];

It also defines 11 event types for real-time updates:

export const GATEWAY_EVENTS = [
  "connect.challenge",
  "agent",
  "chat",
  "presence",
  "shutdown",
  "exec.approval.requested",
  "exec.approval.resolved",
  "node.pair.requested",
  "voicewake.changed",
  // ...
];

This is the API surface of personal infrastructure. Mobile apps, desktop clients, and CLI tools all speak this protocol. The gateway maintains channel connections, executes scheduled tasks, manages approval workflows, and coordinates across devices—all locally.
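The dotted method names suggest a namespace-per-concern layout. As a rough illustration, the helper below groups a handful of the names listed above by their first segment. `sampleMethods` is only a subset copied from the excerpt, and `groupByNamespace` is a hypothetical utility, not part of the gateway.

```typescript
// A subset of the method names from the excerpt; the full list is longer.
const sampleMethods = [
  "health", "status", "channels.status",
  "config.get", "config.set",
  "exec.approval.request", "exec.approval.resolve",
  "sessions.list", "cron.list", "cron.add",
  "chat.send", "chat.history",
];

// Group names by the text before the first dot; undotted names go under "(root)".
function groupByNamespace(methods: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const method of methods) {
    const dot = method.indexOf(".");
    const namespace = dot === -1 ? "(root)" : method.slice(0, dot);
    const bucket = groups.get(namespace) ?? [];
    bucket.push(method);
    groups.set(namespace, bucket);
  }
  return groups;
}
```

A client could use a grouping like this to build a help screen or to scope permissions per namespace.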


5. Execution Approval: Human-in-the-Loop by Default

When an AI agent wants to run a shell command or modify files, the system gates dangerous operations through human approval. From src/gateway/exec-approval-manager.ts:

export type ExecApprovalRequestPayload = {
  command: string;
  cwd?: string | null;
  host?: string | null;
  security?: string | null;
  ask?: string | null;
  agentId?: string | null;
  sessionKey?: string | null;
};
 
export class ExecApprovalManager {
  private pending = new Map<string, PendingEntry>();
 
  async waitForDecision(
    record: ExecApprovalRecord,
    timeoutMs: number,
  ): Promise<ExecApprovalDecision | null> {
    return new Promise<ExecApprovalDecision | null>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(record.id);
        resolve(null);  // Timeout = denied
      }, timeoutMs);
 
      this.pending.set(record.id, { record, resolve, reject, timer });
    });
  }
 
  resolve(
    recordId: string,
    decision: ExecApprovalDecision,
    resolvedBy?: string | null
  ): boolean {
    const pending = this.pending.get(recordId);
    if (!pending) return false;
 
    clearTimeout(pending.timer);
    pending.record.decision = decision;
    pending.record.resolvedBy = resolvedBy ?? null;
    this.pending.delete(recordId);
    pending.resolve(decision);
    return true;
  }
}

Approval requests propagate to connected nodes (iOS/Android apps) via the exec.approval.requested event. You get a push notification: "Agent 'work' wants to run git push origin main"—with full context about which session triggered it.

AI agents operate in a trust hierarchy. The human remains the final authority for consequential actions, but the system handles the mechanics of request routing, timeout enforcement, and decision propagation.
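The caller's side of this handshake can be sketched as follows. This is a reduced stand-in for the manager above, not the project's code: `Decision` is a hypothetical two-value type (the real `ExecApprovalDecision` is not shown), and `ApprovalGate` and `runIfApproved` are invented names.

```typescript
type Decision = "allow" | "deny"; // hypothetical; the real decision type is not shown

type PendingApproval = {
  resolve: (decision: Decision | null) => void;
  timer: ReturnType<typeof setTimeout>;
};

class ApprovalGate {
  private pending = new Map<string, PendingApproval>();

  // Resolves with the human's decision, or null if nobody answers in time.
  wait(id: string, timeoutMs: number): Promise<Decision | null> {
    return new Promise<Decision | null>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        resolve(null);
      }, timeoutMs);
      this.pending.set(id, { resolve, timer });
    });
  }

  // Returns false when the request is unknown, already decided, or timed out.
  decide(id: string, decision: Decision): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(id);
    entry.resolve(decision);
    return true;
  }
}

// Fail closed: only an explicit "allow" runs the action.
// A timeout (null) and a "deny" are treated the same way.
async function runIfApproved(
  gate: ApprovalGate,
  id: string,
  timeoutMs: number,
  run: () => void,
): Promise<boolean> {
  const decision = await gate.wait(id, timeoutMs);
  if (decision !== "allow") return false;
  run();
  return true;
}
```

Treating the timeout as a denial keeps the default safe: an unattended phone never results in a command being executed.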


6. Media Handling: Cross-Platform Normalization

Each messaging platform has different media limits (WhatsApp: 16MB, Telegram: 50MB), format support, and URL handling. The media store in src/media/store.ts normalizes this:

const MAX_BYTES = 5 * 1024 * 1024;  // 5MB default
const DEFAULT_TTL_MS = 2 * 60 * 1000;  // 2 minutes
 
/**
 * Sanitize filename for cross-platform safety.
 * Removes chars unsafe on Windows/SharePoint/all platforms.
 */
function sanitizeFilename(name: string): string {
  const unsafe = /[<>:"/\\|?*\x00-\x1f]/g;  // includes backslash and pipe
  return name
    .trim()
    .replace(unsafe, "_")
    .replace(/\s+/g, "_")
    .replace(/_+/g, "_")
    .replace(/^_|_$/g, "")
    .slice(0, 60);
}
 
/**
 * Extract original filename from embedded UUID pattern.
 * {original}---{uuid}.{ext} → {original}.{ext}
 */
export function extractOriginalFilename(filePath: string): string {
  const basename = path.basename(filePath);
  if (!basename) return "file.bin";
 
  const ext = path.extname(basename);
  const nameWithoutExt = path.basename(basename, ext);
 
  const match = nameWithoutExt.match(
    /^(.+)---[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$/i
  );
 
  return match?.[1] ? `${match[1]}${ext}` : basename;
}
 
// Auto-cleanup expired media
export async function cleanOldMedia(ttlMs = DEFAULT_TTL_MS) {
  const mediaDir = await ensureMediaDir();
  const entries = await fs.readdir(mediaDir).catch(() => []);
  const now = Date.now();
 
  await Promise.all(entries.map(async (file) => {
    const full = path.join(mediaDir, file);
    const stat = await fs.stat(full).catch(() => null);
    if (stat && now - stat.mtimeMs > ttlMs) {
      await fs.rm(full).catch(() => {});
    }
  }));
}

Temporary files are garbage-collected automatically. The UUID-embedding pattern preserves original filenames while ensuring uniqueness. And the sanitization handles the intersection of what Windows, macOS, Linux, and various cloud services consider safe.
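A round trip through the naming scheme makes the pattern concrete. The sketch below re-implements the sanitizer with plain string operations so it runs without Node's `path` module. `storedName` is a hypothetical writer-side helper (the article shows only the reader side), and the UUID is a made-up example value.

```typescript
// Same replacement chain as the sanitizer above, with backslash in the unsafe set.
function sanitize(name: string): string {
  return name
    .trim()
    .replace(/[<>:"/\\|?*\x00-\x1f]/g, "_")
    .replace(/\s+/g, "_")
    .replace(/_+/g, "_")
    .replace(/^_|_$/g, "")
    .slice(0, 60);
}

// Split "name.ext" at the last dot; a leading dot is not treated as an extension.
function splitExt(fileName: string): { base: string; ext: string } {
  const dot = fileName.lastIndexOf(".");
  return dot > 0
    ? { base: fileName.slice(0, dot), ext: fileName.slice(dot) }
    : { base: fileName, ext: "" };
}

// Hypothetical writer side of the {original}---{uuid}.{ext} convention.
function storedName(original: string, uuid: string): string {
  const { base, ext } = splitExt(original);
  return `${sanitize(base)}---${uuid}${ext}`;
}

// Reader side: strip the UUID suffix if present, otherwise return the name unchanged.
function originalName(stored: string): string {
  const { base, ext } = splitExt(stored);
  const match = base.match(
    /^(.+)---[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$/i,
  );
  return match ? `${match[1]}${ext}` : stored;
}

const stored = storedName('Q3 report: "final"?.pdf', "123e4567-e89b-12d3-a456-426614174000");
// stored === "Q3_report_final---123e4567-e89b-12d3-a456-426614174000.pdf"
// originalName(stored) === "Q3_report_final.pdf"
```

Note that the round trip recovers the sanitized name, not the raw one: characters replaced during sanitization are gone for good.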


The Design Philosophy

Reading through Clawdbot's architecture, a consistent philosophy emerges:

  1. Local-first, not local-only. Data lives under ~/.clawdbot/, but the gateway can expose itself via Tailscale or mDNS when you want remote access.

  2. Protocols are plugins. WhatsApp, Matrix, Nostr—they're all just implementations of ChannelPlugin. The gateway doesn't privilege any platform.

  3. Isolation by construction. Lanes prevent work categories from interfering. Session keys prevent conversations from bleeding. Approval gating prevents agents from acting unilaterally.

  4. Human authority is preserved. The AI does the work; the human approves the consequences. This isn't a limitation—it's the point.

For developers building AI assistants, Clawdbot offers an alternative to the cloud-tenant model. Your infrastructure, your protocols, your rules—with engineering rigor that takes these constraints seriously.


Casey has a great essay on what this architecture means beyond the code. For the deeper philosophical take, see The Sovereign Agent on Texxr.

Casey, Jan 26, 2026
Building Clawdbot: Engineering Personal AI