chore: Run pnpm format:fix.

2026-06-28 13:01:42 +03:00 · 2026-01-31 21:13:13 +09:00
parent dcc2de15a6
commit 8cab78abbc
624 changed files with 10729 additions and 7514 deletions
@@ -3,15 +3,18 @@ summary: "Command queue design that serializes inbound auto-reply runs"
 read_when:
  - Changing auto-reply execution or concurrency
 ---
+
 # Command Queue (2026-01-16)

 We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.

 ## Why
+
 - Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
 - Serializing avoids contention for shared resources (session files, logs, CLI stdin) and reduces the chance of hitting upstream rate limits.

 ## How it works
+
 - A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; main defaults to 4, subagent to 8).
 - `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
 - Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
@@ -19,7 +22,9 @@ We serialize inbound auto-reply runs (all channels) through a tiny in-process qu
 - Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
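
The lane behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual queue module: `LaneQueue`, `setCap`, `enqueue`, and `runForSession` are hypothetical names, and only the lane naming (`session:<key>`, `main`) and the default cap of 1 come from the text.

```typescript
// Hypothetical sketch of a lane-aware FIFO queue built on promises only.
type Task<T> = () => Promise<T>;

interface Lane {
  cap: number; // max tasks running at once in this lane
  active: number; // tasks currently running
  pending: Array<() => void>; // FIFO of tasks waiting for a slot
}

class LaneQueue {
  private lanes = new Map<string, Lane>();
  private defaultCap: number;

  constructor(defaultCap = 1) {
    this.defaultCap = defaultCap; // unconfigured lanes run one task at a time
  }

  setCap(name: string, cap: number): void {
    this.lane(name).cap = cap;
    this.drain(name);
  }

  enqueue<T>(name: string, task: Task<T>): Promise<T> {
    const lane = this.lane(name);
    return new Promise<T>((resolve, reject) => {
      lane.pending.push(() => {
        lane.active++;
        // Wrapping in a promise also captures synchronous throws from task().
        new Promise<T>((res) => res(task()))
          .then(resolve, reject)
          .then(() => {
            lane.active--;
            this.drain(name); // free slot: start the next waiting task
          });
      });
      this.drain(name);
    });
  }

  private lane(name: string): Lane {
    let lane = this.lanes.get(name);
    if (!lane) {
      lane = { cap: this.defaultCap, active: 0, pending: [] };
      this.lanes.set(name, lane);
    }
    return lane;
  }

  private drain(name: string): void {
    const lane = this.lane(name);
    while (lane.active < lane.cap) {
      const start = lane.pending.shift();
      if (!start) break;
      start();
    }
  }
}

// One active run per session, then a shared cap across sessions.
function runForSession<T>(q: LaneQueue, sessionKey: string, run: Task<T>): Promise<T> {
  return q.enqueue(`session:${sessionKey}`, () => q.enqueue("main", run));
}
```

With `q.setCap("main", 2)`, two different sessions can run at once, a second message for the same session waits behind the first, and a third session waits for a free slot in `main`.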

 ## Queue modes (per channel)
+
 Inbound messages can steer the current run, wait for a followup turn, or do both:
+
 - `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
 - `followup`: enqueue for the next agent turn after the current run ends.
 - `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
@@ -33,6 +38,7 @@ one response per inbound message.
 Send `/queue collect` as a standalone command (per-session) or set `messages.queue.byChannel.discord: "collect"`.

 Defaults (when unset in config):
+
 - All surfaces → `collect`
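
As a rough illustration of the `collect` behaviour described above, the sketch below merges queued messages into one followup turn when they share a route and leaves them separate otherwise. The `QueuedMessage` shape, the `collectTurns` name, and joining texts with newlines are assumptions made for the example, not the real implementation.

```typescript
// Hypothetical message shape; the real queue entries likely carry more fields.
interface QueuedMessage {
  channel: string;
  thread?: string;
  text: string;
}

// Returns the followup turns to run for a batch of queued messages.
function collectTurns(queued: QueuedMessage[]): QueuedMessage[] {
  const head = queued[0];
  if (head === undefined) return [];

  const routeOf = (m: QueuedMessage): string => `${m.channel}\u0000${m.thread ?? ""}`;
  const sameRoute = queued.every((m) => routeOf(m) === routeOf(head));

  // Mixed channels/threads: drain individually so each reply keeps its routing.
  if (!sameRoute) return [...queued];

  // Same route: coalesce into a single turn (newline join is an assumption).
  return [{ ...head, text: queued.map((m) => m.text).join("\n") }];
}
```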

 Configure globally or per channel via `messages.queue`:
@@ -45,14 +51,16 @@ Configure globally or per channel via `messages.queue`:
      debounceMs: 1000,
      cap: 20,
      drop: "summarize",
-      byChannel: { discord: "collect" }
-    }
-  }
+      byChannel: { discord: "collect" },
+    },
+  },
 }
 ```

 ## Queue options
+
 Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):
+
 - `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
 - `cap`: max queued messages per session.
 - `drop`: overflow policy (`old`, `new`, `summarize`).
@@ -61,11 +69,13 @@ Summarize keeps a short bullet list of dropped messages and injects it as a synt
 Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
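
The overflow policies can be pictured with the sketch below. The text does not spell out the exact semantics, so this assumes `old` evicts the oldest queued message, `new` discards the incoming one, and `summarize` evicts the oldest while recording it as a bullet. `SessionQueue` and `pushWithCap` are hypothetical names.

```typescript
type DropPolicy = "old" | "new" | "summarize";

// Hypothetical per-session queue state.
interface SessionQueue {
  items: string[]; // queued message texts, oldest first
  droppedSummary: string[]; // bullets describing dropped messages
}

// Adds a message while keeping at most `cap` queued items (assumes cap >= 1).
function pushWithCap(q: SessionQueue, incoming: string, cap: number, drop: DropPolicy): void {
  if (q.items.length < cap) {
    q.items.push(incoming);
    return;
  }
  if (drop === "new") return; // assumed: the incoming message is discarded

  const evicted = q.items.shift(); // assumed: the oldest message makes room
  if (drop === "summarize" && evicted !== undefined) {
    q.droppedSummary.push(`- ${evicted}`);
  }
  q.items.push(incoming);
}
```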

 ## Per-session overrides
+
 - Send `/queue <mode>` as a standalone command to store the mode for the current session.
 - Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
 - `/queue default` or `/queue reset` clears the session override.
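
A parser for the command syntax above might look like the sketch below. It is illustrative only: `parseQueueCommand` and its result shape are hypothetical, the mode list covers only the modes named in this document, and treating a bare number as milliseconds in `debounce:` is an assumption.

```typescript
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog";
type DropPolicy = "old" | "new" | "summarize";

interface QueueOverride {
  mode?: QueueMode;
  debounceMs?: number;
  cap?: number;
  drop?: DropPolicy;
}

type ParsedQueueCommand =
  | { kind: "reset" }
  | { kind: "set"; override: QueueOverride }
  | { kind: "invalid"; reason: string };

const MODES: readonly string[] = ["steer", "followup", "collect", "steer-backlog"];
const DROPS: readonly string[] = ["old", "new", "summarize"];

const isMode = (s: string): s is QueueMode => MODES.indexOf(s) !== -1;
const isDrop = (s: string): s is DropPolicy => DROPS.indexOf(s) !== -1;

// Accepts "2s", "500ms", or a bare number (assumed to be milliseconds).
function parseDuration(raw: string): number | undefined {
  const m = /^(\d+)(ms|s)?$/.exec(raw);
  if (!m) return undefined;
  const n = Number(m[1]);
  return m[2] === "s" ? n * 1000 : n;
}

// Returns null when the input is not a /queue command at all.
function parseQueueCommand(input: string): ParsedQueueCommand | null {
  const tokens = input.trim().split(/\s+/);
  if (tokens[0] !== "/queue") return null;

  const override: QueueOverride = {};
  for (const token of tokens.slice(1)) {
    if (token === "default" || token === "reset") return { kind: "reset" };
    if (isMode(token)) {
      override.mode = token;
      continue;
    }
    const sep = token.indexOf(":");
    const key = sep === -1 ? token : token.slice(0, sep);
    const value = sep === -1 ? "" : token.slice(sep + 1);
    const ms = key === "debounce" ? parseDuration(value) : undefined;

    if (key === "debounce" && ms !== undefined) override.debounceMs = ms;
    else if (key === "cap" && /^\d+$/.test(value)) override.cap = Number(value);
    else if (key === "drop" && isDrop(value)) override.drop = value;
    else return { kind: "invalid", reason: `unrecognized option: ${token}` };
  }
  return { kind: "set", override };
}
```

Returning a tagged result rather than throwing lets the caller decide whether an invalid option should be reported back to the user or ignored.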

 ## Scope and guarantees
+
 - Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
 - Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
 - Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
@@ -73,5 +83,6 @@ Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
 - No external dependencies or background worker threads; pure TypeScript + promises.

 ## Troubleshooting
+
 - If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
 - If you need queue depth, enable verbose logs and watch for queue timing lines.