chore: Run pnpm format:fix.

2026-06-29 09:02:02 +03:00 · 2026-01-31 21:13:13 +09:00
parent dcc2de15a6
commit 8cab78abbc
624 changed files with 10729 additions and 7514 deletions
@@ -3,28 +3,31 @@ summary: "How inbound audio/voice notes are downloaded, transcribed, and injecte
 read_when:
  - Changing audio transcription or media handling
 ---
+
 # Audio / Voice Notes — 2026-01-17

 ## What works
+
 - **Media understanding (audio)**: If audio understanding is enabled (or auto‑detected), OpenClaw:
-  1) Locates the first audio attachment (local path or URL) and downloads it if needed.
-  2) Enforces `maxBytes` before sending to each model entry.
-  3) Runs the first eligible model entry in order (provider or CLI).
-  4) If it fails or skips (size/timeout), it tries the next entry.
-  5) On success, it replaces `Body` with an `[Audio]` block and sets `{{Transcript}}`.
+  1. Locates the first audio attachment (local path or URL) and downloads it if needed.
+  2. Enforces `maxBytes` before sending to each model entry.
+  3. Runs the first eligible model entry in order (provider or CLI).
+  4. If it fails or skips (size/timeout), it tries the next entry.
+  5. On success, it replaces `Body` with an `[Audio]` block and sets `{{Transcript}}`.
 - **Command parsing**: When transcription succeeds, `CommandBody`/`RawBody` are set to the transcript so slash commands still work.
 - **Verbose logging**: In `--verbose`, we log when transcription runs and when it replaces the body.

 ## Auto-detection (default)
+
 If you **don’t configure models** and `tools.media.audio.enabled` is **not** set to `false`,
 OpenClaw auto-detects in this order and stops at the first working option:

-1) **Local CLIs** (if installed)
+1. **Local CLIs** (if installed)
   - `sherpa-onnx-offline` (requires `SHERPA_ONNX_MODEL_DIR` with encoder/decoder/joiner/tokens)
   - `whisper-cli` (from `whisper-cpp`; uses `WHISPER_CPP_MODEL` or the bundled tiny model)
   - `whisper` (Python CLI; downloads models automatically)
-2) **Gemini CLI** (`gemini`) using `read_many_files`
-3) **Provider keys** (OpenAI → Groq → Deepgram → Google)
+2. **Gemini CLI** (`gemini`) using `read_many_files`
+3. **Provider keys** (OpenAI → Groq → Deepgram → Google)

 To disable auto-detection, set `tools.media.audio.enabled: false`.
 To customize, set `tools.media.audio.models`.
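
The ordered fallback described under "What works" (steps 2 to 4) can be sketched roughly as follows. This is a minimal illustration, not OpenClaw's actual implementation: `ModelEntry`, `Outcome`, and `transcribeWithFallback` are hypothetical names, and the sketch is synchronous for brevity where a real pipeline would presumably be asynchronous and enforce timeouts.

```typescript
// Hypothetical types: these names are illustrative, not OpenClaw's real API.
type ModelEntry = {
  name: string;
  maxBytes: number;
  // Assumed contract: returns the transcript, throws on failure or timeout.
  transcribe: (audio: Uint8Array) => string;
};

type Outcome =
  | { ok: true; usedEntry: string; transcript: string }
  | { ok: false; skipped: string[] };

function transcribeWithFallback(audio: Uint8Array, entries: ModelEntry[]): Outcome {
  const skipped: string[] = [];
  for (const entry of entries) {
    // Step 2: enforce the size limit before sending to this entry.
    if (audio.byteLength > entry.maxBytes) {
      skipped.push(`${entry.name}: too large`);
      continue;
    }
    try {
      // Step 3: run the first eligible entry in order.
      const transcript = entry.transcribe(audio);
      return { ok: true, usedEntry: entry.name, transcript };
    } catch (err) {
      // Step 4: on failure, record the reason and try the next entry.
      const reason = err instanceof Error ? err.message : String(err);
      skipped.push(`${entry.name}: ${reason}`);
    }
  }
  // No entry succeeded; the caller would leave the message body unchanged.
  return { ok: false, skipped };
}
```

Collecting the skip reasons in one list is a design choice for this sketch; it would make it easy to surface them under `--verbose`.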
@@ -33,6 +36,7 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
 ## Config examples

 ### Provider + CLI fallback (OpenAI + Whisper CLI)
+
 ```json5
 {
  tools: {
@@ -46,16 +50,17 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
            type: "cli",
            command: "whisper",
            args: ["--model", "base", "{{MediaPath}}"],
-            timeoutSeconds: 45
-          }
-        ]
-      }
-    }
-  }
+            timeoutSeconds: 45,
+          },
+        ],
+      },
+    },
+  },
 }
 ```

 ### Provider-only with scope gating
+
 ```json5
 {
  tools: {
@@ -64,34 +69,32 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
        enabled: true,
        scope: {
          default: "allow",
-          rules: [
-            { action: "deny", match: { chatType: "group" } }
-          ]
+          rules: [{ action: "deny", match: { chatType: "group" } }],
        },
-        models: [
-          { provider: "openai", model: "gpt-4o-mini-transcribe" }
-        ]
-      }
-    }
-  }
+        models: [{ provider: "openai", model: "gpt-4o-mini-transcribe" }],
+      },
+    },
+  },
 }
 ```

 ### Provider-only (Deepgram)
+
 ```json5
 {
  tools: {
    media: {
      audio: {
        enabled: true,
-        models: [{ provider: "deepgram", model: "nova-3" }]
-      }
-    }
-  }
+        models: [{ provider: "deepgram", model: "nova-3" }],
+      },
+    },
+  },
 }
 ```

 ## Notes & limits
+
 - Provider auth follows the standard model auth order (auth profiles, env vars, `models.providers.*.apiKey`).
 - Deepgram picks up `DEEPGRAM_API_KEY` when `provider: "deepgram"` is used.
 - Deepgram setup details: [Deepgram (audio transcription)](/providers/deepgram).
@@ -104,6 +107,7 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
 - CLI stdout is capped (5MB); keep CLI output concise.

 ## Gotchas
+
 - Scope rules use first-match wins. `chatType` is normalized to `direct`, `group`, or `room`.
 - Ensure your CLI exits 0 and prints plain text; JSON needs to be massaged via `jq -r .text`.
 - Keep timeouts reasonable (`timeoutSeconds`, default 60s) to avoid blocking the reply queue.
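
To make the first-match-wins behaviour of scope rules concrete, here is a minimal sketch. The type names and `resolveScope` are hypothetical and only mirror the shape of the `scope` block in the config examples; treating a rule with no `chatType` as matching everything is an assumption of this sketch, not documented behaviour.

```typescript
// Hypothetical shapes mirroring the `scope` block in the config examples.
type ChatType = "direct" | "group" | "room";
type Action = "allow" | "deny";
type ScopeRule = { action: Action; match: { chatType?: ChatType } };
type Scope = { default: Action; rules: ScopeRule[] };

function resolveScope(scope: Scope, chatType: ChatType): Action {
  for (const rule of scope.rules) {
    // Assumption: a rule without a chatType constraint matches any chat.
    const matches =
      rule.match.chatType === undefined || rule.match.chatType === chatType;
    if (matches) {
      return rule.action; // first match wins; later rules are ignored
    }
  }
  // No rule matched, so fall back to the configured default.
  return scope.default;
}

// Same scope as the "Provider-only with scope gating" example.
const gatedScope: Scope = {
  default: "allow",
  rules: [{ action: "deny", match: { chatType: "group" } }],
};
```

With `gatedScope`, a `group` chat hits the deny rule, while `direct` and `room` chats fall through to the default. Because evaluation stops at the first match, more specific rules belong before broader ones.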