From: Jérôme Benoit
Date: Wed, 10 Jun 2026 22:24:31 +0000 (+0200)
Subject: chore(skills): sync qmd skill to upstream tobi/qmd@main
X-Git-Url: https://git.piment-noir.org/?a=commitdiff_plain;h=aa4085b0d5ce06881f46773cb36417ec86d3537a;p=e-mobility-charging-stations-simulator.git

chore(skills): sync qmd skill to upstream tobi/qmd@main

Update .agents/skills/qmd/SKILL.md to upstream version 2.2.0 (blob
0d4b04882506d86d36eca291f1d91627d195dadb). Replaces the Nix OpenClaw
downstream variant (v2.0.0) bundled by nix-openclaw-tools with the
canonical upstream skill content.

Refs: https://github.com/tobi/qmd/issues/722
---

diff --git a/.agents/skills/qmd/SKILL.md b/.agents/skills/qmd/SKILL.md
index 1f5e3e68..6a889dfb 100644
--- a/.agents/skills/qmd/SKILL.md
+++ b/.agents/skills/qmd/SKILL.md
@@ -1,146 +1,298 @@
 ---
 name: qmd
-description: Search markdown knowledge bases, notes, and documentation using QMD. Use when users ask to search notes, find documents, or look up information.
+description: Search local markdown knowledge bases, notes, docs, and wikis with QMD. Use when users ask to find notes, retrieve documents, inspect a wiki, answer from indexed markdown, or set up QMD access.
 license: MIT
 compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
 metadata:
   author: tobi
-  version: '2.0.0'
+  version: '2.2.0'
 allowed-tools: Bash(qmd:*), mcp__qmd__*
 ---
 
-# QMD - Quick Markdown Search
+# QMD - Query Markdown Documents
 
-Local search engine for markdown content.
+## How search works
 
-## Status
+QMD searches local markdown collections: notes, docs, wikis, transcripts, and
+project knowledge bases. Use it before web search when the answer may already be
+in indexed local files.
 
-!`qmd status 2>/dev/null || echo "Not installed: npm install -g @tobilu/qmd"`
+The workflow is always:
 
-## MCP: `query`
+1. Search for candidate documents.
+2. Retrieve the full source with `qmd get` or `qmd multi-get`.
+3. Answer from retrieved text, citing paths or docids.
 
-```json
-{
-  "searches": [
-    { "type": "lex", "query": "CAP theorem consistency" },
-    { "type": "vec", "query": "tradeoff between consistency and availability" }
-  ],
-  "collections": ["docs"],
-  "limit": 10
-}
+Do not answer from snippets alone when the user needs facts, decisions, quotes,
+or nuance. Snippets are only leads.
+
+Typical loop:
+
+```bash
+qmd search "merchant reality support interviews" -n 5
+# leads: #abc123 concepts/customer-proximity.md; #def432 sources/merchant-call.md
+qmd multi-get "#abc123,#def432" --format md
 ```
 
-### Query Types
+**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
+fields that you write yourself.** You are a better query expander than the
+built-in model: you know the user's actual goal, the domain vocabulary, and the
+nearby-but-wrong concepts to avoid. Do not just paste the user's words into
+`qmd query "..."` and hope the expansion model guesses right — supply the
+`intent:` and craft the lexical and semantic terms deliberately (see
+[Pick the right search mode](#pick-the-right-search-mode)).
+
+When reporting what you retrieved, a compact note is enough; do not paste whole
+files unless needed:
+
+```text
+Retrieved:
+- #abc123 concepts/customer-proximity.md
+- #def432 sources/merchant-call.md
+```
 
-| Type   | Method | Input                                       |
-| ------ | ------ | ------------------------------------------- |
-| `lex`  | BM25   | Keywords — exact terms, names, code         |
-| `vec`  | Vector | Question — natural language                 |
-| `hyde` | Vector | Answer — hypothetical result (50-100 words) |
+## Pick the right search mode
 
-### Writing Good Queries
+Use **BM25 lexical search** when you know exact words, titles, names, code
+symbols, or rare phrases:
 
-**lex (keyword)**
+```bash
+qmd search "cockpit OKR Goodhart" -n 10
+qmd search '"AI Before Headcount"' -c concepts -n 5
+```
 
-- 2-5 terms, no filler words
-- Exact phrase: `"connection pool"` (quoted)
-- Exclude terms: `performance -sports` (minus prefix)
-- Code identifiers work: `handleError async`
+Use **`qmd query` with structured fields** when the user describes an idea
+indirectly, uses different wording than the source, or needs conceptual recall.
+**This is the default mode — write the fields yourself rather than leaning on
+query expansion.** Combine exact anchors with semantic recall:
 
-**vec (semantic)**
+```bash
+qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
+```
 
-- Full natural language question
-- Be specific: `"how does the rate limiter handle burst traffic"`
-- Include context: `"in the payment service, how are refunds processed"`
+Structured query fields (you author each one — do not delegate this to the
+expansion model):
 
-**hyde (hypothetical document)**
+- `intent:` states what you are trying to find **and what to avoid**. Always
+  supply this. It steers ranking away from nearby-but-wrong concepts.
+- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
+  in the source. This is your own keyword expansion.
+- `vec:` paraphrases the idea in natural language, in source-like wording.
+- `hyde:` describes the document or answer that would satisfy the request.
 
-- Write 50-100 words of what the _answer_ looks like
-- Use the vocabulary you expect in the result
+You do not need all four every time, but you should almost always write at least
+`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
+throws away the context only you have and relies on the built-in expander to
+reconstruct it — prefer the structured form.
 
-**expand (auto-expand)**
+If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
+that is a job for `qmd search`, not bare `qmd query`:
 
-- Use a single-line query (implicit) or `expand: question` on its own line
-- Lets the local LLM generate lex/vec/hyde variations
-- Do not mix `expand:` with other typed lines — it's either a standalone expand query or a full query document
+```bash
+qmd query --format json --explain $'intent: ...\nlex: ...\nvec: ...' # inspect ranking
+```
 
-### Intent (Disambiguation)
+If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
+better lexical terms.
 
-When a query term is ambiguous, add `intent` to steer results:
+## Retrieve sources
 
-```json
-{
-  "searches": [{ "type": "lex", "query": "performance" }],
-  "intent": "web page load times and Core Web Vitals"
-}
+Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:
+
+```bash
+qmd get "#abc123"
+qmd get qmd://concepts/ai-before-headcount.md
+qmd multi-get "#abc123,#def432" --format md
+qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --format md
+qmd multi-get 'sources/podcast-2025-*.md' -l 80
 ```
 
-Intent affects expansion, reranking, chunk selection, and snippet extraction. It does not search on its own — it's a steering signal that disambiguates queries like "performance" (web-perf vs team health vs fitness).
+Use `multi-get` when comparing several hits or gathering context across pages.
 
-### Combining Types
+### Output is line-numbered and carries the docid — cite both
 
-| Goal                  | Approach                                              |
-| --------------------- | ----------------------------------------------------- |
-| Know exact terms      | `lex` only                                            |
-| Don't know vocabulary | Use a single-line query (implicit `expand:`) or `vec` |
-| Best recall           | `lex` + `vec`                                         |
-| Complex topic         | `lex` + `vec` + `hyde`                                |
-| Ambiguous query       | Add `intent` to any combination above                 |
+`get` and `multi-get` are **line-numbered by default** and always print the
+document's `#docid` and `qmd://` path. So `get` output looks like:
 
-First query gets 2x weight in fusion — put your best guess first.
+```text
+qmd://concepts/note.md #abc123
+---
 
-### Lex Query Syntax
+1: # Metrics as instruments
+2:
+3: Treat dashboards like cockpit instruments...
+```
 
-| Syntax     | Meaning      | Example                      |
-| ---------- | ------------ | ---------------------------- |
-| `term`     | Prefix match | `perf` matches "performance" |
-| `"phrase"` | Exact phrase | `"rate limiter"`             |
-| `-term`    | Exclude      | `performance -sports`        |
+Cite the docid and exact line numbers in your answer, and use the numbers to ask
+for the next slice. Pass `--no-line-numbers` only when you need raw content to
+copy verbatim (e.g. reproducing a code block).
 
-Note: `-term` only works in lex queries, not vec/hyde.
+When you need to open or edit the underlying file (e.g. hand a path to `Read`,
+`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
+header with the document's on-disk path, falling back to the canonical header if
+the file no longer exists on disk:
 
-### Collection Filtering
+```text
+$ qmd get "#abc123" --full-path
+/Users/you/notes/concepts/note.md
+---
 
-```json
-{ "collections": ["docs"] } // Single
-{ "collections": ["docs", "notes"] } // Multiple (OR)
+1: # Metrics as instruments
 ```
 
-Omit to search all collections.
+`--full-path` works the same way on `qmd search` and `qmd query`: result paths
+become the file's on-disk path — `./`-prefixed relative path when the file is
+inside `$PWD`, absolute realpath otherwise — and the per-result `#docid` is
+dropped because the path is the identifier. The leading `./` is intentional so
+the output is unambiguously a filesystem path and cannot be mistaken for a bare
+collection-relative string. Default search/query output still uses `qmd://`
+URIs; only opt into `--full-path` when you specifically need a path you can hand
+to a non-QMD tool.
 
-## Other MCP Tools
+### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
 
-| Tool        | Use                              |
-| ----------- | -------------------------------- |
-| `get`       | Retrieve doc by path or `#docid` |
-| `multi_get` | Retrieve multiple by glob/list   |
-| `status`    | Collections and health           |
+`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
+`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
+resolution, virtual-path lookups, line numbering, and the header, and it is
+slower and more error-prone.
 
-## CLI
+The most compact form is a `:from:count` suffix right on the path or docid —
+prefer it:
 
 ```bash
-qmd query "question" # Auto-expand + rerank
-qmd query $'lex: X\nvec: Y' # Structured
-qmd query $'expand: question' # Explicit expand
-qmd query --json --explain "q" # Show score traces (RRF + rerank blend)
-qmd search "keywords" # BM25 only (no LLM)
-qmd get "#abc123" # By docid
-qmd multi-get "journals/2026-*.md" -l 40 # Batch pull snippets by glob
-qmd multi-get notes/foo.md,notes/bar.md # Comma-separated list, preserves order
+qmd get "#abc123:120:40" # 40 lines starting at line 120
+qmd get qmd://concepts/note.md:200:60 # lines 200–259
+qmd get "#abc123:120" # from line 120 to end of file
+qmd get "#abc123" --from 120 -l 40 # equivalent, using flags
 ```
 
-## HTTP API
+Suffix and flags:
+
+- `:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
+  for reading around a search hit.**
+- `:<from>` — start at `<from>`, read to end of file.
+- `--from <from>` / `-l <count>` — flag equivalents. Explicit flags override the
+  suffix, so `... :5:2 -l 1` reads 1 line.
+- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
+
+Wrong: `qmd get "#abc123" | sed -n '120,160p'`
+Right: `qmd get "#abc123:120:40"`
+
+Search results include a `:line` anchor on each hit — feed it straight into
+`qmd get path:line:<count>` to read a window around the match (line numbers in the
+output will start at `line`).
+
+## Discover what is indexed
 
 ```bash
-curl -X POST http://localhost:8181/query \
-  -H "Content-Type: application/json" \
-  -d '{"searches": [{"type": "lex", "query": "test"}]}'
+qmd collection list
+qmd ls
+qmd status
 ```
 
-## Setup
+Add collection filters when broad searches drift into the wrong corpus:
+
+```bash
+qmd search "headcount autonomous agents" -c concepts -n 10
+qmd query "merchant support product reality" -c concepts -c sources -n 10
+```
+
+Omit `-c` to search everything.
+
+## MCP Tool: `query`
+
+When using the MCP server, prefer structured searches:
+
+```json
+{
+  "searches": [
+    { "type": "lex", "query": "cockpit OKR Goodhart" },
+    { "type": "vec", "query": "data informed not metric driven product judgment" },
+    {
+      "type": "hyde",
+      "query": "A concept note explains that metrics are useful as instruments, but leaders should not let OKRs or dashboards replace judgment."
+    }
+  ],
+  "intent": "Find the concept note about using metrics as instruments without becoming metric-driven.",
+  "collections": ["concepts"],
+  "limit": 10
+}
+```
+
+Query types:
+
+- `lex` — BM25 keyword search. Best for exact terms, names, titles, and code.
+- `vec` — vector semantic search. Best for natural-language concepts.
+- `hyde` — vector search using a hypothetical answer/document passage.
+
+## Query craft
+
+Good QMD searches mix three things:
+
+1. **Title/alias anchors:** exact page titles, named entities, phrases.
+2. **Semantic paraphrase:** how a human would describe the idea.
+3. **Negative space:** enough intent to avoid nearby-but-wrong concepts.
+
+Examples:
+
+```bash
+# Exact-ish title lookup
+qmd search '"arm the rebels" merchants tools big companies' -c concepts
+
+# Semantic concept lookup
+qmd query $'intent: Find the customer proximity concept, not generic customer delight.\nlex: support pseudonymous merchant customer interviews\nvec: founder stays close to merchant reality through support and product use'
+
+# Source lookup
+qmd search "six-week cadence WhatsApp merchant relationships Shawn Ryan" -c sources -n 10
+```
+
+## Setup and maintenance
+
+Only mutate indexes when the user asked for setup or maintenance. Searching and
+retrieving are safe; collection/index mutation is not a casual first step.
 
 ```bash
 npm install -g @tobilu/qmd
 qmd collection add ~/notes --name notes
+qmd update
 qmd embed
 ```
+
+Health and diagnostics:
+
+```bash
+qmd doctor
+qmd status
+qmd pull
+```
+
+`qmd doctor` checks config, model cache, device/GPU setup, vector fingerprints,
+and common environment overrides. If a model-backed command fails, run it before
+changing configuration.
+
+## MCP setup
+
+See `references/mcp-setup.md` for Claude Code, Claude Desktop, OpenClaw, and HTTP
+server configuration.
+
+## Pitfalls
+
+- **Do not stop at snippets.** Fetch documents before making claims.
+- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
+  suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
+  line-numbered; piping breaks docid resolution, the header, and virtual paths.
+- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
+  yourself. A bare `qmd query "user sentence"` discards the context only you
+  have. You expand the query; the model just ranks.
+- **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
+  faster and often better.
+- **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and
+  `qmd embed` change local state and can be expensive.
+- **Model-backed commands can be environment-sensitive.** If `qmd query`,
+  `qmd vsearch`, or reranking fails because local models/GPU are unavailable,
+  use `qmd search` and stronger lexical/structured terms.
+- **Ambiguous user wording needs intent.** Add `intent:` rather than hoping query
+  expansion guesses the right domain.
+- **Collection names matter.** Search `concepts` for synthesized wiki pages,
+  `sources` for transcripts/raw source pages, and docs collections for code or
+  project documentation.