---
name: qmd
-description: Search markdown knowledge bases, notes, and documentation using QMD. Use when users ask to search notes, find documents, or look up information.
+description: Search local markdown knowledge bases, notes, docs, and wikis with QMD. Use when users ask to find notes, retrieve documents, inspect a wiki, answer from indexed markdown, or set up QMD access.
license: MIT
compatibility: Requires qmd CLI or MCP server. Install via `npm install -g @tobilu/qmd`.
metadata:
author: tobi
- version: '2.0.0'
+ version: '2.2.0'
allowed-tools: Bash(qmd:*), mcp__qmd__*
---
-# QMD - Quick Markdown Search
+# QMD - Query Markdown Documents
-Local search engine for markdown content.
+## How search works
-## Status
+QMD searches local markdown collections: notes, docs, wikis, transcripts, and
+project knowledge bases. Use it before web search when the answer may already be
+in indexed local files.
-!`qmd status 2>/dev/null || echo "Not installed: npm install -g @tobilu/qmd"`
+The workflow is always:
-## MCP: `query`
+1. Search for candidate documents.
+2. Retrieve the full source with `qmd get` or `qmd multi-get`.
+3. Answer from retrieved text, citing paths or docids.
-```json
-{
- "searches": [
- { "type": "lex", "query": "CAP theorem consistency" },
- { "type": "vec", "query": "tradeoff between consistency and availability" }
- ],
- "collections": ["docs"],
- "limit": 10
-}
+Do not answer from snippets alone when the user needs facts, decisions, quotes,
+or nuance. Snippets are only leads.
+
+Typical loop:
+
+```bash
+qmd search "merchant reality support interviews" -n 5
+# leads: #abc123 concepts/customer-proximity.md; #def432 sources/merchant-call.md
+qmd multi-get "#abc123,#def432" --format md
```
-### Query Types
+**Default to structured `qmd query` with `intent:`, `lex:`, `vec:`, and `hyde:`
+fields that you write yourself.** You are a better query expander than the
+built-in model: you know the user's actual goal, the domain vocabulary, and the
+nearby-but-wrong concepts to avoid. Do not just paste the user's words into
+`qmd query "..."` and hope the expansion model guesses right — supply the
+`intent:` and craft the lexical and semantic terms deliberately (see
+[Pick the right search mode](#pick-the-right-search-mode)).
+
+When reporting what you retrieved, a compact note is enough; do not paste whole
+files unless needed:
+
+```text
+Retrieved:
+- #abc123 concepts/customer-proximity.md
+- #def432 sources/merchant-call.md
+```
-| Type | Method | Input |
-| ------ | ------ | ------------------------------------------- |
-| `lex` | BM25 | Keywords — exact terms, names, code |
-| `vec` | Vector | Question — natural language |
-| `hyde` | Vector | Answer — hypothetical result (50-100 words) |
+## Pick the right search mode
-### Writing Good Queries
+Use **BM25 lexical search** when you know exact words, titles, names, code
+symbols, or rare phrases:
-**lex (keyword)**
+```bash
+qmd search "cockpit OKR Goodhart" -n 10
+qmd search '"AI Before Headcount"' -c concepts -n 5
+```
-- 2-5 terms, no filler words
-- Exact phrase: `"connection pool"` (quoted)
-- Exclude terms: `performance -sports` (minus prefix)
-- Code identifiers work: `handleError async`
+Use **`qmd query` with structured fields** when the user describes an idea
+indirectly, uses different wording than the source, or needs conceptual recall.
+**This is the default mode — write the fields yourself rather than leaning on
+query expansion.** Combine exact anchors with semantic recall:
-**vec (semantic)**
+```bash
+qmd query $'intent: Find the concept note about metrics as instruments without letting OKRs replace judgment.\nlex: cockpit instruments OKR Goodhart metrics judgment\nvec: data informed not metric driven product judgment\nhyde: A concept note says metrics are useful like cockpit instruments, but leaders should remain data-informed rather than metric-driven because OKRs and dashboards can Goodhart product judgment.'
+```
-- Full natural language question
-- Be specific: `"how does the rate limiter handle burst traffic"`
-- Include context: `"in the payment service, how are refunds processed"`
+Structured query fields (you author each one — do not delegate this to the
+expansion model):
-**hyde (hypothetical document)**
+- `intent:` states what you are trying to find **and what to avoid**. Always
+ supply this. It steers ranking away from nearby-but-wrong concepts.
+- `lex:` exact terms, aliases, titles, code symbols, and rare words you expect
+ in the source. This is your own keyword expansion.
+- `vec:` paraphrases the idea in natural language, in source-like wording.
+- `hyde:` describes the document or answer that would satisfy the request.
-- Write 50-100 words of what the _answer_ looks like
-- Use the vocabulary you expect in the result
+You do not need all four every time, but you should almost always write at least
+`intent:` plus one of `lex:`/`vec:`. A bare `qmd query "the user's sentence"`
+throws away the context only you have and relies on the built-in expander to
+reconstruct it — prefer the structured form.
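
The structured form above is just a small multi-line document, so it can be assembled field by field. A minimal sketch, assuming a hypothetical helper named `build_query` (it is not part of QMD) that skips empty fields:

```bash
# Hypothetical helper (not part of QMD): assemble a structured query document,
# one field per line. Empty lex/vec fields are skipped.
#   $1 = intent (what to find and what to avoid)
#   $2 = lex terms
#   $3 = vec paraphrase
build_query() {
  printf 'intent: %s\n' "$1"
  if [ -n "$2" ]; then printf 'lex: %s\n' "$2"; fi
  if [ -n "$3" ]; then printf 'vec: %s\n' "$3"; fi
}

# Print the document that would be handed to `qmd query`.
build_query \
  "Find the customer proximity concept, not generic customer delight." \
  "support merchant customer interviews" \
  "founder stays close to merchant reality through support and product use"
```

The result can then be passed as a single argument, for example `qmd query "$(build_query "$intent" "$lex" "$vec")"`.
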
-**expand (auto-expand)**
+If you genuinely have nothing to expand (a single rare token, a verbatim phrase),
+that is a job for `qmd search`, not bare `qmd query`.
+
+To inspect how a structured query was ranked, add `--format json --explain`:
-- Use a single-line query (implicit) or `expand: question` on its own line
-- Lets the local LLM generate lex/vec/hyde variations
-- Do not mix `expand:` with other typed lines — it's either a standalone expand query or a full query document
+```bash
+qmd query --format json --explain $'intent: ...\nlex: ...\nvec: ...' # inspect ranking
+```
-### Intent (Disambiguation)
+If `qmd query` is slow or model/GPU setup fails, fall back to `qmd search` with
+better lexical terms.
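
The fallback can be wrapped in a small function. This is a sketch only: `qmd_find` is a hypothetical name, and it assumes `qmd query` exits non-zero when the model-backed path fails, which should be checked against your installed version.

```bash
# Hypothetical wrapper (not part of QMD): try a structured query first and fall
# back to BM25 search if the model-backed command fails.
# Assumption: `qmd query` returns a non-zero exit status on failure.
#   $1 = structured query document (intent:/lex:/vec: lines)
#   $2 = lexical terms to use for the fallback
qmd_find() {
  local structured="$1" lexical="$2"
  if ! qmd query "$structured" -n 10; then
    echo "qmd query failed; falling back to qmd search" >&2
    qmd search "$lexical" -n 10
  fi
}
```

Call it as `qmd_find "$structured_query" "cockpit OKR Goodhart"`. It does not cover the slow-query case, which would need a timeout of your choosing.
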
-When a query term is ambiguous, add `intent` to steer results:
+## Retrieve sources
-```json
-{
- "searches": [{ "type": "lex", "query": "performance" }],
- "intent": "web page load times and Core Web Vitals"
-}
+Search results include docids like `#abc123` and `qmd://...` paths. Fetch them:
+
+```bash
+qmd get "#abc123"
+qmd get qmd://concepts/ai-before-headcount.md
+qmd multi-get "#abc123,#def432" --format md
+qmd multi-get 'concepts/{ai-before-headcount.md,data-informed-not-metric-driven.md}' --format md
+qmd multi-get 'sources/podcast-2025-*.md' -l 80
```
-Intent affects expansion, reranking, chunk selection, and snippet extraction. It does not search on its own — it's a steering signal that disambiguates queries like "performance" (web-perf vs team health vs fitness).
+Use `multi-get` when comparing several hits or gathering context across pages.
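
When the docids come from several searches, it can help to build the comma-separated list in one place. A minimal sketch, assuming a hypothetical helper named `join_docids` (not part of QMD):

```bash
# Hypothetical helper (not part of QMD): join docids or paths into the
# comma-separated list that `qmd multi-get` accepts, preserving order.
join_docids() {
  local joined="" id
  for id in "$@"; do
    joined="${joined:+$joined,}$id"
  done
  printf '%s\n' "$joined"
}

join_docids "#abc123" "#def432"
```

The output can be passed straight through, for example `qmd multi-get "$(join_docids "#abc123" "#def432")" --format md`.
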
-### Combining Types
+### Output is line-numbered and carries the docid — cite both
-| Goal | Approach |
-| --------------------- | ----------------------------------------------------- |
-| Know exact terms | `lex` only |
-| Don't know vocabulary | Use a single-line query (implicit `expand:`) or `vec` |
-| Best recall | `lex` + `vec` |
-| Complex topic | `lex` + `vec` + `hyde` |
-| Ambiguous query | Add `intent` to any combination above |
+`get` and `multi-get` are **line-numbered by default** and always print the
+document's `#docid` and `qmd://` path. So `get` output looks like:
-First query gets 2x weight in fusion — put your best guess first.
+```text
+qmd://concepts/note.md #abc123
+---
-### Lex Query Syntax
+1: # Metrics as instruments
+2:
+3: Treat dashboards like cockpit instruments...
+```
-| Syntax | Meaning | Example |
-| ---------- | ------------ | ---------------------------- |
-| `term` | Prefix match | `perf` matches "performance" |
-| `"phrase"` | Exact phrase | `"rate limiter"` |
-| `-term` | Exclude | `performance -sports` |
+Cite the docid and exact line numbers in your answer, and use the numbers to ask
+for the next slice. Pass `--no-line-numbers` only when you need raw content to
+copy verbatim (e.g. reproducing a code block).
-Note: `-term` only works in lex queries, not vec/hyde.
+When you need to open or edit the underlying file (e.g. hand a path to `Read`,
+`Edit`, or an editor), add `--full-path`. It replaces the `qmd://` URL + docid
+header with the document's on-disk path, falling back to the canonical header if
+the file no longer exists on disk:
-### Collection Filtering
+```text
+$ qmd get "#abc123" --full-path
+/Users/you/notes/concepts/note.md
+---
-```json
-{ "collections": ["docs"] } // Single
-{ "collections": ["docs", "notes"] } // Multiple (OR)
+1: # Metrics as instruments
```
-Omit to search all collections.
+`--full-path` works the same way on `qmd search` and `qmd query`: result paths
+become the file's on-disk path — `./`-prefixed relative path when the file is
+inside `$PWD`, absolute realpath otherwise — and the per-result `#docid` is
+dropped because the path is the identifier. The leading `./` is intentional so
+the output is unambiguously a filesystem path and cannot be mistaken for a bare
+collection-relative string. Default search/query output still uses `qmd://`
+URIs; only opt into `--full-path` when you specifically need a path you can hand
+to a non-QMD tool.
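
The path rule described above can be illustrated in plain shell. This is a sketch of the rule only, not QMD's implementation: `display_path` is a hypothetical name, and it compares strings without resolving symlinks the way a realpath lookup would.

```bash
# Illustration of the `--full-path` display rule (not QMD's actual code):
# paths inside $PWD become ./-prefixed relative paths, others stay absolute.
display_path() {
  case "$1" in
    "$PWD"/*) printf './%s\n' "${1#"$PWD"/}" ;;
    *)        printf '%s\n' "$1" ;;
  esac
}

display_path "$PWD/concepts/note.md"
```
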
-## Other MCP Tools
+### Read line ranges with the `:from:count` suffix — never pipe through `sed`/`head`/`tail`
-| Tool | Use |
-| ----------- | -------------------------------- |
-| `get` | Retrieve doc by path or `#docid` |
-| `multi_get` | Retrieve multiple by glob/list |
-| `status` | Collections and health |
+`qmd get` slices files itself. Use the suffix or flags; do **not** shell out to
+`sed -n`, `head`, `tail`, or `awk` to pull a line range. Piping defeats docid
+resolution, virtual-path lookups, line numbering, and the header, and it is
+slower and more error-prone.
-## CLI
+The most compact form is a `:from:count` suffix right on the path or docid —
+prefer it:
```bash
-qmd query "question" # Auto-expand + rerank
-qmd query $'lex: X\nvec: Y' # Structured
-qmd query $'expand: question' # Explicit expand
-qmd query --json --explain "q" # Show score traces (RRF + rerank blend)
-qmd search "keywords" # BM25 only (no LLM)
-qmd get "#abc123" # By docid
-qmd multi-get "journals/2026-*.md" -l 40 # Batch pull snippets by glob
-qmd multi-get notes/foo.md,notes/bar.md # Comma-separated list, preserves order
+qmd get "#abc123:120:40" # 40 lines starting at line 120
+qmd get qmd://concepts/note.md:200:60 # lines 200–259
+qmd get "#abc123:120" # from line 120 to end of file
+qmd get "#abc123" --from 120 -l 40 # equivalent, using flags
```
-## HTTP API
+Suffix and flags:
+
+- `<path>:<from>:<count>` — start at line `<from>`, read `<count>` lines. **Best
+ for reading around a search hit.**
+- `<path>:<from>` — start at `<from>`, read to end of file.
+- `--from <line>` / `-l <lines>` — flag equivalents. Explicit flags override the
+ suffix, so `... :5:2 -l 1` reads 1 line.
+- `--no-line-numbers` — drop the `N:` prefixes (line numbers are on by default).
+
+Wrong: `qmd get "#abc123" | sed -n '120,160p'`
+Right: `qmd get "#abc123:120:40"`
+
+Search results include a `:line` anchor on each hit. Feed it straight into
+`qmd get path:line:<n>` to read `<n>` lines starting at the match (line numbers
+in the output will start at `line`).
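
Because the suffix starts reading at the given line, a window that also includes lines before a hit needs a little arithmetic. A minimal sketch, assuming a hypothetical helper named `qmd_window` (not part of QMD) that only builds the `:from:count` spec:

```bash
# Hypothetical helper (not part of QMD): build a `target:from:count` spec that
# covers `before` lines ahead of a hit and `after` lines following it.
#   $1 = docid or path, $2 = hit line, $3 = lines before, $4 = lines after
qmd_window() {
  local target="$1" hit="$2" before="$3" after="$4" from count
  from=$((hit - before))
  if [ "$from" -lt 1 ]; then from=1; fi   # clamp at the first line
  count=$((hit + after - from + 1))
  printf '%s:%s:%s\n' "$target" "$from" "$count"
}

qmd_window "#abc123" 130 10 30   # prints #abc123:120:41
```

The spec can then be used directly, for example `qmd get "$(qmd_window "#abc123" 130 10 30)"`.
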
+
+## Discover what is indexed
```bash
-curl -X POST http://localhost:8181/query \
- -H "Content-Type: application/json" \
- -d '{"searches": [{"type": "lex", "query": "test"}]}'
+qmd collection list
+qmd ls
+qmd status
```
-## Setup
+Add collection filters when broad searches drift into the wrong corpus:
+
+```bash
+qmd search "headcount autonomous agents" -c concepts -n 10
+qmd query "merchant support product reality" -c concepts -c sources -n 10
+```
+
+Omit `-c` to search everything.
+
+## MCP Tool: `query`
+
+When using the MCP server, prefer structured searches:
+
+```json
+{
+ "searches": [
+ { "type": "lex", "query": "cockpit OKR Goodhart" },
+ { "type": "vec", "query": "data informed not metric driven product judgment" },
+ {
+ "type": "hyde",
+ "query": "A concept note explains that metrics are useful as instruments, but leaders should not let OKRs or dashboards replace judgment."
+ }
+ ],
+ "intent": "Find the concept note about using metrics as instruments without becoming metric-driven.",
+ "collections": ["concepts"],
+ "limit": 10
+}
+```
+
+Query types:
+
+- `lex` — BM25 keyword search. Best for exact terms, names, titles, and code.
+- `vec` — vector semantic search. Best for natural-language concepts.
+- `hyde` — vector search using a hypothetical answer/document passage.
+
+## Query craft
+
+Good QMD searches mix three things:
+
+1. **Title/alias anchors:** exact page titles, named entities, phrases.
+2. **Semantic paraphrase:** how a human would describe the idea.
+3. **Negative space:** enough intent to avoid nearby-but-wrong concepts.
+
+Examples:
+
+```bash
+# Exact-ish title lookup
+qmd search '"arm the rebels" merchants tools big companies' -c concepts
+
+# Semantic concept lookup
+qmd query $'intent: Find the customer proximity concept, not generic customer delight.\nlex: support pseudonymous merchant customer interviews\nvec: founder stays close to merchant reality through support and product use'
+
+# Source lookup
+qmd search "six-week cadence WhatsApp merchant relationships Shawn Ryan" -c sources -n 10
+```
+
+## Setup and maintenance
+
+Only mutate indexes when the user asked for setup or maintenance. Searching and
+retrieving are safe; collection/index mutation is not a casual first step.
```bash
npm install -g @tobilu/qmd
qmd collection add ~/notes --name notes
+qmd update
qmd embed
```
+
+Health and diagnostics:
+
+```bash
+qmd doctor
+qmd status
+qmd pull
+```
+
+`qmd doctor` checks config, model cache, device/GPU setup, vector fingerprints,
+and common environment overrides. If a model-backed command fails, run
+`qmd doctor` before changing configuration.
+
+## MCP setup
+
+See `references/mcp-setup.md` for Claude Code, Claude Desktop, OpenClaw, and HTTP
+server configuration.
+
+## Pitfalls
+
+- **Do not stop at snippets.** Fetch documents before making claims.
+- **Do not slice files with `sed`/`head`/`tail`.** Use the `path:from:count`
+ suffix (e.g. `qmd get "#abc123:120:40"`) or `--from`/`-l`. Output is already
+ line-numbered; piping breaks docid resolution, the header, and virtual paths.
+- **Do not lean on query expansion.** Write `intent:`/`lex:`/`vec:`/`hyde:`
+ yourself. A bare `qmd query "user sentence"` discards the context only you
+ have. You expand the query; the model just ranks.
+- **Do not overuse semantic search.** If you know exact titles or terms, BM25 is
+ faster and often better.
+- **Do not mutate indexes casually.** `qmd collection add`, `qmd update`, and
+ `qmd embed` change local state and can be expensive.
+- **Model-backed commands can be environment-sensitive.** If `qmd query`,
+ `qmd vsearch`, or reranking fails because local models/GPU are unavailable,
+ use `qmd search` and stronger lexical/structured terms.
+- **Ambiguous user wording needs intent.** Add `intent:` rather than hoping query
+ expansion guesses the right domain.
+- **Collection names matter.** Search `concepts` for synthesized wiki pages,
+ `sources` for transcripts/raw source pages, and docs collections for code or
+ project documentation.