13 community PRs merged. GPU initialization replaced with node-llama-cpp's
built-in autoAttempt — deleting ~220 lines of manual fallback code and
fixing GPU issues reported across 10+ PRs in one shot. Reranking is faster
through chunk deduplication and a parallelism cap that prevents VRAM
exhaustion.
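The reranking change described above pairs two ideas: score each distinct chunk only once, and limit how many scorer calls run at the same time. A minimal TypeScript sketch of that pattern follows. `rerankDeduped`, the `Scorer` callback, and `maxParallel` are hypothetical names chosen for illustration and are not QMD's actual API.

```typescript
// Hypothetical sketch: names and the scoring callback are illustrative, not QMD's API.
type Scorer = (text: string) => Promise<number>;

async function rerankDeduped(
  chunks: string[],
  score: Scorer,
  maxParallel: number,
): Promise<number[]> {
  // Score each distinct chunk text only once.
  const unique = Array.from(new Set(chunks));
  const scores = new Map<string, number>();
  let next = 0;

  // Each worker pulls the next unscored chunk until none remain,
  // so at most `maxParallel` scorer calls are in flight at once.
  async function worker(): Promise<void> {
    while (next < unique.length) {
      const text = unique[next++];
      scores.set(text, await score(text));
    }
  }

  const workerCount = Math.max(1, Math.min(maxParallel, unique.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Fan the scores back out to the original (possibly duplicated) order.
  return chunks.map((text) => scores.get(text)!);
}
```

The cap is what bounds memory use in a sketch like this: if each in-flight call holds its own GPU context, a smaller `maxParallel` trades throughput for a lower peak.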

build: "autoAttempt" instead of manual GPU backend detection. Automatically tries Metal/CUDA/Vulkan and falls back gracefully. #310 (thanks @giladgd, the node-llama-cpp author)
--explain: qmd query --explain exposes retrieval score traces: backend scores, per-list RRF contributions, top-rank bonus, reranker score, and final blended score. Works in JSON and CLI output. #242 (thanks @vyalamar)
ignore: ["Sessions/**", "*.tmp"] in collection config to exclude files from indexing. #304 (thanks @sebkouba)
QMD_EMBED_MODEL env var lets you swap in models like Qwen3-Embedding for non-English collections. #273 (thanks @daocoding)
QMD_EXPAND_CONTEXT_SIZE env var (default 2048); previously used the model's full 40960-token window, wasting VRAM. #313 (thanks @0xble)
candidateLimit exposed: -C / --candidate-limit flag and MCP parameter to tune how many candidates reach the reranker. #255 (thanks @pandysp)
qmd update is run. #312 (thanks @0xble)
🐘.md → 1f418.md) instead of crashing. #308 (thanks @debugerman)
[] for JSON, CSV header for CSV, etc.) instead of plain text "No results." #228 (thanks @amsminn)
sqlite-vec-windows-x64) and add sqlite-vec-linux-arm64. #225 (thanks @ilepn)

QMD now speaks in query documents: structured multi-line queries where every line is typed (lex:, vec:, hyde:), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit expand: and auto-expanded by the LLM). Lex now supports quoted phrases and negation ("C++ performance" -sports -athlete), making intent-aware disambiguation practical. The formal query grammar is documented in docs/SYNTAX.md.
The npm package now uses the standard #!/usr/bin/env node bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
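As a rough illustration of the query document format described above, the sketch below splits a document into typed lines and treats a single untyped line as an implicit expand:. The function and type names are hypothetical, and the sketch does not interpret quoted phrases or negation inside lex lines. The authoritative grammar is the one in docs/SYNTAX.md.

```typescript
// Hypothetical types and names for illustration; see docs/SYNTAX.md for the real grammar.
type QueryKind = "lex" | "vec" | "hyde";

interface SubQuery {
  kind: QueryKind;
  text: string;
}

function parseQueryDocument(doc: string): SubQuery[] | { expand: string } {
  const lines = doc
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const typed = /^(lex|vec|hyde):\s*(.*)$/;

  // A single untyped line is a plain query: treat it as an implicit expand.
  if (lines.length === 1 && !typed.test(lines[0])) {
    return { expand: lines[0].replace(/^expand:\s*/, "") };
  }

  // Otherwise every line must carry one of the three type prefixes.
  return lines.map((line) => {
    const match = typed.exec(line);
    if (!match) {
      throw new Error(`Untyped line in query document: ${line}`);
    }
    return { kind: match[1] as QueryKind, text: match[2] };
  });
}
```

Under these assumptions, a two-line document with a lex: line and a vec: line yields two sub-queries in document order, while a bare string such as `C++ performance` comes back as an expand request.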

lex:, vec:, hyde:). Plain queries remain the default (expand: implicit, but not written inside the document). First sub-query gets 2× fusion weight, so put your strongest signal first. Formal grammar in docs/SYNTAX.md.
"exact phrase" for verbatim matching; -term and -"phrase" for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g. performance -sports -athlete).
expand: shortcut: send a single plain query (or start the document with expand: on its only line) to auto-expand via the local LLM. Query documents themselves are limited to lex, vec, and hyde lines.
query tool (renamed from structured_search): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex.
/query endpoint (renamed from /search; /search kept as silent alias).
collections array filter: filter by multiple collections in a single query (collections: ["notes", "brain"]). Removed the single collection string param; array only.
include/exclude: includeByDefault: false hides a collection from all queries unless explicitly named via collections. CLI: qmd collection exclude <name> / qmd collection include <name>.
update-cmd: attach a shell command that runs before every qmd update (e.g. git stash && git pull --rebase --ff-only && git stash pop). CLI: qmd collection update-cmd <name> '<cmd>'.
qmd status tips: shows actionable tips when collections lack context descriptions or update commands.
qmd collection subcommands: show, update-cmd, include, exclude. Bare qmd collection now prints help.
#!/usr/bin/env node shebang on dist/qmd.js. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH.
search, vector_search, deep_search: all superseded by query.
qmd context check command.
Expanding query... (4.2s)).
qmd collection list shows [excluded] tag for collections with includeByDefault: false.
includeByDefault: excluded collections are skipped unless explicitly named.
-c flags to search across several collections at once (e.g. qmd search -c notes -c journals "query"). #191 (thanks @openclaw)
[] instead of no output when --json search finds no results.
--index so they don't produce malformed config entries.
XDG_CONFIG_HOME for collection config path instead of always using ~/.config. #190 (thanks @openclaw)
collection add command. #200 (thanks @vincentkoc)
qmd status now shows models with full HuggingFace links instead of static names in --help. Model info is derived from the actual configured URIs so it stays accurate if models change.

The npm package now ships compiled JavaScript instead of raw TypeScript, removing the tsx runtime dependency. A new /release skill automates the full release workflow with changelog validation and git hook enforcement.

dist/ via tsc so the npm package no longer requires tsx at runtime. The qmd shell wrapper now runs dist/qmd.js directly.
/release skill that manages the full release lifecycle: validates changelog, installs git hooks, previews release notes, and cuts the release. Auto-populates [Unreleased] from git history when empty.
scripts/extract-changelog.sh extracts cumulative notes for the full minor series (e.g. 1.0.0 through 1.0.5) for GitHub releases. Includes [Unreleased] content in previews.
scripts/release.sh renames [Unreleased] to a versioned heading and inserts a fresh empty [Unreleased] section automatically.
v* tag pushes unless package.json version matches the tag, a changelog entry exists, and CI passed on GitHub.

QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking through parallel GPU contexts. GPU auto-detection replaces the unreliable gpu: "auto" with explicit CUDA/Metal/Vulkan probing.

src/db.ts). bun:sqlite on Bun, better-sqlite3 on Node. The qmd wrapper auto-detects a suitable Node.js install via PATH, then falls back to mise, asdf, nvm, and Homebrew locations.
gpu: "auto". qmd status shows device info.
test/ directory with vitest for Node.js and bun test for Bun. New eval-bm25 and store.helpers.unit suites.
embedBatch calls; initialization lock now covers the full path.

First published release on npm as @tobilu/qmd. MCP HTTP transport with daemon mode cuts warm query latency from ~16s to ~10s by keeping models loaded between requests.

qmd mcp --http --daemon starts a background server; qmd mcp stop shuts it down. Models stay warm in VRAM between queries. #149 (thanks @igrigorik)
hybridQuery() and vectorSearchQuery() to store.ts so CLI and MCP share identical logic. Fixes a class of bugs where results differed between the two. #149 (thanks @igrigorik)
1/(1+|x|) instead of |x|/(1+|x|)), so strong matches scored lowest. Broke --min-score filtering and made the "strong signal" short-circuit dead code. #76 (thanks @dgilperez)

Fine-tuned query expansion model trained with GRPO replaces the stock Qwen3 0.6B. The training pipeline scores expansions on named entity preservation, format compliance, and diversity, producing noticeably better lexical variations and HyDE documents.
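For the score normalization fix credited to #76 above, the two formulas can be compared directly. The sketch below assumes a raw score x whose magnitude grows with match strength; that assumption and the function names are illustrative, not taken from QMD's source.

```typescript
// Illustrative only: assumes |x| grows as the match gets stronger.

// Inverted form: a larger |x| produces a smaller normalized score.
const invertedNormalize = (x: number): number => 1 / (1 + Math.abs(x));

// Corrected form: a larger |x| produces a normalized score closer to 1.
const correctedNormalize = (x: number): number =>
  Math.abs(x) / (1 + Math.abs(x));
```

With these definitions, a raw score of 10 maps to about 0.091 under the inverted form and about 0.909 under the corrected one. The two results always sum to 1, which is why a threshold such as --min-score would filter out exactly the strongest matches under the inverted form.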

/only:lex mode for single-type expansions, useful when you know which search backend will help.
withLLMSession() pattern: ensures cleanup even on failure, similar to database transactions.
collectionName column in vector search SQL (was still using old collectionId from before YAML migration). #61 (thanks @jdvmi00)
--index option to CLI argument parser (was documented but not wired up). #84 (thanks @Tritlo)

First community contributions. The project gained external contributors, surfacing bugs that only appear in diverse environments: Homebrew sqlite-vec paths, case-sensitive model filenames, and sqlite-vec JOIN incompatibilities.

realpathSync() replaces readlink -f subprocess spawn per file. On a 5000-file collection this eliminates 5000 shell spawns, ~15% faster. #8 (thanks @burke)
vsearch and query hanging: sqlite-vec's virtual table doesn't support the JOIN pattern used; rewrote to subquery. #23 (thanks @mbrendan)
BREW_PREFIX detection). #42 (thanks @komsit37)
src/qmd.ts path. #7 (thanks @burke)

Replaced Ollama HTTP API with node-llama-cpp for all LLM operations. Ollama adds convenience but also a running server dependency. node-llama-cpp loads GGUF models directly in-process, with zero external dependencies. Models auto-download from HuggingFace on first use.

Collections and contexts moved from SQLite tables to YAML at ~/.config/qmd/index.yml. SQLite was overkill for config: you can't share it, and it's opaque. YAML is human-readable and version-controllable. The migration was extensive (35+ commits) because every part of the system that touched collections or contexts had to be updated.

collections and path_contexts tables dropped from schema. Collections support an optional update: command (e.g., git pull) before re-index.
qmd collection add/list/remove/rename commands with --name and --mask glob pattern support.
qmd ls virtual file tree: list collections, files in a collection, or files under a path prefix.
qmd context add/list/check/rm with hierarchical context inheritance. A query to qmd://notes/2024/jan/ inherits context from notes/, notes/2024/, and notes/2024/jan/.
qmd context add / "text" for global context across all collections.
qmd context check audit command to find paths without context.
qmd:// virtual URI scheme for portable document references. qmd://notes/ideas.md works regardless of where the collection lives on disk. Works in get, multi-get, ls, and context commands.
#abc123 in search results, usable with get and multi-get.
--line-numbers flag for get command output.

MCP server for AI agent integration. Without it, agents had to shell out to qmd search and parse CLI output. The monolithic qmd.ts (1840 lines) was split into focused modules with the project's first test suite (215 tests).

mimeType, added isError: true to errors, structuredContent for machine-readable results, proper URI encoding.
qmd_search → search) since MCP already namespaces by server.
store.ts (1221 LOC), llm.ts (539 LOC), formatter.ts (359 LOC), mcp.ts (503 LOC) from monolithic qmd.ts.

Document chunking for vector search. A 5000-word document about many topics gets a single embedding that averages everything together, matching poorly for specific queries. Chunking produces one embedding per ~900-token section with focused semantic signal.
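The chunking idea above can be sketched in a few lines. This is a simplified illustration, not QMD's chunker: it treats whitespace-separated words as a stand-in for model tokens, and `chunkByTokens` is a hypothetical name.

```typescript
// Hypothetical sketch: words approximate tokens; a real chunker would use the model's tokenizer.
function chunkByTokens(text: string, maxTokens = 900): string[] {
  if (maxTokens < 1) {
    throw new RangeError("maxTokens must be at least 1");
  }
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  // Walk the word list in fixed-size windows; each window becomes one chunk to embed.
  for (let start = 0; start < words.length; start += maxTokens) {
    chunks.push(words.slice(start, start + maxTokens).join(" "));
  }
  return chunks;
}
```

Under this approximation, a 5000-word document yields six chunks (five of 900 words and one of 500), each of which would get its own embedding instead of one averaged vector for the whole file.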

--all flag returns all matches (use with --min-score to filter).
embed command.
--json, --csv, --files, --md, --xml output format flags. --json for programmatic access, --files for piping, --md/--xml for LLM consumption, --csv for spreadsheets.
qmd status shows index health: document count, size, embedding coverage, time since last update.

Initial implementation. Built in a single day for searching personal markdown notes, journals, and meeting transcripts.

qmd add, qmd embed, qmd search, qmd vsearch, qmd query, qmd get. ~1800 lines of TypeScript in a single qmd.ts file.