Internal fork of @tobilu/qmd — local hybrid search (BM25 + vector). Upstream: github.com/tobi/qmd. Vendored into /srv/vendor/qmd, consumed by Oivo CLI via file: dep as @oivo/qmd.
A CLI tool for searching markdown knowledge bases using hybrid retrieval: combining BM25 full-text search, vector semantic search, and LLM re-ranking.
┌─────────────────────────────────────────────────────────────────────────────┐
│ QMD Search Pipeline │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────┐
│ User Query │
└────────┬────────┘
│
┌──────────────┴──────────────┐
▼ ▼
┌────────────────┐ ┌────────────────┐
│ Query Expansion│ │ Direct Query │
│ (qwen3:0.6b) │ │ (×2 weight) │
└───────┬────────┘ └───────┬────────┘
│ │
│ 1 alternative query │
└──────────────┬──────────────┘
│
┌─────────────────┼─────────────────┐
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ FTS Search │ │ FTS Search │ │ FTS Search │
│ (BM25) │ │ (BM25) │ │ (BM25) │
└───────┬────────┘ └───────┬────────┘ └───────┬────────┘
│ │ │
┌───────┴────────┐ ┌───────┴────────┐ ┌───────┴────────┐
│ Vector Search │ │ Vector Search │ │ Vector Search │
│(embeddinggemma)│ │(embeddinggemma)│ │(embeddinggemma)│
└───────┬────────┘ └───────┬────────┘ └───────┬────────┘
│ │ │
└──────────────────┼──────────────────┘
│
▼
┌───────────────────────┐
│ RRF Fusion + Bonus │
│ (Top-rank preserved) │
│ Top 30 Kept │
└───────────┬───────────┘
│
▼
┌───────────────────────┐
│ LLM Re-ranking │
│ (qwen3-reranker) │
│ Yes/No + logprobs │
└───────────┬───────────┘
│
▼
┌───────────────────────┐
│ Position-Aware Blend │
│ (RRF + Reranker) │
└───────────────────────┘
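The re-ranking stage in the diagram reads a single yes/no token together with its log-probability. This README does not show how qmd.ts turns that into a score, so the following is only one plausible mapping, sketched under assumptions: the `TokenLogprob` shape and the function name are hypothetical, and treating any other token as irrelevant is an illustrative choice.

```typescript
// Hypothetical shape for the one generated token and its log-probability.
type TokenLogprob = { token: string; logprob: number };

// Convert a yes/no token plus its log-probability into a 0.0-1.0 relevance score.
function relevanceFromLogprob({ token, logprob }: TokenLogprob): number {
  const p = Math.exp(logprob); // probability of the emitted token
  const answer = token.trim().toLowerCase();
  if (answer === "yes") return p;     // confident "yes" lands near 1
  if (answer === "no") return 1 - p;  // confident "no" lands near 0
  return 0; // assumption: any other token counts as not relevant
}

console.log(relevanceFromLogprob({ token: "yes", logprob: Math.log(0.9) }));
console.log(relevanceFromLogprob({ token: "no", logprob: Math.log(0.9) }));
```

Mapping both answers onto one scale keeps a hesitant "no" above a confident "no", which gives the later blending step more to work with than a binary label would.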
| Backend | Raw Score | Conversion | Range |
|---|---|---|---|
| FTS (BM25) | SQLite FTS5 BM25 | Math.abs(score) | 0 to ~25+ |
| Vector | Cosine distance | 1 / (1 + distance) | 0.0 to 1.0 |
| Reranker | LLM 0-10 rating | score / 10 | 0.0 to 1.0 |
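The conversions in the table can be sketched as three small functions. The function names are hypothetical and not taken from qmd.ts; only the formulas come from the table. SQLite's FTS5 `bm25()` returns negative values for matches (more negative means a better match), which is why the magnitude is taken.

```typescript
// Hypothetical helper names; only the formulas come from the table above.

/** FTS5 bm25() yields negative numbers (more negative = better), so take the magnitude. */
function normalizeBm25(raw: number): number {
  return Math.abs(raw);
}

/** Map a cosine distance (0 = identical) to a similarity where higher is better. */
function normalizeVector(distance: number): number {
  return 1 / (1 + distance);
}

/** Scale a 0-10 reranker rating into the 0.0-1.0 range. */
function normalizeRerank(rating: number): number {
  return rating / 10;
}

console.log(normalizeBm25(-7.5), normalizeVector(0.25), normalizeRerank(8));
```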
The query command uses Reciprocal Rank Fusion (RRF) with position-aware blending:
score = Σ(1/(k+rank+1)) where k=60
Why this approach: Pure RRF can dilute exact matches when expanded queries don't match. The top-rank bonus preserves documents that score #1 for the original query. Position-aware blending prevents the reranker from destroying high-confidence retrieval results.
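A minimal sketch of the fusion step, assuming each retrieval pass returns document ids ordered best-first. The formula, k=60, and the ×2 weight on the original query come from this README; the type and function names are hypothetical, and the bonus size (0.05) is an illustrative assumption, not a value taken from qmd.ts.

```typescript
// Hypothetical types and names; ids are ordered best-first.
type RankedList = { weight: number; ids: string[] };

const RRF_K = 60; // k from the formula above

// Sum weight / (k + rank + 1) over every list a document appears in (rank is 0-based).
function rrfFuse(lists: RankedList[], k: number = RRF_K): Map<string, number> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  }
  return scores;
}

// Assumed bonus size: rewards the document ranked #1 for the original query.
function applyTopRankBonus(scores: Map<string, number>, originalIds: string[], bonus = 0.05): void {
  const top = originalIds[0];
  if (top !== undefined) scores.set(top, (scores.get(top) ?? 0) + bonus);
}

const original = ["a.md", "b.md", "c.md"];
const expanded = ["b.md", "c.md", "d.md"];
const fused = rrfFuse([
  { weight: 2, ids: original }, // original query counted with double weight
  { weight: 1, ids: expanded }, // LLM-generated alternative query
]);
applyTopRankBonus(fused, original);

const ranked = Array.from(fused.entries())
  .sort((x, y) => y[1] - x[1])
  .map(([id]) => id);
console.log(ranked);
```

In this example `a.md` is the top hit for the original query but is missing from the expanded query's results. Without the bonus, plain RRF would place it below `b.md` and `c.md`, which is the dilution effect described above.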
| Score | Meaning |
|---|---|
| 0.8 - 1.0 | Highly relevant |
| 0.5 - 0.8 | Moderately relevant |
| 0.2 - 0.5 | Somewhat relevant |
| 0.0 - 0.2 | Low relevance |
macOS: Homebrew SQLite (for extension support)
brew install sqlite
Ollama running locally (default: http://localhost:11434)
QMD uses three models (auto-pulled if missing):
| Model | Purpose | Size |
|---|---|---|
| embeddinggemma | Vector embeddings | ~1.6GB |
| ExpedientFalcon/qwen3-reranker:0.6b-q8_0 | Re-ranking (trained) | ~640MB |
| qwen3:0.6b | Query expansion | ~400MB |
# Pre-pull models (optional)
ollama pull embeddinggemma
ollama pull ExpedientFalcon/qwen3-reranker:0.6b-q8_0
ollama pull qwen3:0.6b
bun install
# Index all .md files in current directory
qmd index
# Index with custom glob pattern
qmd index "**/*.md"
# Index specific directory
qmd index "docs/**/*.md"
# Embed all indexed documents
qmd embed
┌──────────────────────────────────────────────────────────────────┐
│ Search Modes │
├──────────┬───────────────────────────────────────────────────────┤
│ search │ BM25 full-text search only │
│ vsearch │ Vector semantic search only │
│ query │ Hybrid: FTS + Vector + Query Expansion + Re-ranking │
└──────────┴───────────────────────────────────────────────────────┘
# Full-text search (fast, keyword-based)
qmd search "authentication flow"
# Vector search (semantic similarity)
qmd vsearch "how to login"
# Hybrid search with re-ranking (best quality)
qmd query "user authentication"
-n <num> # Number of results (default: 5)
--min-score <num> # Minimum score threshold (default: 0)
--full # Show full document content
-csv # CSV output (for piping/scripting)
-md # Output as markdown
-xml # Output as XML
--index <name> # Use named index
Default output is colorized CLI format (respects NO_COLOR env):
93% docs/guide.md:42
│ This section covers the **craftsmanship** of building
│ quality software with attention to detail.
│ See also: engineering principles
67% notes/meeting.md:15
│ Discussion about code quality and craftsmanship
│ in the development process.
# Get 10 results with minimum score 0.3
qmd query -n 10 --min-score 0.3 "API design patterns"
# Output as markdown for LLM context
qmd search -md --full "error handling"
# Use separate index for different knowledge base
qmd --index work search "quarterly reports"
# List all indexed collections
qmd list
# Show database statistics
qmd stats
# Forget a collection
qmd forget
Index stored in: ~/.cache/qmd/index.sqlite
collections -- Indexed directories and glob patterns
documents -- Markdown content with metadata
documents_fts -- FTS5 full-text index
content_vectors -- Embedding cache (by content hash)
vectors_vec -- sqlite-vec vector index
| Variable | Default | Description |
|---|---|---|
| OLLAMA_URL | http://localhost:11434 | Ollama API endpoint |
| XDG_CACHE_HOME | ~/.cache | Cache directory location |
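As a rough illustration of how these variables could be resolved, the sketch below takes the environment as a plain object so it stays testable. The function names are hypothetical, and the `qmd/index.sqlite` suffix under the cache directory is inferred from the default index path given earlier rather than read from qmd.ts.

```typescript
// Hypothetical helpers; env is passed in explicitly instead of reading process.env.
type Env = Record<string, string | undefined>;

// Default index location: <cache dir>/qmd/index.sqlite (inferred from the README's default path).
function resolveIndexPath(env: Env, home: string): string {
  const cacheDir = env["XDG_CACHE_HOME"] || `${home}/.cache`;
  return `${cacheDir}/qmd/index.sqlite`;
}

// Ollama endpoint, falling back to the documented default.
function resolveOllamaUrl(env: Env): string {
  return env["OLLAMA_URL"] || "http://localhost:11434";
}

console.log(resolveIndexPath({}, "/home/user"));
console.log(resolveOllamaUrl({ OLLAMA_URL: "http://gpu-box:11434" }));
```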
Markdown Files ──► Parse Title ──► Hash Content ──► Store in SQLite
│ │
└─► FTS5 Index ◄─────────────────────┘
Document ──► Format for EmbeddingGemma ──► Ollama API ──► Store Vector
"title: X | text: Y" /api/embed
Query ──► Expand (3 variations) ──► FTS + Vector (per variation)
│
▼
Merge (max score)
│
▼
Top 25 candidates
│
▼
LLM Re-rank (0-10)
│
▼
Final ranked results
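The merge step in this flow can be sketched as follows: keep the best score each document achieved across all query variations, then cut to the candidate limit before re-ranking. The `Hit` type and function name are hypothetical, and the sketch assumes the scores from the different backends have already been converted to a comparable scale.

```typescript
// Hypothetical result shape; scores are assumed to be on a comparable scale.
type Hit = { docId: string; score: number };

// Keep the highest score per document across all result sets, then take the top `limit`.
function mergeByMaxScore(resultSets: Hit[][], limit = 25): Hit[] {
  const best = new Map<string, number>();
  for (const hits of resultSets) {
    for (const { docId, score } of hits) {
      const prev = best.get(docId);
      if (prev === undefined || score > prev) best.set(docId, score);
    }
  }
  return Array.from(best, ([docId, score]) => ({ docId, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const candidates = mergeByMaxScore(
  [
    [{ docId: "a.md", score: 0.9 }, { docId: "b.md", score: 0.4 }],
    [{ docId: "b.md", score: 0.7 }, { docId: "c.md", score: 0.2 }],
  ],
  2,
);
console.log(candidates.map((h) => h.docId));
```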
Models are configured as constants in qmd.ts:
const DEFAULT_EMBED_MODEL = "embeddinggemma";
const DEFAULT_RERANK_MODEL = "ExpedientFalcon/qwen3-reranker:0.6b-q8_0";
const DEFAULT_QUERY_MODEL = "qwen3:0.6b";
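A minimal sketch of how the embedding constant and the query and document prompt templates in this section could be combined into a request body for Ollama's `/api/embed` endpoint, which accepts a `model` and an `input`. The function names are hypothetical and not taken from qmd.ts.

```typescript
const DEFAULT_EMBED_MODEL = "embeddinggemma"; // same value as the constant above

// Prompt template for search queries.
function formatQueryForEmbedding(query: string): string {
  return `task: search result | query: ${query}`;
}

// Prompt template for indexed documents.
function formatDocForEmbedding(title: string, content: string): string {
  return `title: ${title} | text: ${content}`;
}

// Build the JSON body for POST {OLLAMA_URL}/api/embed.
function buildEmbedRequest(input: string, model: string = DEFAULT_EMBED_MODEL): { model: string; input: string } {
  return { model, input };
}

const body = buildEmbedRequest(formatQueryForEmbedding("how to login"));
console.log(JSON.stringify(body));
```

Using different templates for queries and documents tells the embedding model which side of the retrieval pair it is encoding.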
// For queries
"task: search result | query: {query}"
// For documents
"title: {title} | text: {content}"
A dedicated reranker model trained on relevance classification:
System: Judge whether the Document meets the requirements based on the Query
and the Instruct provided. Note that the answer can only be "yes" or "no".
User: <Instruct>: Given a search query, determine if the document is relevant...
<Query>: {query}
<Document>: {doc}
logprobs: true - Extract token probabilities
num_predict: 1 - Only need the yes/no token
num_predict: 150 - For generating query variations
License: MIT