5 months ago · c8f72de12e
--- a/README.md
+++ b/README.md
@@ -112,7 +112,7 @@ Although the tool works perfectly fine when you just tell your agent to use it o
                          ▼                             ▼
                 ┌────────────────┐            ┌────────────────┐
                 │ Query Expansion│            │  Original Query│
-               │   (Qwen3-0.6B) │            │   (×2 weight)  │
+               │   (Qwen3-1.7B) │            │   (×2 weight)  │
                 └───────┬────────┘            └───────┬────────┘
                         │                             │
                         │ 2 alternative queries       │
@@ -213,7 +213,7 @@ QMD uses three local GGUF models (auto-downloaded on first use):
 |-------|---------|------|
 | `embeddinggemma-300M-Q8_0` | Vector embeddings | ~300MB |
 | `qwen3-reranker-0.6b-q8_0` | Re-ranking | ~640MB |
-| `Qwen3-0.6B-Q8_0` | Query expansion | ~640MB |
+| `Qwen3-1.7B-Q8_0` | Query expansion | ~2.2GB |
 
 Models are downloaded from HuggingFace and cached in `~/.cache/qmd/models/`.
 
@@ -515,7 +515,7 @@ Models are configured in `src/llm.ts` as HuggingFace URIs:
 ```typescript
 const DEFAULT_EMBED_MODEL = "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf";
 const DEFAULT_RERANK_MODEL = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf";
-const DEFAULT_GENERATE_MODEL = "hf:ggml-org/Qwen3-0.6B-GGUF/Qwen3-0.6B-Q8_0.gguf";
+const DEFAULT_GENERATE_MODEL = "hf:ggml-org/Qwen3-1.7B-GGUF/Qwen3-1.7B-Q8_0.gguf";
 ```
 
 ### EmbeddingGemma Prompt Format