Commit History

Author SHA1 Message Date
  Tobias Lütke 55c951b15e Merge pull request #349 from byheaven/fix/qwen3-embedding-model-filename-case 2 months ago
  Tobi Lutke 55f16460d0 fix(ci): guard LLM calls in CI and increase test timeouts 2 months ago
  Tobi Lutke c68904fe08 refactor: move CLI and MCP to subdirectories, MCP consumes SDK 2 months ago
  YuBai 740b17b485 docs: fix Qwen3-Embedding GGUF filename case in README and llm.ts 2 months ago
  Tobi Lutke ad38c1f698 feat: add intent parameter for query disambiguation 2 months ago
  Tobi Lutke e3549dab1a perf(rerank): cap parallelism, deduplicate chunks, cache by content 2 months ago
  Tobias Lütke 7904ab9a9d Merge pull request #273 from daocoding/feature/configurable-embed-model 2 months ago
  Tobias Lütke ee08997f23 Merge pull request #313 from 0xble/fix/expand-context-size-config 2 months ago
  Brian Le 0dec1df047 fix(llm): make expansion context size configurable 2 months ago
  Gilad S. 3095041e0f feat: use `build: "autoAttempt"` on `getLlama` 2 months ago
  Big (daocoding) b71649b12d feat: add QMD_EMBED_MODEL env var for multilingual embedding support 2 months ago
  Tobi Lütke 5233e676d9 fix(rerank): truncate documents exceeding 2048-token context size 3 months ago
  Tobias Lütke 67e2aab18c Merge pull request #206 from tobi/liquidai-query-expansion 3 months ago
  Tobi Lütke 57f7caa93b feat: add LiquidAI LFM2 support for query expansion 3 months ago
  Tobi Lutke 09803a75b7 feat: compile to JS for npm, release system, full changelog 3 months ago
  Tobi Lütke 392934e78a perf: CPU parallelism via multi-context thread splitting 3 months ago
  Tobi Lütke 0a941c442f perf: flash attention, right-sized contexts, cleaner GPU detection 3 months ago
  Tobi Lütke 4ac95b5e26 perf: adaptive parallel contexts for embed + rerank, fix VRAM waste 3 months ago
  Tobi Lütke 0a0e1e6f29 perf: parallel reranking with multiple contexts (2.7x speedup) 3 months ago
  Tobi Lütke ee86bba45e feat: auto-detect GPU acceleration + device info in status 3 months ago
  Tobi Lütke 102ff861d3 fix: use Qwen3 recommended sampling params to prevent repetition loops 3 months ago
  Tobi Lütke 479b68bbf1 add qmd model pull and refresh logic 3 months ago
  Tobi Lutke 7de18ee066 Merge main into finetune 3 months ago
  Tobi Lutke 785620467a refactor: reorder output format to put hyde line first 3 months ago
  Tobi Lütke 32d313ad6b Add LLM session management for lifecycle safety 3 months ago
  Christopher Jones 6d9871d2f5 Fix DisposedError during slow batch embedding (#41) 3 months ago
  Tobias Lütke eb1b77c8cb Deploy fine-tuned GRPO model as default query expansion (#67) 3 months ago
  Tobi Lutke 8572c2fd94 Deploy fine-tuned GRPO model as default for query expansion 3 months ago
  Freeman Jiang bfb0eebc3e fix: use sequential embedding on CPU-only systems to avoid race condition (#54) 3 months ago
  Sergey Gavrilyuk bebee61bec Fix case sensitivity in Qwen3-1.7B model filename 4 months ago