Tobi Lütke
|
0a941c442f
perf: flash attention, right-sized contexts, cleaner GPU detection
|
3 months ago |
Tobi Lütke
|
4ac95b5e26
perf: adaptive parallel contexts for embed + rerank, fix VRAM waste
|
3 months ago |
Tobi Lütke
|
0a0e1e6f29
perf: parallel reranking with multiple contexts (2.7x speedup)
|
3 months ago |
Tobi Lütke
|
ee86bba45e
feat: auto-detect GPU acceleration + device info in status
|
3 months ago |
Tobi Lütke
|
102ff861d3
fix: use Qwen3 recommended sampling params to prevent repetition loops
|
3 months ago |
Tobi Lütke
|
479b68bbf1
add qmd model pull and refresh logic
|
3 months ago |
Tobi Lutke
|
7de18ee066
Merge main into finetune
|
3 months ago |
Tobi Lutke
|
785620467a
refactor: reorder output format to put hyde line first
|
3 months ago |
Tobi Lütke
|
32d313ad6b
Add LLM session management for lifecycle safety
|
3 months ago |
Christopher Jones
|
6d9871d2f5
Fix DisposedError during slow batch embedding (#41)
|
3 months ago |
Tobias Lütke
|
eb1b77c8cb
Deploy fine-tuned GRPO model as default query expansion (#67)
|
3 months ago |
Tobi Lutke
|
8572c2fd94
Deploy fine-tuned GRPO model as default for query expansion
|
3 months ago |
Freeman Jiang
|
bfb0eebc3e
fix: use sequential embedding on CPU-only systems to avoid race condition (#54)
|
3 months ago |
Sergey Gavrilyuk
|
bebee61bec
Fix case sensitivity in Qwen3-1.7B model filename
|
4 months ago |
Tobi Lutke
|
4d21c5ab2b
Fix collection filter SQL and support non-ASCII filenames
|
4 months ago |
Tobi Lutke
|
0dfd7a4686
Fix query hang, SQL errors, and missing docid in search results
|
4 months ago |
Tobi Lutke
|
c85889df12
fixes
|
5 months ago |
Tobi Lutke
|
c9ac3c1463
Use default createContext() options for better VRAM management
|
5 months ago |
Tobi Lutke
|
f39db3a593
Fix VRAM and sequence exhaustion issues in generation
|
5 months ago |
Tobi Lutke
|
10c5ec016f
Simplify disposal: let llama cascade to children, remove test dispose calls
|
5 months ago |
Tobi Lutke
|
e8f4dce0b7
Fix Metal backend crash by properly disposing llama resources
|
5 months ago |
Tobi Lutke
|
4131c827de
Make LlamaCpp dispose idempotent and avoid Metal backend crash
|
5 months ago |
Tobi Lutke
|
25f8d185f4
Add lazy model loading with 2-minute inactivity auto-unload
|
5 months ago |
Tobi Lutke
|
4385a6a8f6
Fix HyDE prompt to generate actual content instead of meta-description
|
5 months ago |
Tobi Lutke
|
d383b5c226
Migrate to node-llama-cpp and add structured query expansion
|
5 months ago |
Tobi Lutke
|
529e989d83
Refactor: Move TypeScript source files to src/ directory
|
5 months ago |