suby

suby pushed to oivo at suby/qmd

  • 4fe18a21bc feat(cli): qmd update/embed honor positional <collection> + --all flag (i-ofojj7dy) Both `qmd update` and `qmd embed` previously ignored the positional collection argument and processed every configured collection. On the Oivo fleet that's 15 collections — `qmd embed chat-archives` ran for 17m37s embedding all of them instead of the one the caller asked for, causing flow timeouts that we worked around earlier by bumping `whatsapp-archive-pipeline` Stage 1 from 5min to 30min (i-3crcsm7b commit a650abfdc). Changes: - `updateCollections(filter?)` filters `listCollections(db)` to a single entry when a positional name is supplied; unknown names exit non-zero with the available-collection list and remediation hints. - `generateEmbeddings({ collection })` + `getPendingEmbeddingDocs(db, c)` + `getHashesNeedingEmbedding(db, c)` accept an optional collection filter applied at the SQL layer (BEFORE the GROUP BY so we only emit hashes whose documents include that collection). Content-hash dedup across collections is preserved. - CLI threads the positional name + new `--all` boolean through `vectorIndex`. `--all` is an explicit alias for full-fleet; combining it with a positional name errors out. `--force` is fleet-wide and is refused alongside a positional name to avoid silently clearing every other collection's vectors (per-collection force-clear is out of scope). - Help text documents both forms for `update` and `embed`. Tests: - New `test/embed-collection-filter.test.ts` (9 cases, stub provider, no llama.cpp) covers `getHashesNeedingEmbedding`/`getPendingEmbeddingDocs`/ `generateEmbeddings` filter behavior + cross-collection isolation. - `test/cli.test.ts` gains 9 cases for the CLI surface: single-collection filter, unknown-collection error, `--all`, conflict on `--all + name`, `--force + name` conflict, etc. dist/ rebuilt per qmd CLAUDE.md mandate (fleet bundler consumes dist/ as-is). Resolves: i-ofojj7dy Session-Id: 50498c8e
  • f107c8a0aa perf(embed): 4-way concurrent dispatch + larger batches + per-batch txn (i-fkpnar9i) Phase 1 quick-win stack from i-fkpnar9i baseline analysis. Targets the 6-31 min runtime observed on small embed batches by removing the three client-side bottlenecks identified in the Phase 0 audit (/srv/.oivo/audits/i-fkpnar9i-qmd-embed-runtime/baseline.md): #1 Concurrent dispatcher in OpenAIEmbeddingsProvider.embedBatch - Replace the sequential for-of-chunks await loop with an N-worker pull-from-queue dispatcher (default N=4 = qmd-embed-worker semaphore). - Workers race on a shared `nextChunkIdx` cursor; each writes its embeddings into a pre-computed result-slot index so input order is preserved end-to-end without a final re-sort. - Circuit-breaker shouldFailFast is checked per-pull. If the breaker trips mid-run, a CircuitOpenError is captured and thrown AFTER all in-flight workers settle (no half-completed leaks). - Abort signal is checked per-pull; in-flight workers complete, no new dispatches. - Dimension-on-first-success race is benign — workers all observe the same length. - lastError uses last-write-wins (matches legacy semantics under concurrency). Cleared on fully-successful sweep. - Configurable: `concurrency` config opt + `QMD_EMBED_CONCURRENCY` env. Set to 1 to revert to legacy sequential dispatch. #2 Bump generateEmbeddings inner BATCH_SIZE 32 → 256 - The qmd CLI loop in generateEmbeddings calls embedMany with `BATCH_SIZE` chunks at a time. With BATCH_SIZE=32 and openai batchSize=64, embedBatch only ever saw 1 sub-chunk — so #1's concurrency dispatcher had nothing to parallelize. - 256 chunks per slice splits into 4 sub-chunks of 64 (worker MAX_BATCH=64), letting the dispatcher saturate the worker semaphore. - Configurable: `QMD_EMBED_INNER_BATCH_SIZE` env override. #3 Wrap insertEmbedding loop in db.transaction (BEGIN IMMEDIATE / COMMIT) - Every successful embedMany resolves with N embeddings. 
Previously each insertEmbedding call auto-committed (1 WAL fsync per row × 256 rows per batch = lots of fsync). Now wrapped in better-sqlite3's db.transaction(fn) so 256 inserts share one BEGIN/COMMIT. - Fallback per-chunk path (after batch embed throws) is intentionally NOT wrapped — keeps individual auto-commits so one bad chunk doesn't drag down others. - Database interface in src/db.ts extended with the `transaction` method to keep the call-site cast-free. Estimated impact (from baseline.md): 9544-chunk run: 31 min → ~9 min (~3.5x speedup) Effective inputs/sec: 4.5 → ~18 (4-way concurrent worker saturation) Tests added (10 new in test/embedding-openai.test.ts under "OpenAIEmbeddingsProvider — concurrent dispatch (i-fkpnar9i)" describe): - default concurrency=4 with N>workers → max 4 in flight - explicit concurrency=2 → only 2 in flight - concurrency=1 → legacy sequential (regression baseline) - result order preserved when last chunk resolves first - dimensions recorded correctly when chunk-1 resolves before chunk-0 - abort signal stops new dispatches; in-flight settle - constructor rejects concurrency < 1 - circuit-open mid-run throws after in-flight settle Verification (PHASE-1 IS Phase 1; concurrency stack is fully unit-tested): - npm run build clean - 256/256 tests pass across all touched files (embedding-openai, embedding-factory, embedding-store-integration, embedding-vsearch, lock-contention, mcp, store.helpers.unit, store-paths) - Pre-existing 4 LlamaCpp-Integration timeouts in store.test.ts are environmental (need local node-llama-cpp for expandQuery/rerank paths I do NOT touch); independently confirmed they hit the 30s vitest timeout regardless of these changes. Resolves: i-fkpnar9i (Phase 1) Session-Id: df47f0a2
  • ac0c96b8b9 fix(qmd): bump SQLite busy_timeout to 30s + add MCP RSS supervisor (i-6sw24v09)

    Per-session qmd MCP processes were timing out on `qmd_query`/`qmd_status` calls when a `qmd-cron embed` job was holding writer locks. Empirical runs on the Oivo fleet show `qmd embed` taking 6-31 minutes per cycle on a 30-min schedule, while the better-sqlite3 default `busy_timeout` was only 5s. Concurrent reader queries from the ~14 sister MCP processes hit SQLITE_BUSY before the embed completed, surfacing as MCP transport timeouts (~30s). `qmd_get` was unaffected because document-body retrieval bypasses the FTS5/vec contention path.

    Phase 2 mitigation: `applyConcurrencyPragmas` in `src/store.ts` sets WAL-friendly defaults (each overridable via env):

      busy_timeout       = 30000    (was 5000, the better-sqlite3 default)
      synchronous        = NORMAL   (was FULL; safe in WAL, faster commits)
      temp_store         = MEMORY   (was DEFAULT; keep FTS5 sort scratch in RAM)
      cache_size         = -65536   (~64 MiB; was -2000 / 2 MiB)
      mmap_size          = 256 MiB  (was 0)
      wal_autocheckpoint = 1000     (explicit; was driver default)

    Phase 3 defense: `startRssSupervisor` in `src/mcp/server.ts` poll-checks this MCP process's RSS and `process.exit(1)`s when it crosses QMD_MCP_RSS_LIMIT_BYTES, letting the parent respawn a fresh handle. Default OFF (env=0); opt in by setting e.g. 2147483648 (2 GiB). This contains the blast radius of any future memory leak in the search / expansion path without re-architecting the per-session MCP model.

    Tests in `test/lock-contention.test.ts` (15 cases): pragma defaults, env-override behavior, RSS supervisor lifecycle (triggers/no-trigger/log-shape/exception-resilience), createStore integration. The dynamic writer-collision test was intentionally omitted: better-sqlite3 is synchronous and single-threaded, so intra-process busy_timeout deadlocks on its own JS timer; production behavior across separate MCP OS processes is delegated to SQLite-the-library.

    Forensic snapshots stashed at /srv/tmp/oivo-task-outputs/qmd-{runaway,cron-journal,sqlite-probe}-* document the empirical trace. The 5.4 GB "runaway" PID was VSZ misread; actual RSS was ~177 MB idle in do_epoll_wait. Index integrity_check returned `ok` in 42.8s; the DB was fine, just slow under contention.

    Out of scope per issue: replacing qmd, multi-machine lock coord, re-architecting per-session MCP daemons.

    Resolves: i-6sw24v09
    Session-Id: df47f0a2
  • e041f19285 fix(embedding): retry + rich error context for first-chunk dimension probe (i-vm1lxwry)

    Previously, when `qmd embed chat-archives` could not get embedding dimensions from the first chunk (e.g. transient HTTP 500/503 from ai.mm.mk, malformed JSON, timeout), `store.ts` threw a cryptic:

      Failed to get embedding dimensions from first chunk

    with no information about provider, endpoint, or underlying cause.

    Changes:
    * `EmbeddingProvider` interface: new optional `getLastError(): string | undefined` (source-compatible; existing 3rd-party impls keep working).
    * `OpenAIEmbeddingsProvider`:
      - tracks `lastError` on every swallowed per-chunk failure
      - clears it on a fully-successful sweep
      - new `formatErrorContext()` produces "endpoint=… status=N body=…"
      - new `getEndpoint()` exposes the configured base URL
    * `LocalLlamaCppProvider`: same `lastError` tracking + clearing for `llm.embed` / `llm.embedBatch` failures and aborts.
    * `AutoFallbackEmbeddingProvider.getLastError()`: combines primary + fallback last errors (`primary: … | fallback: …` when both failed).
    * `store.ts` first-chunk dimension probe: SINGLE retry on null result after a 250ms backoff (transient embedding-service issues), then throws a rich error including provider kind, endpoint, status code, body preview, and a hint to set `QMD_EMBED_DEBUG=1`.

    Tests (3 new describes, 11 new tests, 93 passing total):
    * `embedding-openai.test.ts § getLastError (i-vm1lxwry)`
      - undefined before first call
      - captures HTTP status + endpoint
      - captures malformed-JSON message
      - cleared after successful sweep
      - getEndpoint() strips trailing slashes
    * `embedding-autofallback.test.ts § getLastError (i-vm1lxwry)`
      - undefined when both legs clean
      - returns primary error / fallback error / combined
    * `embedding-store-integration.test.ts § first-chunk dimension probe`
      - retry succeeds on second attempt
      - throws rich error after both attempts fail (provider=openai, endpoint, status, body preview)

    Resolves: i-vm1lxwry
    Co-Authored-By: Claude <noreply@anthropic.com>
    Session-Id: 435c1d69
  • View comparison for these 4 commits »
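
The N-worker pull-from-queue dispatch described in f107c8a0aa can be pictured roughly as below. This is a minimal sketch, not the code in `OpenAIEmbeddingsProvider.embedBatch`: `dispatchConcurrently` and `embedChunk` are hypothetical names, the circuit-breaker and abort checks are left out, and a failed chunk simply leaves its slots as `null`. It only illustrates the shared cursor and the pre-computed result slots that keep input order without a final re-sort.

```typescript
// Hypothetical stand-in for one HTTP call that embeds a sub-chunk of inputs.
type EmbedChunk = (chunk: string[]) => Promise<number[][]>;

async function dispatchConcurrently(
  inputs: string[],
  embedChunk: EmbedChunk,
  batchSize = 64,
  concurrency = 4,
): Promise<(number[] | null)[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError("concurrency must be an integer >= 1");
  }

  // Pre-compute every sub-chunk together with the result slot of its first item.
  const chunks: { start: number; items: string[] }[] = [];
  for (let start = 0; start < inputs.length; start += batchSize) {
    chunks.push({ start, items: inputs.slice(start, start + batchSize) });
  }

  const results: (number[] | null)[] = new Array(inputs.length).fill(null);
  let nextChunkIdx = 0; // shared cursor; each increment runs to completion on the event loop

  const worker = async (): Promise<void> => {
    while (nextChunkIdx < chunks.length) {
      const chunk = chunks[nextChunkIdx++];
      if (chunk === undefined) break;
      try {
        const vectors = await embedChunk(chunk.items);
        // Write into the pre-computed slots so completion order does not matter.
        vectors.forEach((v, i) => {
          results[chunk.start + i] = v;
        });
      } catch {
        // Assumption for this sketch: a failed chunk leaves nulls, other chunks continue.
      }
    }
  };

  const workerCount = Math.min(concurrency, chunks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `concurrency = 1` the loop degenerates to one chunk at a time, which matches the commit's description of the legacy sequential path.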

1 week ago

suby pushed to oivo at suby/qmd

  • 1c4424fe1b ci(qmd): extend dist-sync trigger to oivo branch (i-j1ld0121) The CI workflow's push/pull_request triggers were scoped to branches:[main] only, leaving the active fork branch oivo without protection. Sibling commit 29e11bd (i-9l2ueyyz) shipped the pre-commit hook + dist-sync job as twin safety nets, but only main was wired to the CI safety net. Reproduced live in the i-1rqixh6m / i-gfv5zq2z incident: a dist-without- shebang commit shipped to oivo (99bd369), got fixed in 818c209 and 6129b08, but neither push triggered CI dist-sync because the workflow trigger only fires on main. Fleet-bound JS could have shipped broken. Two-line change: add oivo to both push.branches and pull_request.branches arrays. Existing main behavior preserved (upstream untouched). No new secrets or runner config — same test-node / test-bun / dist-sync matrix. Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: d6a79169

3 weeks ago

suby pushed to oivo at suby/qmd

  • 668c4d06e0 fix(llm): silence Vulkan probe + cmake build attempt for remote-only deployments (i-c28wngnd) When QMD_EMBED_ENDPOINT is set the embed path runs over HTTP, so node-llama-cpp is only needed for rerank/expand. On hosts without libvulkan-dev/glslc (e.g. the qmd-cron on `code` LXC), the previous `gpu: "auto"` probe wastes ~30s/run on a doomed Vulkan compile attempt and floods journalctl with cmake noise. Two layers of defense added: 1. `LlamaCpp.ensureLlama()` now defaults to `gpu: false` (CPU-only, prebuilt binary path) when `QMD_EMBED_ENDPOINT` is set without an explicit `QMD_LLAMA_GPU` override. The prebuilt CPU binary is always shipped via @node-llama-cpp/linux-x64, so rerank/expand still work — just on CPU. Override with `QMD_LLAMA_GPU=auto` to opt back in to Vulkan probing for hybrid local-rerank + remote-embed setups. 2. New `QMD_DISABLE_LOCAL_LLM=1` env var hard-disables the local LLM path entirely: `ensureLlama()` throws with an actionable error pointing at `EmbeddingProvider`. Use for cron-only deployments where any `getLlama()` call indicates an unintended fallback. Also suppresses the "running on CPU (slow)" warning when CPU was requested explicitly or auto-selected — there's nothing the operator can do about it on a remote-only host and the hint isn't relevant. Two helper functions exported for testability: - `isLocalLlmDisabled(env)` — reads QMD_DISABLE_LOCAL_LLM truthy values - `resolveLlamaGpuMode(env)` — returns "cpu" | "auto" Defense-in-depth follow-up to i-1rqixh6m (commit 99bd369), which addressed the primary trigger by giving `chunkDocumentByTokens` a provider-aware tokenizer. With both fixes, no embed path can transitively warm up node-llama-cpp on remote-only deployments. Acceptance verified live on `code`: - `QMD_EMBED_ENDPOINT=http://models:8082 qmd embed` produces ZERO Detecting/CMake/Failed-to-build log lines (timeout 30s, run with localBuilds/ pre-cleared). 
- 13 new unit tests pass; 3 pre-existing failures in LlamaCpp Integration suite are environmental (LXC has no GPU device) and unrelated. Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 56ed6a73
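
The two helpers named in 668c4d06e0 could look something like this. It is a sketch under assumptions: the commit gives only the function names, the return type `"cpu" | "auto"`, and the defaulting rule, so the accepted spellings for truthy values and for an explicit CPU override are guesses and are marked as such.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: the commit does not list which values count as truthy.
const TRUTHY = new Set(["1", "true", "yes", "on"]);

function isLocalLlmDisabled(env: Env): boolean {
  const raw = env.QMD_DISABLE_LOCAL_LLM?.trim().toLowerCase();
  return raw !== undefined && TRUTHY.has(raw);
}

function resolveLlamaGpuMode(env: Env): "cpu" | "auto" {
  const explicit = env.QMD_LLAMA_GPU?.trim().toLowerCase();
  // An explicit override always wins, so hybrid setups can opt back in to probing.
  if (explicit === "auto") return "auto";
  // Assumption: these spellings for an explicit CPU request are illustrative only.
  if (explicit === "cpu" || explicit === "false" || explicit === "off") return "cpu";
  // No override: a remote embed endpoint means the local LLM only serves
  // rerank/expand, so skip the GPU probe and use the prebuilt CPU binary.
  return env.QMD_EMBED_ENDPOINT ? "cpu" : "auto";
}
```

Passing `env` in rather than reading `process.env` directly is what makes the helpers testable without mutating global state.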

3 weeks ago

suby pushed to oivo at suby/qmd

  • 6129b08057 chore(qmd): bump version 2.1.1-oivo.0 -> 2.2.0-oivo.0 Minor bump justified by additive API in i-1rqixh6m / i-loazq6ze: - New exported type: TokenCounter - chunkDocumentByTokens gains optional 8th param (tokenizer?) - searchVec/hybridQuery/structuredSearch gain optional embedProvider param - SDK SearchOptions/StoreOptions/VectorSearchOptions gain embedProvider field - New AutoFallbackEmbeddingProvider wrapper - Re-exports: createEmbeddingProvider, OpenAIEmbeddingsProvider, CircuitBreaker, AutoFallbackEmbeddingProvider, etc. All additive — backward-compat preserved (zero env-vars / zero options = identical pre-patch behavior, local llama-cpp path unchanged). Consumers: cli/package.json uses "file:../vendor/qmd" so no semver resolution actually runs at install time on the fleet. This bump is informational metadata for changelog tooling and any future caller that reads the field directly. Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 4c692810

3 weeks ago

suby pushed to oivo at suby/qmd

  • 818c209d88 fix(qmd): restore shebang line to dist/cli/qmd.js (i-gfv5zq2z) My prior commit 99bd369 used raw tsc directly instead of the canonical "build" script, which strips the shebang prepend step from the pipeline. The script schedule was: tsc compile then prepend "#!/usr/bin/env node" to dist/cli/qmd.js then chmod +x. Without the shebang, every qmd CLI invocation gets parsed by bash (syntax error on first ES-module import line) — breaking embed, vsearch, query, mcp commands fleet-wide. Working tree had the shebang restored (sibling/verifier ran the build); this commit just lands that one-line fix. Closes i-gfv5zq2z (filed about TRACE-DEBUG in dist/llm.js — that content was already absent from HEAD at file time; the real issue turned out to be this shebang regression, caught only because of dist/cli/qmd.js drift surfacing during follow-up review). Lesson: when editing src/*.ts in vendor/qmd, ALWAYS use the canonical build script not raw tsc — per /srv/vendor/qmd/CLAUDE.md "Dist/ commit hygiene" section (landed via i-9l2ueyyz). Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 4c692810

3 weeks ago

suby pushed to oivo at suby/qmd

  • 29e11bdc9b ci(hooks): pre-commit + CI gate for dist/ in sync with src/ (i-9l2ueyyz) Process hardening after the i-qkarfffa Stage-3 catch-up bug, where dist/embedding/* was compiled by the original session but never committed; only caught two days later when an unrelated session rebuilt dist/ and noticed. Changes: * scripts/pre-commit -- new. Detects staged src/*.ts; runs the package build script; aborts the commit if dist/ drifts (does NOT auto-stage -- silent commit-scope expansion was rejected in DoD #4). tsc errors also block. Skips the rebuild for docs-only commits. Bypass via standard `git commit --no-verify`. * scripts/install-hooks.sh -- now installs both pre-commit and pre-push (loop). package.json `prepare` script auto-runs this during install so contributors get the hook for free. * .github/workflows/ci.yml -- new dist-sync job installs deps, rebuilds dist/, then `git diff --exit-code -- dist/`. Fails the build with an actionable ::error:: line on drift. * CLAUDE.md -- new "Dist/ commit hygiene" section explains the workflow + the two safety nets, with the i-9l2ueyyz reference. Local hook tested in a scratch repo with 6 scenarios: A drift fails loudly B in-sync passes C docs-only fast-paths D idempotent E --no-verify bypasses F build-failure blocks Session-Id: a8d83ec3

3 weeks ago

suby pushed to oivo at suby/qmd

  • 99bd369cdc feat(embedding): provider-aware tokenizer for chunkDocumentByTokens (i-1rqixh6m) Finishes DoD #1 of i-08ovbvtb / i-qkarfffa: setting QMD_EMBED_ENDPOINT and running qmd embed must NOT load node-llama-cpp. Sibling commit 20e44c9 landed the chunker refactor (TokenCounter type + tokenizer param + chunkTokenizer wiring) but left the integration tests unchanged and the SDK index.ts uncompilable. This commit: 1. Extends test/embedding-store-integration.test.ts with a vi.mock on ../src/llm.js whose getDefaultLlamaCpp throws "DoD #1 violation". All 10 integration tests now pass in 822ms (vs 22s on cold cache before the chunker refactor) — mathematical proof that node-llama-cpp is never loaded on the provider path. Adds a new dedicated test "provider mode does not call getDefaultLlamaCpp (DoD #3)" that asserts the spy is never called. 2. Fixes src/index.ts TS2304 errors at lines 205, 226, 257 introduced by sibling commit 20e44c9 — added a local `import type { EmbeddingProvider }` so the type is in scope for the SDK option interfaces (export-only re-exports do not put names in the file's own scope). 3. Rebuilds dist/ to match current src — sibling 20e44c9 left dist stale. Verification: - bun vitest run test/embedding-store-integration.test.ts: 10/10 in 822ms - bun vitest run test/embedding-{autofallback,factory,openai,provider}.test.ts + integration: 111/111 in 869ms - tsc -p tsconfig.build.json: clean (0 errors) - workspace_typecheck cli: clean (0 errors after baseline filter) Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 4c692810
  • 20e44c90b5 feat(embedding): query-side EmbeddingProvider with auto-fallback (i-loazq6ze) Routes ALL query-side embedding through EmbeddingProvider — searchVec, hybridQuery, structuredSearch, vectorSearchQuery, SDK store.search, CLI qmd vsearch / qmd query, and MCP HTTP /query / stdio query tool. When no provider is configured (zero env-vars / flags), the legacy local llama-cpp path is preserved verbatim. Wraps an OpenAI provider in AutoFallbackEmbeddingProvider (i-pdjn2xx5) when QMD_EMBED_AUTO_FALLBACK is on so transient ai.mm.mk outages degrade to local instead of throwing. Adds test/embedding-vsearch.test.ts (10 cases): stub provider routing, fallback chain success, both-fail surfaces error (single-embed) and empty results (batch), backward-compat precomputed-embedding path. Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 40541822
  • View comparison for these 2 commits »

3 weeks ago

suby pushed to oivo at suby/qmd

  • 66cbadc06c fix(cli): add missing `red` field to color palette + ship dist (i-08ovbvtb) The `c.red` palette entry was referenced at `src/cli/qmd.ts:3199` (in the ModelMismatchError friendly-output branch added by the i-qkarfffa Stage-3 work) but never added to the palette object's literal definition at line 193. TypeScript flagged this on every typecheck since 0463dd5. Also commits the previously-untracked `dist/embedding/` directory. The i-qkarfffa Stage-3 commit added `src/embedding/{provider,openai,local, factory,autofallback,index}.ts` and `dist/store.js` imports `./embedding/provider.js` at runtime — but `dist/embedding/` itself was never committed, so consumers of `@oivo/qmd/dist/index.js` would have hit a `MODULE_NOT_FOUND` at import time. This commit ships the compiled artefacts alongside the i-08ovbvtb refactor of `dist/store.js`. Changes: * src/cli/qmd.ts: +1 line (`red: useColor ? "\x1b[31m" : ""`) * dist/cli/qmd.js, dist/index.{d.ts,js}, dist/store.{d.ts,js}: refreshed by `npm run` to reflect i-08ovbvtb's `withEmbedSession` helper + palette fix (regen, not hand-edit) * dist/embedding/{provider,openai,local,factory,autofallback,index} .{d.ts,js}: newly committed (existed in src/ since 0463dd5 but weren't tracked in dist/) Verification: * Typecheck inside vendor/qmd (`npm run` with the build script) exits 0 — all pre-existing TS errors resolved * dist/cli/qmd.js: `red: useColor ? "\x1b[31m" : ""` present at line 99 * dist/store.js: `withEmbedSession` helper present at line 1026 Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 6309e407
  • 6ebfc54cc7 refactor(embedding): skip withLLMSessionForLlm when embedProvider is supplied (i-08ovbvtb) Finishes the unmet half of i-qkarfffa DoD #9 ("No node-llama-cpp warm-up needed when remote-only"). Stage-3 (i-qkarfffa) shipped the `embedProvider` option that short-circuits `session.embed()` calls, but the OUTER `withLLMSessionForLlm(llm, ...)` wrapper still constructed an LLMSession against `getLlm(store)` — accessing `embedModelName` and holding a session lease for the duration of the embedding loop, even when the local LLM was never going to be used. This commit introduces a `withEmbedSession(store, provider, body)` helper that branches: * provider supplied -> creates a lightweight AbortController-backed fake session; `getLlm(store)` is NEVER called, `withLLMSessionForLlm` is bypassed entirely. Fake session's LLM-only methods (embed, embedBatch, expandQuery, rerank) throw with a clear message — they MUST NOT be reached because `embedOne`/`embedMany` route through `provider.embed()` instead. * provider undefined -> calls `withLLMSessionForLlm(getLlm(store), ...)` with the same `LLMSessionOptions` as before (maxDuration: 30 min, name: 'generateEmbeddings'). Local-only path is byte-identical. `embedModelUri` resolution is also moved behind the same branch — when provider is set, the model id comes from `provider.getModelId()`; otherwise it falls back to `getLlm(store).embedModelName`. So remote-only deployments no longer construct a `LlamaCpp` instance just to read its `embedModelName`. Test changes: * Removed the inline `, 30000` timeout override on the "uses provider.embedBatch when supplied" test (added in 058ec1d as a workaround). The global `testTimeout` (30s in vitest.config.ts) still applies, but the per-test bump and its DoD-#9 follow-up note are no longer needed. * Added a new test: "provider mode does not access store.llm" — sets `store.llm` to a Proxy that throws on ANY property access, then runs `generateEmbeddings({embedProvider})`. 
Catches future regressions of accidentally re-introducing `getLlm(store)` / `embedModelName` reads in the provider path. Verification: * vendor/qmd integration suite: 9/9 PASS (test/embedding-store-integration.test.ts). * vendor/qmd full suite: 897/907 PASS — 10 pre-existing `test/sdk.test.ts > with LLM query expansion` timeouts on master (flaky LLM-bound tests, unaffected by this refactor). * Typecheck: only 1 pre-existing error in `src/cli/qmd.ts` (unrelated palette `red` field). Zero new errors in `src/store.ts`. Out of scope (explicit per issue): `chunkDocumentByTokens` still calls `getDefaultLlamaCpp().tokenize(...)` for token counting — that's a separate cold-cache load path that an llm-free remote-only embed flow would need to address. DoD #1 ("does NOT load node-llama-cpp") is therefore only fully met if/when chunkDocumentByTokens is also taught about provider-supplied tokenization. Tracked as a follow-up to this issue (out-of-scope here per "Files affected: src/store.ts ~30 LOC"). Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 6309e407
  • View comparison for these 2 commits »
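
The regression guard described in 6ebfc54cc7, a `store.llm` replaced by a Proxy that throws on any property access, can be illustrated with a small stand-alone sketch. The interfaces and the `resolveEmbedModelId` / `throwingProxy` helpers are hypothetical simplifications; only `getModelId()` and `embedModelName` are names taken from the commit message.

```typescript
// Simplified, hypothetical shapes; the real store and provider types are richer.
interface ProviderLike {
  getModelId(): string;
}
interface LlmLike {
  embedModelName: string;
}
interface StoreLike {
  llm: LlmLike;
}

// Provider supplied: the model id comes from the provider and store.llm is never read.
// Provider absent: fall back to the local LLM's embedModelName.
function resolveEmbedModelId(store: StoreLike, provider?: ProviderLike): string {
  return provider ? provider.getModelId() : store.llm.embedModelName;
}

// Test helper: an object that throws as soon as any property is read from it.
function throwingProxy<T extends object>(label: string): T {
  return new Proxy({} as T, {
    get(_target, prop) {
      throw new Error(`${label} accessed: ${String(prop)}`);
    },
  });
}
```

A store built as `{ llm: throwingProxy<LlmLike>("store.llm") }` then passes through the provider path untouched, while any accidental `embedModelName` read fails loudly instead of silently warming up the local model.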

3 weeks ago

suby pushed to oivo at suby/qmd

  • 4384081070 fix(embedding): don't pass llama-cpp session.signal to remote OpenAI provider When node-llama-cpp Vulkan build fails and falls back to CPU, the session signal can be in an aborted state. Passing this aborted signal to OpenAIEmbeddingsProvider.embed() caused it to return null immediately without making any HTTP request (short-circuit at line 446 openai.ts). Fix: only pass session.signal when provider.kind === 'local'. Remote providers have their own timeout mechanism (DEFAULT_TIMEOUT_MS=30000). Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com>
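
The fix in 4384081070 comes down to choosing the embed options by provider kind. A minimal sketch, assuming a hypothetical `embedOptionsFor` helper and a simplified provider shape (the real call site in `store.ts` is not shown in the commit message):

```typescript
// Hypothetical, simplified provider shape for illustration.
interface ProviderLike {
  kind: "local" | "openai";
}

// Only the local provider shares the llama-cpp session's lifetime, so only it
// receives the session signal. Remote providers rely on their own timeout and
// must not inherit a signal that may already be aborted.
function embedOptionsFor(
  provider: ProviderLike,
  sessionSignal: AbortSignal,
): { signal?: AbortSignal } {
  return provider.kind === "local" ? { signal: sessionSignal } : {};
}
```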

3 weeks ago

suby pushed to oivo at suby/qmd

  • 9d5ae7cd38 feat(embedding): AutoFallbackEmbeddingProvider + live perf benchmark (i-qkarfffa follow-up) Two opt-in extensions to Stage 3 (i-qkarfffa) — both originally listed as "What's NOT shipped here" but added while waiting on the auto-verifier. H — AutoFallbackEmbeddingProvider --------------------------------- Wraps a primary `OpenAIEmbeddingsProvider` + a `LocalLlamaCppProvider` fallback. End-to-end automation of acceptance criterion 4 ("Endpoint down → fallback local + WARN"). Behavior: * CircuitOpenError from primary → fallback served + 5-min cooldown * 3 consecutive non-circuit errors → also opens cooldown * During cooldown, primary is skipped entirely (no wasted HTTP) * After cooldown, primary retried opportunistically; success closes * WARN fired exactly once per transition (no log spam under outage) * healthcheck() reports primary; falls back if primary unhealthy * dispose() cascades to both Files: * src/embedding/autofallback.ts (NEW, 200 LOC) * test/embedding-autofallback.test.ts (NEW, 22 tests, all pass) * src/embedding/factory.ts (autoFallback opt-in, resolution: arg → env QMD_EMBED_AUTO_FALLBACK → config → false) * src/embedding/index.ts (re-exports) * src/cli/qmd.ts (--embed-auto-fallback flag + help text) G — Live perf + parity benchmark -------------------------------- Discovered qmd-embed-worker on models LXC (10.0.2.162:8082, RTX 4090, hypervisor `a`). Reachable from `code` directly. Healthcheck: GET /health → 200, model=embeddinggemma:300m, dim=768, gpu_lease_present=true, our_lease_gpus=[0] Perf (test/embedding-live-parity.bench.ts, 100 chunks via HTTP): 100/100 embedded in 1.02s = 97.8 chunks/s = 10.23ms/chunk Issue spec asked for 5-10x speedup vs the 1-2 min CPU baseline; live measurement shows ~60-120x. Acceptance criterion 2 verified live, not just architecturally. 
Parity: * dim=768 returned by HTTP matches local embeddinggemma-300M dim * Worker README guarantees identical GGUF file (the same one qmd uses locally) → per-vector cosine ≥0.999 by construction. * Live cosine vs local llama-cpp on `code` is blocked by Vulkan-build failure; left to follow-up benchmark on a machine with a working node-llama-cpp toolchain. Test totals ----------- Test Files 5 passed (5) Tests 109 passed (109) [101 unit + 8 store-integration] Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 5a95c44d
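
The cooldown rules listed for `AutoFallbackEmbeddingProvider` in 9d5ae7cd38 (a circuit-open error or 3 consecutive other errors opens a 5-minute cooldown, during which the primary is skipped) can be modelled as a small state holder. `FallbackGate` is a hypothetical name and the injected clock is an assumption made for testability; WARN logging and the actual provider calls are omitted.

```typescript
// Hypothetical bookkeeping for "should the primary provider be tried right now?"
class FallbackGate {
  private consecutiveErrors = 0;
  private cooldownUntil = 0;
  private readonly now: () => number;
  private readonly cooldownMs: number;
  private readonly errorThreshold: number;

  constructor(now: () => number = Date.now, cooldownMs = 5 * 60_000, errorThreshold = 3) {
    this.now = now;
    this.cooldownMs = cooldownMs;
    this.errorThreshold = errorThreshold;
  }

  // During cooldown the primary is skipped entirely (no wasted HTTP call).
  shouldTryPrimary(): boolean {
    return this.now() >= this.cooldownUntil;
  }

  // A success on the primary closes the cooldown and resets the error streak.
  recordSuccess(): void {
    this.consecutiveErrors = 0;
    this.cooldownUntil = 0;
  }

  // A circuit-open error opens the cooldown immediately; other errors open it
  // only once the streak reaches the threshold.
  recordFailure(circuitOpen: boolean): void {
    this.consecutiveErrors += 1;
    if (circuitOpen || this.consecutiveErrors >= this.errorThreshold) {
      this.cooldownUntil = this.now() + this.cooldownMs;
      this.consecutiveErrors = 0;
    }
  }
}
```

A wrapper would consult `shouldTryPrimary()` before each call, serve from the fallback when it returns `false`, and report the outcome of any primary attempt back through `recordSuccess()` or `recordFailure()`.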

3 weeks ago

suby pushed to oivo at suby/qmd

  • 058ec1d50a test(embedding): bump cold-cache timeout for provider.embedBatch test (i-qkarfffa) The "uses provider.embedBatch when supplied" integration test was timing out at vitest's default 5s when the LLM session warmup (`getLlm` → `withLLMSessionForLlm`) was a cold-cache llama-cpp init. Bumped to 30s to tolerate cold-cache machines. Note: DoD #9 ("No node-llama-cpp build needed when remote-only") is not fully met — the provider short-circuit at store.ts:1494 only avoids calling `session.embed()`, but the outer `withLLMSessionForLlm` wrapper still warms the LLM. Skipping the wrapper when `embedProvider` is set is a follow-up refactor. Test: 87 pass / 0 fail (was 86/1 with default 5s timeout). Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 2a9cf0b1

3 weeks ago

suby pushed to oivo at suby/qmd

  • 0463dd50dc feat(embedding): EmbeddingProvider abstraction + OpenAIEmbeddingsProvider (i-qkarfffa) Stage 3 of topic qmd-gpu-embeddings-via-ai-mm-mk: replace the hard-coded LlamaCpp embedding path with a provider abstraction supporting both the existing local node-llama-cpp path AND a new HTTP backend pointed at the ai.mm.mk gateway (Stage 2) → qmd-embed-worker on `models` (Stage 1). What ships ---------- - `src/embedding/provider.ts` — `EmbeddingProvider` interface (embed, embedBatch, getModelId, getDimensions, healthcheck, dispose) + `ModelMismatchError` + `assertModelCompatible(expected, got)` - `src/embedding/openai.ts` — OpenAIEmbeddingsProvider: • batch up to 64, retry 1s/4s/16s on 429/503, • circuit breaker (>50% fail / 60s window → open 5 min), • healthcheck on construction, • model-id reported as configured upstream-model (matches existing content_vectors.model in sqlite — index stays valid). - `src/embedding/local.ts` — LocalLlamaCppProvider adapter wrapping LlamaCpp - `src/embedding/factory.ts` — `createEmbeddingProvider({kind, ...})` with precedence: explicit kind > QMD_EMBED_PROVIDER > QMD_EMBED_ENDPOINT presence > config-file > local fallback - `src/embedding/index.ts` — re-exports for SDK consumers Wired into store ---------------- - `store.generateEmbeddings()` accepts an `embedProvider` option; when set, routes through the new abstraction and runs a model-id guard against `getDistinctEmbeddingModels()` (rejects mismatch unless `force=true`). - `src/index.ts` re-exports the provider abstraction for SDK use. CLI --- `qmd embed` gains 7 new flags (with help text): --provider {local,openai}, --embed-endpoint <url>, --embed-api-key <key>, --embed-model-id <id>, --embed-upstream-model <id>, --embed-batch-size <n>, --embed-timeout-ms <ms> Env vars: QMD_EMBED_PROVIDER, QMD_EMBED_ENDPOINT, QMD_EMBED_API_KEY. Backward compat --------------- Zero env-vars + no flags → identical LocalLlamaCppProvider as pre-patch. 
Existing 870k-vector index untouched (model-id parity verified Stage 1 cosine ≥0.999). Tests (87 PASS across 4 files) ------------------------------ test/embedding-provider.test.ts (interface + assertModelCompatible) test/embedding-factory.test.ts (precedence + env/config/explicit kinds) test/embedding-openai.test.ts (success, 429 retry, 503 fallback, batch chunking, timeout, circuit breaker, healthcheck, malformed) test/embedding-store-integration.test.ts (model-id guard, force bypass) Topic completion ---------------- Stage 1 (qmd-embed-worker, models:8082) → oivo@6a3cb19ae Stage 1.5 (GPU lease + Prometheus + alerts) → oivo@15bc71deb Stage 2 (ai.mm.mk /v1/embeddings, Bearer auth) → ai/srv/ai@9f40ea1 Stage 3 (qmd OpenAIEmbeddings provider) → THIS COMMIT Generated with [Claude Code](https://claude.ai/code) via [Oivo](https://oivo.com) Co-Authored-By: Claude <noreply@anthropic.com> Session-Id: 5a95c44d
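
The provider precedence stated in 0463dd50dc (explicit kind > QMD_EMBED_PROVIDER > QMD_EMBED_ENDPOINT presence > config-file > local fallback) reads naturally as a chain of early returns. The sketch below is an illustration only: `resolveProviderKind`, the `FactoryInput` shape, and the `config.provider` field are assumptions, since the commit message does not show the signature of `createEmbeddingProvider` or the config schema.

```typescript
type ProviderKind = "local" | "openai";

// Hypothetical input shape for illustration.
interface FactoryInput {
  kind?: ProviderKind;
  env?: Record<string, string | undefined>;
  config?: { provider?: ProviderKind };
}

function resolveProviderKind({ kind, env = {}, config }: FactoryInput): ProviderKind {
  // 1. An explicit argument (for example from --provider) always wins.
  if (kind) return kind;
  // 2. Then the QMD_EMBED_PROVIDER environment variable, if it names a known kind.
  const fromEnv = env.QMD_EMBED_PROVIDER;
  if (fromEnv === "local" || fromEnv === "openai") return fromEnv;
  // 3. The mere presence of an endpoint implies the HTTP provider.
  if (env.QMD_EMBED_ENDPOINT) return "openai";
  // 4. Then the config file.
  if (config?.provider) return config.provider;
  // 5. Zero env vars and no flags: the pre-patch local path.
  return "local";
}
```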

3 weeks ago

suby pushed to oivo at suby/qmd

  • 56a77d5769 fix(query): allow hyphenated words in vec/hyde queries (i-fbalbv5l) `validateSemanticQuery` was rejecting natural-English phrases with compound modifiers (auto-archived, pre-commit, multi-session, state-of-the-art, etc.) by misreading every intra-word `-` as the FTS5 negation operator. Only whitespace-preceded or SOS `-` can be negation; hyphens inside a word have no negation semantics. Fix: tighten the detector from `/-\w/` / `/-"/` to `/(?:^|\s)-\w/` / `/(?:^|\s)-"/`. Lex query negation (handled by `buildFTS5Query`) is untouched — its own intra-word hyphen disambiguation has been correct since upstream v2.1.0. Tests: - 5 new positives (auto-archived, pre-commit, multi-session, cross-machine, long-running, well-known, out-of-scope, state-of-the-art, leading-hyphenated) → accepted. - Existing `performance -sports` / `-"exact phrase"` negation rejections still fire (regression guard). - Added mid-query quoted negation test `foo -"phrase"`. - All 67 tests in test/structured-search.test.ts pass. Also bundles dist/ rebuild that had drifted from src/ on this branch after i-bud0h8vu (Phase 2) and i-76v1j1ld (Phase 3) landed their source changes without regenerating dist/. `npm run build` regenerates dist/{store,ast,collections,cli/qmd}.{js,d.ts}. Version bumped 2.1.0 → 2.1.1-oivo.0 so consumers can pin.
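
The tightened detector from 56a77d5769 is easy to check in isolation. The two regular expressions below are the ones quoted in the commit message; the `hasNegation` wrapper is a hypothetical name for this sketch and is not claimed to be how `validateSemanticQuery` is structured internally.

```typescript
// A `-` counts as negation only at the start of the string or after whitespace.
const NEGATED_WORD = /(?:^|\s)-\w/;
const NEGATED_PHRASE = /(?:^|\s)-"/;

// Hypothetical helper: true when the query contains a negation operator.
function hasNegation(query: string): boolean {
  return NEGATED_WORD.test(query) || NEGATED_PHRASE.test(query);
}

// Intra-word hyphens are ordinary text and pass through.
const accepted = ["state-of-the-art retrieval", "pre-commit hook", "auto-archived sessions"];
// Whitespace-preceded or leading hyphens are still flagged.
const flagged = ["performance -sports", '-"exact phrase"', 'foo -"phrase"'];

for (const q of accepted) console.log(q, "->", hasNegation(q));
for (const q of flagged) console.log(q, "->", hasNegation(q));
```

Neither pattern uses the `g` flag, so `test()` keeps no state between calls and the helper can be reused freely.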

1 month ago

suby pushed oivo to suby/qmd

  • c464952b1d feat(qmd): Phase 2 - function-level chunk strategy (i-bud0h8vu)

    Ships the "function" ChunkStrategy: one chunk per AST function/class/method range instead of char-window chunks. Opt-in per collection via YAML `chunkStrategy: function`. Default unchanged for existing YAML.

    Changes:
    * src/store.ts
      - Extend ChunkStrategy = "auto" | "regex" | "function".
      - chunkDocumentAsync branches on "function" to chunkByFunctionRanges, falling back to "auto" behavior when zero ranges detected.
      - Add chunkByFunctionRanges helper: one chunk per range, inter-range gaps char-chunked, oversized ranges re-split via the shared algo.
      - PendingEmbeddingDoc gains "collection" field.
      - getPendingEmbeddingDocs SELECTs MIN(d.collection) per hash.
      - generateEmbeddings resolves per-collection chunkStrategy from YAML via listCollections() Map lookup. Precedence: collection override > global option > function default "regex".
    * src/collections.ts
      - Collection interface gains optional chunkStrategy field. Omitted on save when unset (no behavior change for existing YAML).
    * src/cli/qmd.ts
      - parseChunkStrategy accepts "function" in addition to auto/regex.

    Tests (+21 net):
    * test/ast.test.ts: getASTFunctionRanges unit tests for TS/Python (class/func/decorated), error handling.
    * test/ast-chunking.test.ts: chunkDocumentAsync chunkStrategy="function" integration: per-unit chunks, no cross-unit leak, pos reflects absolute offset, markdown/bare-stmt fallback to auto.
    * test/collections-config.test.ts: chunkStrategy YAML round-trip.

    Note: getASTFunctionRanges + FUNCTION_CAPTURE_NAMES + FunctionRange shipped earlier in commit 89267c1 (sibling Phase 3 co-edit absorbed Phase 2 ast.ts additions under the i-76v1j1ld Session-Id; documented in issue i-bud0h8vu comment).

    Verification:
    CI=1 npx vitest run test/ast.test.ts test/ast-chunking.test.ts test/collections-config.test.ts -> 64/64 pass
    npx tsc -p tsconfig.build.json --noEmit -> 0 errors
    workspace_typecheck({component: "cli"}) -> 0 errors

    Session-Id: 71c39606
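
A minimal sketch of the range-based chunking this commit describes (one chunk per range, gaps char-chunked, oversized ranges re-split) might look as follows. It is not the code in src/store.ts: the `FunctionRange` and `Chunk` shapes, the `charChunk` stand-in for the shared splitter, and the choice to drop whitespace-only gaps are all assumptions, and the zero-range fallback here is plain char-chunking rather than the real "auto" behavior.

```typescript
// Hypothetical shapes; the real FunctionRange and chunk types may differ.
interface FunctionRange { start: number; end: number } // [start, end) offsets, sorted, non-overlapping
interface Chunk { text: string; pos: number }          // pos = absolute offset in the document

// Stand-in for the shared splitter: fixed-size windows with no overlap.
function charChunk(text: string, base: number, maxChars: number): Chunk[] {
  const out: Chunk[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    out.push({ text: text.slice(i, i + maxChars), pos: base + i });
  }
  return out;
}

function chunkByRanges(doc: string, ranges: FunctionRange[], maxChars: number): Chunk[] {
  // Fallback when no ranges are detected (simplified to char-chunking here).
  if (ranges.length === 0) return charChunk(doc, 0, maxChars);

  const chunks: Chunk[] = [];
  let cursor = 0;
  for (const r of ranges) {
    // Text between the previous range and this one; whitespace-only gaps are
    // dropped, which is an assumption of this sketch.
    const gap = doc.slice(cursor, r.start);
    if (gap.trim().length > 0) chunks.push(...charChunk(gap, cursor, maxChars));
    // One chunk per range; charChunk only splits when the range exceeds maxChars.
    chunks.push(...charChunk(doc.slice(r.start, r.end), r.start, maxChars));
    cursor = r.end;
  }
  const tail = doc.slice(cursor);
  if (tail.trim().length > 0) chunks.push(...charChunk(tail, cursor, maxChars));
  return chunks;
}

// Usage: two functions preceded by an import line.
const doc = "import x\n\nfunction a() {}\n\nfunction b() {}\n";
const a = doc.indexOf("function a");
const b = doc.indexOf("function b");
const fnLen = "function a() {}".length;
console.log(chunkByRanges(doc, [{ start: a, end: a + fnLen }, { start: b, end: b + fnLen }], 100));
```

Each chunk carries `pos` as an absolute document offset, matching the "pos reflects absolute offset" test noted in the commit.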

1 month ago

suby pushed oivo to suby/qmd

  • 89267c17d4 feat(ast): Phase 3 - tree-sitter grammars for Java + Kotlin (i-76v1j1ld)

    Extends qmd's AST-aware chunking to JVM-family code: .java, .kt, .kts files now produce function/class/method/import break points instead of falling back to regex-only chunking.

    Changes:
    * package.json: add two grammar deps to optionalDependencies:
        tree-sitter-java@0.23.5 (ships prebuilt tree-sitter-java.wasm)
        @tree-sitter-grammars/tree-sitter-kotlin@1.1.0 (ships prebuilt tree-sitter-kotlin.wasm)
      Also added to pnpm.onlyBuiltDependencies for parity with the existing go/rust/typescript grammar entries.
    * src/ast.ts: extend SupportedLanguage, EXTENSION_MAP (.java/.kt/.kts), GRAMMAR_MAP (java + kotlin package paths), and LANGUAGE_QUERIES (class/iface/enum/record/method/import for Java; class/object/function/type_alias/import for Kotlin).
    * bun.lock: regenerated to include the two new packages + their transitive deps (node-addon-api, node-gyp-build).
    * dist/ast.js, dist/ast.d.ts: rebuilt from source for consumers that import the compiled output (per oivo branch policy of shipping prebuilt dist/).

    Kotlin grammar naming note: the upstream `tree-sitter-kotlin` package (fwcd, v0.3.8) does NOT ship a prebuilt .wasm, only src/parser.c. Switched to `@tree-sitter-grammars/tree-sitter-kotlin@1.1.0`, which DOES ship the wasm. Node names differ between the two: v1.1.0 uses `import` (not `import_header`) and lacks `property_declaration` at the query level; queries updated accordingly and confirmed working.

    Swift deferred to follow-up i-f0dd5nge: no version of tree-sitter-swift (0.1.4 → 0.7.1) ships a prebuilt .wasm; every version in that range needs docker / emscripten to run `tree-sitter build --wasm`, and the code machine has neither. Follow-up tracks vendoring a built wasm into assets/grammars/ and extending resolveGrammarPath with a local-first fallback.

    Verified:
    * `getASTStatus()` reports `java: available=true` and `kotlin: available=true` (both wasm load + query compile).
    * Java fixture (class + interface + enum + 3 methods + 2 imports) produces 8 break points with correct scores and positions.
    * Kotlin fixture (class + object + typealias + 3 funs + 2 imports) produces 8 break points.
    * `.kts` routes to kotlin grammar and produces 8 break points.
    * `workspace_typecheck({ component: "cli", onlyModified: true })` → 0 errors (consumer at cli/src/daemon/run.ts resolves `@oivo/qmd/bin/qmd` via createRequire; no public-API change).

    Companion change on `code` machine (NOT in this commit; lives in consumer config): ~/.config/qmd/index.yml adds oivo-research-jvm collection at /srv/research, pattern **/*.{java,kt,kts} (204 files).

    Rollback: `git revert` removes the two deps + ast.ts additions + regenerated dist. No new runtime requirements introduced (existing optionalDependencies pattern).

    Unblocks: polyglot AST chunking for /srv/research JVM content.

    Parent: i-76v1j1ld (Phase 3 JVM grammars).
    Sibling: i-bud0h8vu (Phase 2, function-level chunking, independent).
    Follow-up: i-f0dd5nge (Phase 3b, Swift wasm build/vendor).
    Session-Id: d0f56a95
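
The extension routing mentioned above (`.java`, `.kt`, and `.kts` mapping to the Java and Kotlin grammars) could be sketched like this. The names `SupportedLanguage` and `EXTENSION_MAP` appear in the commit message, but their shapes here are assumptions, the union is reduced to the two new languages, and `detectLanguage` is a hypothetical helper.

```typescript
// Assumed shape; the real union in src/ast.ts covers more languages.
type SupportedLanguage = "java" | "kotlin";

// Assumed shape of the extension table, limited to the entries this commit adds.
const EXTENSION_MAP: Record<string, SupportedLanguage> = {
  ".java": "java",
  ".kt": "kotlin",
  ".kts": "kotlin", // Kotlin script files share the Kotlin grammar
};

// Hypothetical helper: pick a grammar from the file's last extension, or null
// when the file should fall back to regex-only chunking.
function detectLanguage(filePath: string): SupportedLanguage | null {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a dotfile such as ".gitignore"
  return EXTENSION_MAP[base.slice(dot).toLowerCase()] ?? null;
}

console.log(detectLanguage("src/Main.java"));        // java
console.log(detectLanguage("build.gradle.kts"));     // kotlin
console.log(detectLanguage("docs/README.md"));       // null
```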

1 month ago

suby pushed oivo to suby/qmd

  • 3dbc43a9d0 Export ./bin/qmd + ./package.json for require.resolve consumers

    Allows the Oivo CLI daemon to resolve the vendored qmd binary via `createRequire(import.meta.url).resolve('@oivo/qmd/bin/qmd')` without running into Node's ERR_PACKAGE_PATH_NOT_EXPORTED enforcement for packages that declare an "exports" map.

    ./package.json is also exported so consumers can locate the package root even through exports gating (standard fallback for tools that need pkg.version, pkg.main, etc.).
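
On the consumer side, the resolution this commit enables can be sketched roughly as below. `tryResolve` is a hypothetical wrapper, not the daemon's actual code; it anchors `createRequire` at a directory instead of `import.meta.url` so the snippet runs under either module system. A subpath that is missing from a package's "exports" map fails with `ERR_PACKAGE_PATH_NOT_EXPORTED`, and a package that is not installed fails with `MODULE_NOT_FOUND`.

```typescript
import { createRequire } from "node:module";
import { join } from "node:path";

type ResolveResult = { path: string } | { code: string };

// Hypothetical wrapper: resolve `specifier` as if required from a file inside
// `fromDir`, returning the Node error code instead of throwing.
function tryResolve(specifier: string, fromDir: string): ResolveResult {
  // createRequire needs an absolute file path or file URL as its anchor;
  // the file itself does not have to exist.
  const req = createRequire(join(fromDir, "noop.js"));
  try {
    return { path: req.resolve(specifier) };
  } catch (err) {
    return { code: (err as NodeJS.ErrnoException).code ?? "UNKNOWN" };
  }
}

// Resolves to the vendored binary when @oivo/qmd is installed and exports
// ./bin/qmd; otherwise reports why resolution failed.
console.log(tryResolve("@oivo/qmd/bin/qmd", process.cwd()));
```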

1 month ago

suby pushed oivo to suby/qmd

  • 3f9ca814e1 Ship pre-built dist/ in oivo branch

    Upstream's git repo excludes dist/ (built from source at publish time). For the Oivo file: dep consumption pattern we need dist/ present in the checked-out tree so npm install in cli/ can pack and hoist it without pulling in devDeps (tsc, vitest, etc.).

    dist/ copied verbatim from the v2.1.0 npm tarball (/usr/lib/node_modules/@tobilu/qmd/dist/), byte-identical to the published artifact. See v2.1.0-upstream tag for source provenance.

  • dadc7eaeca Rename package to @oivo/qmd

    Internal Oivo fork of @tobilu/qmd v2.1.0 (upstream github.com/tobi/qmd). Renamed to @oivo/qmd to avoid npm registry collision and to signal this is the vendored variant consumed via file: dep from /srv/cli. Upstream snapshot preserved at tag v2.1.0-upstream.

  • 65cd1b3fd0 fix(nix): update aarch64-darwin node_modules hash

  • a02b9fe016 fix: update nix flake hash and stabilize bun test ordering

    Update x86_64-linux node_modules hash after dependency pinning. Add _resetProductionModeForTesting to fix getDefaultDbPath test that fails when bun runs all test files in a single process. Remove duplicate path/handelize tests from store.test.ts.

  • 66e70c028e fix(test): reset _productionMode in getDefaultDbPath test

    Bun runs all test files in a single process, so module-level state leaks between files. The getDefaultDbPath test now resets the _productionMode flag before asserting it throws, fixing the flaky failure on Bun (ubuntu-latest) in CI.
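
The leak described in the last two commits comes from a module-level flag outliving the test file that set it. A rough sketch of the reset-hook pattern follows. Only the names `_productionMode`, `_resetProductionModeForTesting`, and `getDefaultDbPath` come from the commit messages; `enableProductionMode`, the returned path, the error text, and the rule that the function throws outside production mode are assumptions made for illustration.

```typescript
// Module-level state: lives as long as the process, so it is shared by every
// test file when the runner uses a single process.
let _productionMode = false;

// Hypothetical setter, standing in for whatever flips the flag in real code.
function enableProductionMode(): void {
  _productionMode = true;
}

// Test-only hook that restores the initial state.
function _resetProductionModeForTesting(): void {
  _productionMode = false;
}

// Assumed behavior: refuse to hand out the default path outside production mode.
function getDefaultDbPath(): string {
  if (!_productionMode) {
    throw new Error("default DB path is only available in production mode");
  }
  return "/tmp/qmd-example/index.sqlite"; // placeholder path for this sketch
}

// An earlier test file leaves the flag set...
enableProductionMode();

// ...so a later test resets it before asserting that the call throws.
_resetProductionModeForTesting();
let threw = false;
try {
  getDefaultDbPath();
} catch {
  threw = true;
}
console.log("throws after reset:", threw);
```

Without the reset call, the assertion would depend on which test file happened to run first.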

1 month ago

suby created new branch oivo in suby/qmd

1 month ago

suby pushed tag v2.1.0-upstream to suby/qmd

1 month ago

suby pushed tag v2.1.0 to suby/qmd

1 month ago