Node.js guide

Offline-First Search in Node.js with altor-vec

Use altor-vec to add offline-first search to a Node.js project: a Node.js build script generates the vector index, and the browser runs every query locally, with no search server, no API keys, and zero per-query cost. Cache the index in IndexedDB and serve search from browser storage so that PWAs and other offline-first apps keep full search capability without a network connection.

Install: npm install altor-vec @xenova/transformers

Implementation

Server-side indexing script (Node 18+, ESM). The engine is held in a module-level variable.

// build-offline-index.mjs — Node.js build script for offline-first PWA
// Generates a search index that the browser can cache and use offline
import { pipeline } from '@xenova/transformers';
import init, { WasmSearchEngine } from 'altor-vec';
import { readFileSync, writeFileSync, statSync } from 'fs';
import { gzipSync } from 'zlib';

const docs = JSON.parse(readFileSync('data/docs.json', 'utf8'));
console.log(`Building offline search index for ${docs.length} documents...`);

await init();
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const DIM = 384;
const vectors = new Float32Array(docs.length * DIM);

for (const [i, doc] of docs.entries()) {
  const out = await embedder(`${doc.title} ${doc.content}`,
    { pooling: 'mean', normalize: true });
  vectors.set(out.data, i * DIM);
}

const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);
const indexJson = engine.to_json();

// Write plain JSON (for browsers that load it via fetch)
writeFileSync('public/offline-search-index.json', indexJson);

// Write gzipped version (serve with Content-Encoding: gzip to reduce download size)
writeFileSync('public/offline-search-index.json.gz', gzipSync(indexJson));

const size = statSync('public/offline-search-index.json').size;
const gzSize = statSync('public/offline-search-index.json.gz').size;
console.log(`Index: ${(size/1024).toFixed(0)}KB plain, ${(gzSize/1024).toFixed(0)}KB gzipped`);
console.log('Add to service worker cache: /offline-search-index.json');

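// service-worker.js (add to your SW):
// Note: cache.addAll() rejects if any URL fails to load, and this script does not
// write /search-metadata.json. Generate that file separately or drop it from PRECACHE.
// const CACHE_NAME = 'search-v1';
// const PRECACHE = ['/offline-search-index.json', '/search-metadata.json'];
// self.addEventListener('install', e => e.waitUntil(
//   caches.open(CACHE_NAME).then(c => c.addAll(PRECACHE))
// ));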
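
The build script stops at writing the file; the browser still has to keep a copy in IndexedDB. One way to do that is sketched below, assuming the index travels as the JSON string returned by engine.to_json(). The database, store, and key names are made up for this example, and the optional factory argument exists only so the helpers can be exercised outside a browser.

```javascript
// Browser-side sketch: keep the index JSON in IndexedDB between visits.
// DB_NAME, STORE, and KEY are hypothetical names chosen for this example.
const DB_NAME = 'offline-search';
const STORE = 'indexes';
const KEY = 'main';

// Open (and on first use create) the database. `factory` defaults to the
// browser's indexedDB; passing a stand-in is only useful for testing.
function openDb(factory = globalThis.indexedDB) {
  return new Promise((resolve, reject) => {
    if (!factory) {
      reject(new Error('IndexedDB is not available in this environment'));
      return;
    }
    const req = factory.open(DB_NAME, 1);
    req.onupgradeneeded = () => req.result.createObjectStore(STORE);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Store the JSON string under a fixed key, overwriting any older copy.
async function saveIndexJson(indexJson, factory) {
  const db = await openDb(factory);
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE, 'readwrite');
    tx.objectStore(STORE).put(indexJson, KEY);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

// Resolve with the cached JSON string, or null if nothing is stored yet.
async function loadIndexJson(factory) {
  const db = await openDb(factory);
  return new Promise((resolve, reject) => {
    const req = db.transaction(STORE, 'readonly').objectStore(STORE).get(KEY);
    req.onsuccess = () => resolve(req.result ?? null);
    req.onerror = () => reject(req.error);
  });
}
```

On a first visit, a page could fetch /offline-search-index.json, pass the response text to saveIndexJson(), and on later visits call loadIndexJson() first, going to the network only when it resolves to null.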

Performance

Load from IndexedDB: ~50–200ms. Search: <1ms. Zero network dependency after first load. Measured on M2 MacBook Pro, Chrome 124. Mobile is typically 2–4× slower — test on target devices before deploying.

Index size       Dimensions   Query p50   Memory
1,000 vectors    384          ~0.1ms      ~2MB
10,000 vectors   384          ~0.4ms      ~17MB
50,000 vectors   384          ~0.9ms      ~85MB

When this approach works best

Limitations

Frequently asked questions

How do I detect when the user is offline and fall back to cached search?

Use navigator.onLine and the 'online'/'offline' window events. When offline, load the index from IndexedDB. When online, optionally fetch a fresher index. In a service worker, cache the index file with a cache-first strategy.
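
As a rough sketch of that decision, the snippet below separates the choice of index source from the browser event wiring. The function names and the shape of the returned object are assumptions made for illustration, and the policy shown (prefer the cached copy, refresh only when online) is one reasonable reading of the answer above rather than the only one.

```javascript
// Decide where the index should come from, given connectivity and cache state.
// Prefers the cached copy so search is available immediately, and refreshes
// in the background only when a connection is reported.
function planIndexLoad(isOnline, hasCachedIndex) {
  if (hasCachedIndex) {
    return { source: 'cache', refreshInBackground: isOnline };
  }
  return { source: isOnline ? 'network' : 'none', refreshInBackground: false };
}

// Browser-only wiring: report the current state and every later change.
// `onChange` is a hypothetical callback supplied by the app.
// Returns a function that removes the listeners again.
function watchConnectivity(onChange) {
  if (typeof window === 'undefined') return () => {};
  const handler = () => onChange(navigator.onLine);
  window.addEventListener('online', handler);
  window.addEventListener('offline', handler);
  handler(); // report the initial state
  return () => {
    window.removeEventListener('online', handler);
    window.removeEventListener('offline', handler);
  };
}
```

Keep in mind that navigator.onLine only reports whether the device has a network connection, not whether your server is reachable, so a failed refresh while "online" should still fall back to the cached index.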

What is the maximum index size I can store in IndexedDB?

IndexedDB storage limits depend on the browser and device: Chrome allows up to ~60% of available disk space, Firefox up to 50%. For reliable cross-device support, treat a 50K-document index at 384 dimensions (~85MB JSON) as the practical upper limit.
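
Because the real limit varies per device, it can help to ask the browser before writing a large index. The sketch below uses the standard navigator.storage.estimate() call; the helper names and the 20% headroom default are assumptions for this example, not recommendations from altor-vec.

```javascript
// Pure check: is there room for `bytes` more data given a storage estimate?
// `headroom` is a hypothetical safety margin (keep 20% of the quota free).
function hasRoomFor(bytes, { usage = 0, quota = 0 }, headroom = 0.2) {
  return usage + bytes <= quota * (1 - headroom);
}

// Ask the browser for its current estimate. Resolves to null where the
// Storage API is unavailable, so callers can decide how to proceed.
async function canStoreIndex(bytes) {
  if (typeof navigator === 'undefined' || !navigator.storage?.estimate) return null;
  return hasRoomFor(bytes, await navigator.storage.estimate());
}

// Size of the raw float32 vectors alone (4 bytes per value), before any
// graph structure or JSON encoding overhead is added.
function rawVectorBytes(count, dim) {
  return count * dim * 4;
}
```

For 50,000 vectors at 384 dimensions, rawVectorBytes gives 76,800,000 bytes, which is in line with the ~85MB figure once index overhead is included.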

How do I use altor-vec in a service worker for offline search?

altor-vec WASM runs in service workers. Import altor-vec in your service worker, cache the index JSON with a cache-first strategy, and respond to search fetch events by loading the cached index and running engine.search() in the service worker context.
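
The cache-first part of that answer can be sketched as follows. The lookup logic takes its cache and fetch function as arguments so it can be tried outside a service worker; CACHE_NAME mirrors the commented snippet in the build script, and INDEX_URL is the file that script writes. Loading the engine and calling engine.search() is left out here, since this guide does not show that part of the altor-vec API.

```javascript
// Cache-first lookup: return the cached response if there is one, otherwise
// fetch it, store a copy for next time, and return the fresh response.
// `cache` needs match() and put(); `fetchFn` is normally the global fetch.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

const CACHE_NAME = 'search-v1';
const INDEX_URL = '/offline-search-index.json';

// Service worker wiring; skipped entirely in any other environment.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    // Only intercept requests for the index file; let everything else pass.
    if (new URL(event.request.url).pathname !== INDEX_URL) return;
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) => cacheFirst(event.request, cache, fetch))
    );
  });
}
```

With this in place the index is downloaded at most once per cache version; bumping CACHE_NAME is a simple way to force clients onto a rebuilt index.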
