Node.js guide

Product Search in Node.js with altor-vec

Use altor-vec to add product search to your Node.js app. A Node.js build script creates the index, and queries then run entirely in the browser, with no search server, no API keys, and zero per-query cost. Search a product catalog by semantic meaning: find products by concept, synonym, or intent rather than requiring exact keyword matches.

Install: npm install altor-vec @xenova/transformers

Implementation

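Build-time indexing script (Node 18+, ESM). It embeds each product, builds the index, and writes the index and a trimmed product metadata file to public/. The engine is held in a module-level variable.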

// build-product-index.mjs — Node.js build script for product catalog
// Run: node build-product-index.mjs
import { pipeline } from '@xenova/transformers';
import init, { WasmSearchEngine } from 'altor-vec';
import { readFileSync, writeFileSync } from 'fs';

// Load products from JSON (Shopify export, Stripe products, etc.)
const products = JSON.parse(readFileSync('data/products.json', 'utf8'));

await init();
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const DIM = 384;
const vectors = new Float32Array(products.length * DIM);

console.log(`Building product search index for ${products.length} products...`);
for (const [i, p] of products.entries()) {
  // Combine name + description + category for richer embeddings
  const text = `${p.name}. ${p.description}. Category: ${p.category}.`;
  const out = await embedder(text, { pooling: 'mean', normalize: true });
  vectors.set(out.data, i * DIM);
}

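// The three arguments after DIM are index tuning parameters
// (presumably HNSW M, ef_construction, and ef_search; check the altor-vec docs).
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);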
writeFileSync('public/product-search-index.json', engine.to_json());
writeFileSync('public/products-metadata.json', JSON.stringify(
  products.map(p => ({ id: p.id, name: p.name, price: p.price, category: p.category }))
));
console.log('Product search index ready.');

// package.json: add "build:search": "node build-product-index.mjs" to scripts
// Run before build: npm run build:search && npm run build
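
The build script stores every embedding in one flat Float32Array, with product i occupying positions i * DIM through (i + 1) * DIM - 1. The sketch below illustrates that layout with tiny hand-written vectors. packVectors and vectorAt are hypothetical helpers for illustration, not part of altor-vec.

```javascript
// Hypothetical helpers illustrating the flat, row-major layout used by the build script.
// Not part of altor-vec; the 3-dimensional vectors are toy values.

// Pack an array of equal-length vectors into one Float32Array.
function packVectors(rows, dim) {
  const flat = new Float32Array(rows.length * dim);
  rows.forEach((row, i) => {
    if (row.length !== dim) {
      throw new Error(`Row ${i} has ${row.length} values, expected ${dim}`);
    }
    flat.set(row, i * dim); // row i starts at offset i * dim
  });
  return flat;
}

// Read row i back as a view onto the same memory (no copy).
function vectorAt(flat, dim, i) {
  return flat.subarray(i * dim, (i + 1) * dim);
}

const flat = packVectors([[1, 0, 0], [0, 1, 0]], 3);
console.log(flat.length);                      // 6
console.log(Array.from(vectorAt(flat, 3, 1))); // [ 0, 1, 0 ]
```

The explicit length check is a design choice: if the embedding model ever returned a dimension other than DIM, the rows in the flat buffer would be corrupted without an obvious error until much later.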

Performance

50K products at 384 dimensions: ~85MB memory, ~1ms per query. Measured on M2 MacBook Pro, Chrome 124. Mobile is typically 2–4× slower — test on target devices before deploying.

| Index size | Dimensions | Query p50 | Memory |
|---|---|---|---|
| 1,000 vectors | 384 | ~0.1ms | ~2MB |
| 10,000 vectors | 384 | ~0.4ms | ~17MB |
| 50,000 vectors | 384 | ~0.9ms | ~85MB |
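
As a rough cross-check of the memory figures, the raw float32 vectors alone take count × dimensions × 4 bytes. The sketch below computes that lower bound. rawVectorMegabytes is a hypothetical helper; the gap between its output and the reported figures is presumably index overhead, which this estimate does not model.

```javascript
// Raw storage for `count` float32 vectors of `dim` dimensions: count * dim * 4 bytes.
function rawVectorMegabytes(count, dim) {
  const BYTES_PER_FLOAT32 = 4;
  return (count * dim * BYTES_PER_FLOAT32) / 1e6; // decimal megabytes
}

for (const count of [1_000, 10_000, 50_000]) {
  console.log(`${count} vectors: ${rawVectorMegabytes(count, 384).toFixed(1)} MB raw`);
}
// 1000 vectors: 1.5 MB raw
// 10000 vectors: 15.4 MB raw
// 50000 vectors: 76.8 MB raw
```

Use the same arithmetic to estimate a lower bound for your own catalog size before measuring on a real device.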

When this approach works best

Limitations

Frequently asked questions

How do I filter by category or price after a semantic search?

Run engine.search(queryEmbedding, 50) to get 50 candidates, then filter the results array by category, price range, or in-stock status in JavaScript before showing the top N to the user. This is called post-retrieval filtering.
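
A minimal sketch of post-retrieval filtering follows. It assumes the candidates arrive as [index, distance] pairs, best match first, where index is the product's position in products-metadata.json. That shape is an assumption, so adapt the unpacking to whatever engine.search actually returns. filterCandidates is a hypothetical helper, and the products are toy data.

```javascript
// Post-retrieval filtering sketch.
// Assumption: `candidates` is an array of [index, distance] pairs, best match first,
// where index is the product's position in the metadata array.
function filterCandidates(candidates, metadata, { category, maxPrice, limit = 10 } = {}) {
  const results = [];
  for (const [index, distance] of candidates) {
    const product = metadata[index];
    if (!product) continue;                                        // unknown index
    if (category && product.category !== category) continue;       // wrong category
    if (maxPrice !== undefined && product.price > maxPrice) continue; // too expensive
    results.push({ ...product, distance });
    if (results.length === limit) break;                           // keep only the top N
  }
  return results;
}

// Toy data standing in for products-metadata.json and a search result.
const metadata = [
  { id: 'a', name: 'Trail Runner', price: 120, category: 'shoes' },
  { id: 'b', name: 'Wool Socks', price: 15, category: 'accessories' },
  { id: 'c', name: 'Court Sneaker', price: 80, category: 'shoes' },
];
const candidates = [[1, 0.12], [0, 0.2], [2, 0.31]];

const shown = filterCandidates(candidates, metadata, { category: 'shoes', maxPrice: 100 });
console.log(shown.map((p) => p.name)); // only 'Court Sneaker' passes both filters
```

If strict filters leave fewer than N results, request more candidates (for example 200 instead of 50) and filter again.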

Will semantic search understand synonyms like 'sneakers' vs 'trainers'?

Yes. Embedding models encode semantic meaning, so 'sneakers', 'trainers', 'running shoes', and 'athletic footwear' will all map to nearby vector positions and return similar results.
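
"Nearby" is usually measured with cosine similarity between embedding vectors. The sketch below shows the calculation on invented 3-dimensional vectors. They are toy values chosen for illustration, not real model output, and real all-MiniLM-L6-v2 embeddings have 384 dimensions.

```javascript
// Cosine similarity between two vectors.
// For unit-length (normalized) vectors this equals the plain dot product.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors, invented for illustration (not real embeddings).
const sneakers = [0.9, 0.1, 0.0];
const trainers = [0.8, 0.2, 0.1];
const blender  = [0.0, 0.1, 0.9];

// Related terms point in a similar direction; unrelated ones do not.
console.log(cosineSimilarity(sneakers, trainers) > cosineSimilarity(sneakers, blender)); // true
```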

How do I generate embeddings for product titles and descriptions?

Concatenate the product name and description: `${product.name}. ${product.description}`. Embed this combined string with all-MiniLM-L6-v2 via Transformers.js. This gives better results than embedding the title alone.
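
Real catalogs often have products with a missing or empty description. The sketch below is one way to build the text to embed while skipping such fields. buildEmbeddingText is a hypothetical helper, and it is a slight variation on the template above: it trims whitespace and normalizes trailing periods.

```javascript
// Hypothetical helper: build the string to embed, skipping missing or empty fields.
function buildEmbeddingText(product) {
  const parts = [product.name, product.description]
    .filter((part) => typeof part === 'string' && part.trim() !== '')
    .map((part) => part.trim().replace(/\.+$/, '')); // drop trailing periods to avoid ".."
  return parts.length > 0 ? parts.join('. ') + '.' : '';
}

console.log(buildEmbeddingText({
  name: 'Trail Runner',
  description: 'Lightweight shoe for rocky terrain.',
}));
// Trail Runner. Lightweight shoe for rocky terrain.

console.log(buildEmbeddingText({ name: 'Wool Socks' }));
// Wool Socks.
```

The returned string can be passed to the embedder in place of the inline template literal.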
