Node.js guide

Product Search in Node.js with altor-vec

Use altor-vec to add product search to your Node.js app: a Node.js script builds the index at build time, and queries then run entirely in the browser, with no search server, no API keys, and zero per-query cost. Search a product catalog by semantic meaning, so shoppers find products by concept, synonym, or intent rather than by exact keyword matches.

Install: npm install altor-vec @xenova/transformers

Implementation

Server-side indexing script (Node 18+, ESM). The engine is held in a module-level variable.

// build-product-index.mjs — Node.js build script for product catalog
// Run: node build-product-index.mjs
import { pipeline } from '@xenova/transformers';
import init, { WasmSearchEngine } from 'altor-vec';
import { readFileSync, writeFileSync } from 'fs';

// Load products from JSON (Shopify export, Stripe products, etc.)
const products = JSON.parse(readFileSync('data/products.json', 'utf8'));

await init();
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const DIM = 384;
const vectors = new Float32Array(products.length * DIM);

console.log(`Building product search index for ${products.length} products...`);
for (const [i, p] of products.entries()) {
  // Combine name + description + category for richer embeddings
  const text = `${p.name}. ${p.description}. Category: ${p.category}.`;
  const out = await embedder(text, { pooling: 'mean', normalize: true });
  vectors.set(out.data, i * DIM);
}

const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);
writeFileSync('public/product-search-index.json', engine.to_json());
writeFileSync('public/products-metadata.json', JSON.stringify(
  products.map(p => ({ id: p.id, name: p.name, price: p.price, category: p.category }))
));
console.log('Product search index ready.');

// package.json: add "build:search": "node build-product-index.mjs" to scripts
// Run before build: npm run build:search && npm run build

Performance

50K products at 384 dimensions: ~85MB memory, ~1ms per query. Measured on M2 MacBook Pro, Chrome 124. Mobile is typically 2–4× slower — test on target devices before deploying.

Index size	Dimensions	Query p50	Memory
1,000 vectors	384	~0.1ms	~2MB
10,000 vectors	384	~0.4ms	~17MB
50,000 vectors	384	~0.9ms	~85MB
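
As a rough sanity check on the memory column, the raw float32 storage for the vectors can be computed directly. The sketch below uses a hypothetical helper, rawVectorMegabytes, that is not part of altor-vec. The gap between these raw figures and the table is presumably graph links and runtime overhead, which is an assumption rather than a measured breakdown.

```javascript
// Raw storage for `count` float32 vectors of `dim` dimensions,
// in megabytes (1 MB = 1e6 bytes). Hypothetical helper, not an altor-vec API.
function rawVectorMegabytes(count, dim) {
  const BYTES_PER_FLOAT32 = 4;
  return (count * dim * BYTES_PER_FLOAT32) / 1e6;
}

for (const count of [1_000, 10_000, 50_000]) {
  const mb = rawVectorMegabytes(count, 384).toFixed(1);
  console.log(`${count} vectors x 384 dims: ${mb} MB of raw float32 data`);
}
```

At 384 dimensions this gives about 1.5 MB, 15.4 MB, and 76.8 MB of raw vector data, each a little below the corresponding figure in the table.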

When this approach works best

Static e-commerce catalogs (Shopify/Stripe products exported as JSON)
Apps where search must work offline or with zero API budget
Product discovery UIs where semantic matching improves conversion

Limitations

No built-in faceted filtering (category, price range) — implement with post-filter on results
Index updates require a rebuild step — not suitable for catalogs that change hourly

Frequently asked questions

How do I filter by category or price after a semantic search?

Run engine.search(queryEmbedding, 50) to get 50 candidates, then filter the results array by category, price range, or in-stock status in JavaScript before showing the top N to the user. This is called post-retrieval filtering.
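
A minimal sketch of that post-retrieval filter follows. It assumes the search results have already been parsed into an array of [vectorIndex, distance] pairs ordered nearest first, and that metadata[i] describes the product whose vector was stored at position i (as in products-metadata.json from the build script). The result shape, the postFilter helper, and the sample data are assumptions for illustration, so check them against what engine.search actually returns in your version.

```javascript
// Hypothetical helper: keep only candidates that pass the filters, preserving rank order.
// Assumed candidate shape: [[vectorIndex, distance], ...], nearest first.
// Assumes product.price is a number.
function postFilter(candidates, metadata, { category, maxPrice, limit = 10 } = {}) {
  const hits = [];
  for (const [index, distance] of candidates) {
    const product = metadata[index];
    if (!product) continue; // no metadata entry for this vector index
    if (category !== undefined && product.category !== category) continue;
    if (maxPrice !== undefined && product.price > maxPrice) continue;
    hits.push({ ...product, distance });
    if (hits.length === limit) break; // stop once we have the top N
  }
  return hits;
}

// Made-up sample data, for illustration only.
const metadata = [
  { id: 'p1', name: 'Trail Runner', price: 89, category: 'shoes' },
  { id: 'p2', name: 'Rain Jacket', price: 120, category: 'outerwear' },
  { id: 'p3', name: 'Court Sneaker', price: 140, category: 'shoes' },
];
const candidates = [[2, 0.11], [0, 0.19], [1, 0.42]];

const top = postFilter(candidates, metadata, { category: 'shoes', maxPrice: 100 });
console.log(top.map((p) => p.name)); // [ 'Trail Runner' ]
```

Fetching more candidates than you plan to display (50 here) matters because a strict filter can discard most of them. If the filtered list often comes back short, raise the candidate count.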

Will semantic search understand synonyms like 'sneakers' vs 'trainers'?

Yes, in most cases. Embedding models encode semantic meaning, so 'sneakers', 'trainers', 'running shoes', and 'athletic footwear' typically map to nearby vectors and return similar results.
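
To make "nearby" concrete: for unit-length vectors (the build script requests normalize: true), closeness can be measured as a dot product, which then equals cosine similarity. The sketch below uses made-up 3-dimensional stand-ins, not real model output, and the helpers dot and normalize are hypothetical names. It does not claim to show how altor-vec scores results internally.

```javascript
// Dot product of two equal-length vectors.
// For unit-length vectors this equals cosine similarity.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length.
function normalize(v) {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}

// Made-up toy vectors, NOT real embeddings (real ones have 384 dimensions).
const sneakers = normalize([0.9, 0.1, 0.2]);
const trainers = normalize([0.8, 0.2, 0.3]);
const blender = normalize([0.1, 0.9, 0.1]);

// Related terms score higher than unrelated ones.
console.log(dot(sneakers, trainers) > dot(sneakers, blender)); // true
```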

How do I generate embeddings for product titles and descriptions?

Concatenate the product name and description: `${product.name}. ${product.description}`. The build script above also appends the category. Embed this combined string with all-MiniLM-L6-v2 via Transformers.js. This gives better results than embedding the title alone.
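
Real catalog exports often have missing or empty fields, and naive concatenation then embeds strings like "undefined". One possible way to build the text defensively is sketched below. productToText is a hypothetical helper, and the field names name, description, and category are taken from the build script.

```javascript
// Hypothetical helper: build the text to embed from whichever fields are present.
function productToText(product) {
  const parts = [product.name, product.description]
    .filter((s) => typeof s === 'string')
    .map((s) => s.trim().replace(/[.\s]+$/, '')) // drop trailing periods to avoid ".."
    .filter((s) => s !== '');
  if (product.category) parts.push(`Category: ${product.category}`);
  return parts.length ? parts.join('. ') + '.' : '';
}

console.log(productToText({
  name: 'Trail Runner',
  description: 'Lightweight shoe for rocky paths.',
  category: 'shoes',
}));
// prints "Trail Runner. Lightweight shoe for rocky paths. Category: shoes."
```

An empty return value signals a product with nothing to embed, which the indexing loop can skip or log.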
