Getting Started with altor-vec
altor-vec is a JavaScript library for semantic vector search that runs entirely in the browser using a 54KB WebAssembly module. No server, no API keys, works offline. Install in 30 seconds, ship search in 5 minutes.
npm install altor-vec
When to use altor-vec
- Documentation and blog search — static sites where you want semantic search without a backend
- Privacy-sensitive apps — search that must run locally with no data sent to a server
- Offline-first applications — PWAs that need search to work without a network connection
- Eliminating per-query API costs — your dataset fits in browser memory and you want $0 search forever
- Browser-side RAG — retrieval for local AI/LLM pipelines that run in the browser
Not the right fit: billion-scale indexes, multi-tenant shared data, real-time concurrent writes, server-side RAG with private data. Use Pinecone or Weaviate for those.
5-minute quickstart
Install
npm install altor-vec
# or
yarn add altor-vec
Initialize the WASM module
Call init() once before any other API call. It loads the WASM binary asynchronously.
import init, { WasmSearchEngine } from 'altor-vec';
await init();
Generate embeddings for your content
You need one embedding per document, concatenated into a single flat Float32Array. The easiest browser-compatible option is Transformers.js:
import { pipeline } from '@xenova/transformers';
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const docs = [
{ id: 0, title: 'Introduction to HNSW', content: 'HNSW is a graph-based...' },
{ id: 1, title: 'Vector search basics', content: 'Vector search finds...' },
// ... more docs
];
const DIM = 384; // all-MiniLM-L6-v2 output dimension
const vectors = new Float32Array(docs.length * DIM);
for (const [i, doc] of docs.entries()) {
const output = await embedder(doc.content, { pooling: 'mean', normalize: true });
vectors.set(output.data, i * DIM);
}
Tip: For production, generate document embeddings at build time and ship the index as a static JSON file. This removes the document-embedding cost from page load; each query still has to be embedded in the browser.
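The embedding loop above writes each vector into one flat buffer at offset i * DIM. A minimal sketch of that packing step as a reusable helper follows, with a length check that surfaces a model/dimension mismatch early. The name packEmbeddings is hypothetical and is not part of altor-vec or Transformers.js.

```javascript
// Hypothetical helper (not part of altor-vec): packs per-document embeddings
// into one flat Float32Array laid out as [doc0..., doc1..., ...].
function packEmbeddings(embeddings, dim) {
  const flat = new Float32Array(embeddings.length * dim);
  embeddings.forEach((embedding, i) => {
    // Reject vectors whose length does not match the declared dimension
    if (embedding.length !== dim) {
      throw new RangeError(
        `Embedding ${i} has length ${embedding.length}, expected ${dim}`
      );
    }
    flat.set(embedding, i * dim);
  });
  return flat;
}

// Usage with toy 3-dimensional vectors
const packed = packEmbeddings([[1, 0, 0], [0, 1, 0]], 3);
console.log(packed.length); // 6
```

With real data, each entry would be the output.data array returned by the embedder, and dim would be 384 for all-MiniLM-L6-v2.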
Build the search index
const engine = WasmSearchEngine.from_vectors(
vectors, // Float32Array of all embeddings concatenated
DIM, // 384 for all-MiniLM-L6-v2
16, // M: HNSW graph connectivity (16 recommended)
200, // ef_construction: build quality (200 recommended)
50 // ef_search: query quality (50 recommended)
);
Search
async function search(query, k = 5) {
// Embed the query
const output = await embedder(query, { pooling: 'mean', normalize: true });
const queryEmbedding = new Float32Array(output.data);
// Search — returns [{id, score}] sorted by similarity (descending)
const hits = JSON.parse(engine.search(queryEmbedding, k));
// Map back to your documents (hit ids are positions in the docs array)
return hits.map(h => ({ ...docs[h.id], score: h.score }));
}
const results = await search('how does vector similarity work?');
console.log(results); // Top 5 semantically relevant documents
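When tuning M, ef_construction, and ef_search, it can help to compare the approximate results against an exact search on a small sample. The sketch below is a plain brute-force scan, not altor-vec's HNSW implementation. It assumes the vectors are L2-normalized (as produced with normalize: true), so the dot product equals cosine similarity. The name exactSearch is hypothetical, and its scores may not be numerically identical to the engine's.

```javascript
// Hypothetical exact baseline: scores every vector against the query.
// vectors is a flat Float32Array of count * dim values.
function exactSearch(vectors, dim, query, k = 5) {
  const count = vectors.length / dim;
  const hits = [];
  for (let id = 0; id < count; id++) {
    let score = 0;
    const offset = id * dim;
    for (let j = 0; j < dim; j++) {
      score += vectors[offset + j] * query[j];
    }
    hits.push({ id, score });
  }
  // Highest similarity first, then keep the top k
  hits.sort((a, b) => b.score - a.score);
  return hits.slice(0, k);
}

// Toy example: three unit vectors in 2 dimensions
const toyVectors = new Float32Array([1, 0, 0, 1, -1, 0]);
const toyHits = exactSearch(toyVectors, 2, new Float32Array([1, 0]), 2);
console.log(toyHits.map((h) => h.id)); // [ 0, 1 ]
```

If the ids returned by the engine for a handful of queries mostly match this baseline, the chosen parameters are probably adequate for the dataset. A brute-force scan costs time proportional to the number of vectors, so it is only practical as a check on small samples.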
Persist the index (optional but recommended)
Avoid rebuilding the index on every page load by serializing it to JSON:
// Save index
const indexJson = engine.to_json();
localStorage.setItem('search-index', indexJson);
// Or: store in IndexedDB for larger indexes
// Or: ship as a pre-built static asset
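// Restore index on next page load (call after init())
const saved = localStorage.getItem('search-index');
// Declare outside the if-block so the engine stays in scope afterwards
let restoredEngine = null;
if (saved) {
  restoredEngine = WasmSearchEngine.from_json(saved);
}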
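localStorage is typically limited to a few megabytes per origin, and setItem throws when the quota is exceeded, so saving a large serialized index can fail. A minimal sketch of a guarded save follows. trySaveIndex is a hypothetical helper, not an altor-vec API; it accepts any object with a setItem method so it can be exercised outside the browser.

```javascript
// Hypothetical helper: returns true if the index was stored, false otherwise.
function trySaveIndex(storage, key, indexJson) {
  try {
    storage.setItem(key, indexJson);
    return true;
  } catch (err) {
    // Browsers throw a DOMException (commonly named QuotaExceededError) when full
    console.warn(`Could not persist search index: ${err.name}`);
    return false;
  }
}

// Usage in the browser (engine as built earlier):
//   const ok = trySaveIndex(localStorage, 'search-index', engine.to_json());
//   if (!ok) { /* fall back to IndexedDB, or rebuild on the next load */ }

// Stand-in storage that rejects values over 10 characters, for illustration
const tinyStorage = {
  data: new Map(),
  setItem(k, v) {
    if (v.length > 10) {
      const err = new Error('quota exceeded');
      err.name = 'QuotaExceededError';
      throw err;
    }
    this.data.set(k, v);
  },
};
console.log(trySaveIndex(tinyStorage, 'search-index', '{}')); // true
console.log(trySaveIndex(tinyStorage, 'search-index', 'x'.repeat(11))); // false
```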
Production pattern: pre-built index
For documentation sites and static apps, generate the index at build time and serve it as a static file. This moves all document embedding to the build step, so the browser only has to embed the user's query.
// build-search-index.js (run at build time with Node.js)
import { pipeline } from '@xenova/transformers';
import init, { WasmSearchEngine } from 'altor-vec';
import { writeFileSync, readFileSync } from 'fs';
const docs = JSON.parse(readFileSync('content/docs.json', 'utf8'));
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
await init();
const DIM = 384;
const vectors = new Float32Array(docs.length * DIM);
for (const [i, doc] of docs.entries()) {
const out = await embedder(doc.content, { pooling: 'mean', normalize: true });
vectors.set(out.data, i * DIM);
}
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);
writeFileSync('public/search-index.json', engine.to_json());
console.log(`Built index for ${docs.length} documents`);
// In the browser — load pre-built index
import init, { WasmSearchEngine } from 'altor-vec';
await init();
const resp = await fetch('/search-index.json');
if (!resp.ok) throw new Error(`Failed to load search index: HTTP ${resp.status}`);
const engine = WasmSearchEngine.from_json(await resp.text());
// Ready to search — no embedding needed at load time
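On pages where search is optional, the index download can be deferred until the first query. The sketch below caches the in-flight promise so that concurrent callers share a single load. lazy is a hypothetical helper, and the commented usage assumes the init, fetch, and from_json calls shown above.

```javascript
// Hypothetical helper: wraps an async loader so it runs at most once
// until it succeeds; a failed load clears the cache to allow a retry.
function lazy(load) {
  let pending = null;
  return () => {
    if (pending === null) {
      pending = Promise.resolve()
        .then(load)
        .catch((err) => {
          pending = null;
          throw err;
        });
    }
    return pending;
  };
}

// Usage in the browser:
//   const getEngine = lazy(async () => {
//     await init();
//     const resp = await fetch('/search-index.json');
//     return WasmSearchEngine.from_json(await resp.text());
//   });
//   const engine = await getEngine();

// Illustration with a stand-in loader that counts its calls
let loadCount = 0;
const getValue = lazy(async () => {
  loadCount += 1;
  return 'engine';
});
Promise.all([getValue(), getValue()]).then(([a, b]) => {
  console.log(a === b, loadCount); // true 1
});
```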
Performance envelope
| Index size | Dimensions | Query p50 | Memory | Index build |
|---|---|---|---|---|
| 1,000 vectors | 384 | ~0.1ms | ~2MB | ~10ms |
| 10,000 vectors | 384 | ~0.4ms | ~17MB | ~200ms |
| 50,000 vectors | 384 | ~0.9ms | ~85MB | ~2s |
| 100,000 vectors | 384 | ~1.2ms | ~170MB | ~5s |
Measured on an M2 MacBook Pro with Chrome 124. Mobile performance is typically 2–4× slower. Index build is a one-time cost at startup (or at build time if you ship a pre-built index); query latency is what users experience on every search.
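The memory column can be sanity-checked with simple arithmetic: raw float32 vectors occupy count × dimensions × 4 bytes. The figures in the table sit somewhat above that lower bound, which would be consistent with additional index overhead. The helper below is an illustrative estimate only (the name is hypothetical), not a measurement of altor-vec's actual allocations.

```javascript
// Hypothetical estimate: size of the raw vectors alone, in megabytes (10^6 bytes)
function rawVectorMegabytes(count, dim) {
  const bytes = count * dim * Float32Array.BYTES_PER_ELEMENT; // 4 bytes per float32
  return bytes / 1e6;
}

for (const count of [1000, 10000, 50000, 100000]) {
  console.log(`${count} vectors: ${rawVectorMegabytes(count, 384).toFixed(1)} MB raw`);
}
// 1000 vectors: 1.5 MB raw
// 10000 vectors: 15.4 MB raw
// 50000 vectors: 76.8 MB raw
// 100000 vectors: 153.6 MB raw
```

The same calculation gives a quick lower bound for other embedding models: doubling the dimension doubles the raw vector size.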
Next steps
| You want to… | Go to |
|---|---|
| See the full API reference | API Reference → |
| Add search to a React app | React integration guide → |
| Add search to a Next.js app | Next.js integration guide → |
| Build browser-side RAG | Browser RAG tutorial → |
| See live examples | Document search example → |
| Compare to Pinecone or Algolia | All comparisons → |
| Migrate from Algolia | Migration guide → |