benchmark comparison
altor-vec vs FAISS WASM
Research-grade ANN lineage versus a small production-friendly browser package.
FAISS carries a strong reputation because of its research and systems lineage — it was developed at Meta AI and underpins many large-scale retrieval systems. The tradeoff is that reputation does not automatically translate into the smallest or easiest frontend package for browser delivery. FAISS WASM ports the library to WebAssembly, but the full FAISS surface area comes with it.
Comparison table
| Category | altor-vec | FAISS WASM |
|---|---|---|
| Runtime model | Browser-side WASM HNSW for shipped app experiences. | WebAssembly adaptation of a famously capable ANN toolkit. |
| Bundle size / delivery | ~54KB gzipped representative payload. | Often significantly larger — FAISS includes many algorithms and the compiled WASM binary reflects that breadth. |
| Query latency | ~0.4ms p50 at 10K/384d. Fast local retrieval with modest startup cost. | Potentially strong raw performance, but browser packaging and startup overhead can matter more in product UX. |
| Memory usage | Designed for browser-scale corpora — ~17MB for 10K/384d. | Memory behavior depends on the chosen FAISS index strategy and compiled target. |
| Features | Focused ANN search and serialization. | Much broader family of ANN approaches: IVF, PQ, IVFPQ, Flat, and more. |
| Dataset sweet spot | Browser-bundled corpora where operational simplicity matters. | Teams that need FAISS-style breadth or experimentation with multiple index types. |
Where altor-vec wins
- Much easier payload story for web delivery — 54KB gzipped versus a substantially larger FAISS WASM binary.
- Narrower API reduces integration overhead and learning curve.
- Better match when product teams want 'small and sufficient' rather than maximum breadth.
- Faster cold start because the WASM module is smaller to download and initialize.
- Purpose-built for the browser delivery model, not a port of a native library.
Where FAISS WASM wins
- Broader algorithmic toolbox: IVF, PQ, IVFPQ, and other index families not in altor-vec.
- Potentially better fit for advanced experimentation and non-HNSW approaches.
- Stronger appeal to teams already familiar with FAISS workflows from Python or C++.
- More tuning options for specialists who need fine-grained control over index construction.
Honest decision guide
FAISS wins on breadth and lineage. altor-vec wins on browser pragmatism and small-package delivery.
The honest pattern across all of these benchmark pages is simple: if the search corpus should stay on the server, choose server-oriented infrastructure. If the search corpus is intentionally shipped with the product and the UX benefit of local retrieval matters more than backend scale, altor-vec is usually the more natural fit.
For most product teams, the decision comes down to this: do you need multiple index types and deep ANN tuning in the browser? If not, altor-vec's focused HNSW implementation is smaller, simpler, and sufficient. If you are building a research tool or need IVF-style quantization for memory efficiency on very large in-browser corpora, FAISS WASM may be worth the larger bundle cost.
Benchmark methodology
These benchmarks measure query latency for approximate nearest-neighbor search in a controlled browser environment. All altor-vec measurements were run in Chrome 124 on an M2 MacBook Pro, using a pre-built HNSW index loaded from JSON. Embedding generation time is excluded; only retrieval latency is measured.
Test configuration
| Parameter | Value |
|---|---|
| Index size | 10,000 vectors |
| Vector dimensions | 384 (all-MiniLM-L6-v2 output) |
| HNSW M | 16 |
| ef_construction | 200 |
| ef_search | 50 |
| k (neighbors returned) | 5 |
| Browser | Chrome 124, M2 MacBook Pro |
| Measurement | p50, p95, and p99 of 1,000 consecutive queries |
altor-vec latency results
| Metric | Result |
|---|---|
| p50 query latency | 0.4ms |
| p95 query latency | 0.8ms |
| p99 query latency | 1.2ms |
| Index load time (10K vectors) | ~35ms (JSON parse + WASM init) |
| Index memory footprint | ~17MB (10K × 384d) |
| WASM bundle size | 54KB gzipped |
What these numbers mean for your app
A p50 of 0.4ms and a p95 of 0.8ms mean that, for a typical 10,000-document index, search is effectively instant from the user's perspective. Responses that arrive within roughly 100ms are generally perceived as instantaneous, so at sub-millisecond latency the bottleneck is rendering results, not computing them.
For comparison, network-dependent search (any cloud API) adds a baseline of 20–150ms for the round-trip, before the server executes its own query. At 100ms total, a cloud search query takes 125× longer than an altor-vec local query at p95. Whether that matters depends entirely on your product — for autocomplete-as-you-type, the difference is significant; for triggered search (user presses Enter), it is less critical.
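The ratio quoted above is simple division, and it can be recomputed for other round-trip times. The sketch below uses only the figures from this section (0.8ms local p95 and 20, 100, and 150ms remote latency); `latencyRatio` is a hypothetical helper name, not part of altor-vec.

```javascript
// Figures quoted in this section, in milliseconds.
const LOCAL_P95_MS = 0.8;
const ROUND_TRIPS_MS = [20, 100, 150];

// How many times longer a remote query takes than a local one.
// Hypothetical helper for illustration only.
function latencyRatio(remoteMs, localMs) {
  if (localMs <= 0) throw new RangeError('localMs must be positive');
  return remoteMs / localMs;
}

for (const remoteMs of ROUND_TRIPS_MS) {
  const ratio = latencyRatio(remoteMs, LOCAL_P95_MS);
  console.log(`${remoteMs}ms remote vs ${LOCAL_P95_MS}ms local: ~${ratio.toFixed(1)}x`);
}
```

Substituting your own measured p95 and your API's observed round-trip time gives a more relevant ratio than the representative numbers used here.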
The 17MB memory footprint for 10K vectors at 384 dimensions fits comfortably in modern browser memory budgets: most consumer devices ship with 4–8GB of total RAM, of which a single tab can use only a fraction. For most documentation and product catalog use cases, the practical ceiling is well above 10K vectors. For 100K vectors at 384 dimensions, expect approximately 170MB, which is viable on desktop but worth testing on mobile.
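As a rough sanity check on those footprints, the estimate below adds raw Float32 vector storage to a layer-0 neighbor list. It is a back-of-the-envelope sketch, not altor-vec's actual memory layout: the assumption that each node stores up to 2 × M neighbor ids as 32-bit integers comes from the standard HNSW design, and `estimateIndexBytes` is a hypothetical helper.

```javascript
// Rough HNSW memory estimate. ASSUMPTIONS (not taken from altor-vec internals):
//  - vectors are stored as Float32 (4 bytes per component)
//  - each node keeps up to 2 * M neighbor ids on layer 0, 4 bytes each
//  - upper layers and bookkeeping are small enough to ignore
function estimateIndexBytes(numVectors, dim, m = 16) {
  const vectorBytes = numVectors * dim * 4;
  const graphBytes = numVectors * 2 * m * 4;
  return vectorBytes + graphBytes;
}

// Decimal megabytes, rounded to one decimal place for display.
const toMB = (bytes) => (bytes / 1e6).toFixed(1);

console.log('10K x 384d:', toMB(estimateIndexBytes(10000, 384)), 'MB');
console.log('100K x 384d:', toMB(estimateIndexBytes(100000, 384)), 'MB');
```

Under these assumptions the estimate comes to about 16.6MB for 10K vectors and about 166MB for 100K, which is in line with the ~17MB and ~170MB figures above. The vectors themselves dominate, so the footprint scales almost linearly with both corpus size and dimensionality.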
Running your own benchmark
```js
import init, { WasmSearchEngine } from 'altor-vec';

await init(); // load and instantiate the WASM module

// Build index
const N = 10000; // number of vectors (placeholder; match your corpus)
const DIM = 384; // vector dimensions (placeholder; match your embedding model)
const vectors = new Float32Array(N * DIM); // fill with your embeddings
// 16, 200, 50 match M, ef_construction, ef_search in the test configuration
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);

// Benchmark query latency
const query = new Float32Array(DIM); // fill with your query embedding
const iterations = 1000;
const times = [];
for (let i = 0; i < iterations; i++) {
  const start = performance.now();
  engine.search(query, 5); // k = 5 neighbors
  times.push(performance.now() - start);
}

// Sort ascending, then read each percentile by index
times.sort((a, b) => a - b);
console.log('p50:', times[Math.floor(iterations * 0.5)].toFixed(2) + 'ms');
console.log('p95:', times[Math.floor(iterations * 0.95)].toFixed(2) + 'ms');
console.log('p99:', times[Math.floor(iterations * 0.99)].toFixed(2) + 'ms');
```
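Run this as an ES module in your browser (the static `import` statement does not work when pasted directly into the DevTools console) against your own index to get accurate numbers for your specific hardware, vector dimensions, and index size.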
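The first iterations of a timing loop like the one above can be slower while the JavaScript engine and CPU caches warm up, and browsers may coarsen `performance.now()` resolution, which matters at sub-millisecond scales. One possible refinement, sketched here with hypothetical helper names (`percentile`, `summarize`), is to discard a few warm-up samples before computing percentiles:

```javascript
// Percentile by index into an ascending-sorted array, using the same
// floor(length * p) rule as the loop above, clamped to the last element.
function percentile(sortedTimes, p) {
  if (sortedTimes.length === 0) throw new RangeError('no samples');
  const idx = Math.min(sortedTimes.length - 1, Math.floor(sortedTimes.length * p));
  return sortedTimes[idx];
}

// Drop the first `warmup` samples, then report p50/p95/p99 of the rest.
function summarize(times, warmup = 0) {
  const kept = times.slice(warmup).sort((a, b) => a - b);
  return {
    samples: kept.length,
    p50: percentile(kept, 0.5),
    p95: percentile(kept, 0.95),
    p99: percentile(kept, 0.99),
  };
}

// Example with synthetic timings: one slow warm-up query, then ten steady ones.
const sample = [100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
console.log(summarize(sample, 1));
```

With the `times` array from the benchmark, `summarize(times, 50)` would ignore the first 50 queries; the right warm-up count is a judgment call and worth varying to see how sensitive your numbers are.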
FAQ
Is FAISS WASM always better because FAISS is famous?
Not for frontend delivery. Browser products care about payload and integration overhead just as much as raw ANN credibility. A 54KB bundle that loads in ~35ms is more useful than a larger bundle with more algorithms you do not need.
When should I still pick FAISS WASM?
When you specifically need its broader algorithm family — IVF, PQ, IVFPQ — or are already invested in FAISS-style workflows from Python development. Also if you need quantization for memory efficiency on very large in-browser datasets.
What is altor-vec optimizing for instead?
A focused, lightweight browser experience with minimal bundle cost. altor-vec implements HNSW only, the same graph-based algorithm used by popular vector databases such as Weaviate and Qdrant, and optimizes for the smallest possible package with the fastest cold start in a browser context.
How was this benchmark run?
All altor-vec measurements were run in Chrome 124 on an M2 MacBook Pro using a pre-built HNSW index with 10,000 vectors at 384 dimensions. Latency is reported as the p50, p95, and p99 of 1,000 consecutive queries timed with performance.now(). No embedding generation time is included.
Does altor-vec performance degrade with more vectors?
Yes, but gradually. HNSW query cost grows roughly logarithmically with index size, so latency rises far more slowly than the corpus does. A 100K-vector index at 384 dimensions delivers approximately 1.2ms p50 versus 0.4ms for 10K vectors: about 3× the latency for 10× the data, which works out to roughly 40% per doubling rather than 2×.
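To get a rough feel for other index sizes, the two p50 figures quoted above can be fitted to a simple logarithmic model. This is a sketch under the assumption that latency grows linearly in log2 of the index size; real indexes deviate from this because of cache effects and parameter choices, so the outputs are extrapolations, not measurements. `fitLogLatency` is a hypothetical helper.

```javascript
// Fit t(n) = t1 + slope * log2(n / n1) through two measured points.
// ASSUMPTION: purely logarithmic scaling between and beyond the two points.
function fitLogLatency(n1, t1, n2, t2) {
  if (n1 <= 0 || n2 <= 0 || n1 === n2) {
    throw new RangeError('need two distinct positive index sizes');
  }
  const slope = (t2 - t1) / Math.log2(n2 / n1); // added ms per doubling
  return (n) => t1 + slope * Math.log2(n / n1);
}

// The two p50 points quoted above: 10K -> 0.4ms, 100K -> 1.2ms.
const estimateP50 = fitLogLatency(10000, 0.4, 100000, 1.2);

for (const n of [10000, 50000, 100000, 1000000]) {
  console.log(`${n} vectors: ~${estimateP50(n).toFixed(2)}ms p50 (extrapolated)`);
}
```

Treat anything outside the 10K to 100K range as a guess until you have benchmarked your own index at that size.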
How does browser WASM performance compare to native C++ for vector search?
Browser WASM is typically 3–5× slower than native C++ for HNSW search. However, it eliminates the network round-trip entirely. Even in a pessimistic case where a WASM query takes 2ms, that is still 10–75× faster than the 20–150ms of cloud API latency for read-only workloads on static corpora.
Get started: npm install altor-vec · GitHub