benchmark comparison

altor-vec vs ChromaDB

Q: How does browser WASM performance compare to native C++ for vector search?

Browser WASM is typically 3–5× slower than native C++ for HNSW search. However, it eliminates the network round-trip entirely, which usually makes it faster end to end than a cloud API call for read-only workloads on static corpora.

Application database for embeddings versus static browser retrieval.

ChromaDB is often chosen as an application-side store for embeddings and metadata, while altor-vec is closer to a shipped frontend primitive. Comparing them usefully means being honest about that mismatch. ChromaDB is a running service; altor-vec is a JavaScript dependency that ships with your app.

These numbers are representative, not universal. Bundle size, query latency, and memory usage all vary with vector dimensions, index parameters, browser runtime, hardware, and whether embeddings are generated on device or ahead of time.

Comparison table

Category	altor-vec	ChromaDB
Runtime model	Local browser ANN runtime.	Server-side application database or local dev database workflow.
Bundle size / delivery	~54KB gzipped plus index asset.	No browser ANN payload, but the app depends on a running service or server environment.
Query latency	Sub-millisecond local lookup after load. ~0.4ms p50 at 10K/384d.	Server or local process latency plus request overhead depending on deployment model.
Memory usage	Client memory consumed by shipped vectors (~17MB for 10K/384d).	Memory and storage handled in the application backend or local environment.
Features	ANN retrieval and serialization.	Collections, metadata, persistence, and app-database style workflows.
Dataset sweet spot	Best for static corpora that belong in the shipped frontend.	Better for mutable app datasets and server-managed retrieval.
Typical use case	Docs search, product catalog, help center, offline app.	Backend RAG, LLM memory, prototype vector storage, local dev.

Where altor-vec wins

No search server to run or pay for.
Excellent fit for embedded widgets, docs, and offline experiences.
Queries can stay fully local — no egress, no API calls.
Works with no Python runtime, no Docker, and no server.
Ships as a single npm package with zero infrastructure dependencies.

Where ChromaDB wins

Much better fit for mutable datasets and application storage patterns.
More natural for backend RAG stacks where vectors change frequently.
Easier to centralize data instead of shipping it.
Built-in persistence, collections, and metadata filtering.
Python-native API that integrates naturally with LLM frameworks like LangChain and LlamaIndex.

Honest decision guide

ChromaDB is stronger as an application data layer. altor-vec is stronger as a browser-delivered retrieval primitive.

The honest pattern across all of these benchmark pages is simple: if the search corpus should stay on the server, choose server-oriented infrastructure. If the search corpus is intentionally shipped with the product and the UX benefit of local retrieval matters more than backend scale, altor-vec is usually the more natural fit.

The clearest signal: if you are building a Python backend with an LLM, ChromaDB fits. If you are building a JavaScript frontend with a static corpus, altor-vec fits. The overlap is narrow — the tools simply solve different problems at different layers of the stack.

Benchmark methodology

These benchmarks measure query latency for approximate nearest-neighbor search in a controlled browser environment. All altor-vec measurements were run in Chrome 124 on an M2 MacBook Pro, using a pre-built HNSW index loaded from JSON. No embedding generation time is included; we measure pure retrieval latency.

Test configuration

Parameter	Value
Index size	10,000 vectors
Vector dimensions	384 (all-MiniLM-L6-v2 output)
HNSW M	16
ef_construction	200
ef_search	50
k (neighbors returned)	5
Browser	Chrome 124, M2 MacBook Pro
Measurement	p50, p95, and p99 of 1,000 consecutive queries

altor-vec latency results

Metric	Result
p50 query latency	0.4ms
p95 query latency	0.8ms
p99 query latency	1.2ms
Index load time (10K vectors)	~35ms (JSON parse + WASM init)
Index memory footprint	~17MB (10K × 384d)
WASM bundle size	54KB gzipped

What these numbers mean for your app

A p50 of 0.4ms and a p95 of 0.8ms mean that, for a typical 10,000-document index, search is effectively instant from the user's perspective. Interfaces generally feel instantaneous when they respond within about 100ms, so at sub-millisecond latency the bottleneck is rendering results, not computing them.

For comparison, network-dependent search (any cloud API) adds a baseline of 20–150ms for the round-trip, before the server executes its own query. At 100ms total, a cloud search query takes 125× longer than an altor-vec local query at p95. Whether that matters depends entirely on your product — for autocomplete-as-you-type, the difference is significant; for triggered search (user presses Enter), it is less critical.
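The ratio quoted above is plain division. As a minimal sketch of that arithmetic, using only the figures from this page (the helper name `speedup` is illustrative and not part of altor-vec):

```javascript
// Illustrative helper (not part of altor-vec): how many times longer a
// remote query takes than a local one, given both latencies in milliseconds.
function speedup(remoteMs, localMs) {
  if (!(localMs > 0)) throw new RangeError('localMs must be positive');
  return remoteMs / localMs;
}

const localP95Ms = 0.8; // altor-vec p95 from the results table

// The 20ms and 150ms round-trip bounds, plus the 100ms total used in the text.
for (const remoteMs of [20, 100, 150]) {
  const ratio = speedup(remoteMs, localP95Ms).toFixed(1);
  console.log(`${remoteMs}ms remote vs ${localP95Ms}ms local: ${ratio}x`);
}
```

Swapping in your own measured round-trip and local p95 gives a quick sense of whether the gap matters for your interaction pattern.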

The 17MB memory footprint for 10K vectors at 384 dimensions fits comfortably in modern browser memory budgets. Most consumer devices have 4–8GB of RAM in total, and a single browser tab can claim only part of it. For most documentation and product catalog use cases, the practical limit is higher than 10K vectors. For 100K vectors at 384 dimensions, expect approximately 170MB, which is viable on desktop but worth testing on mobile.
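The footprint figures follow from simple arithmetic: each vector stores one 32-bit float per dimension, and the HNSW graph adds neighbor links on top. The sketch below assumes 4-byte floats, 4-byte neighbor ids, and the common HNSW convention of up to 2×M links per node on the base layer. altor-vec's actual in-memory layout is not described on this page, so treat the result as an estimate; the function name is hypothetical.

```javascript
// Rough index-size estimate. Assumptions (not confirmed for altor-vec):
// float32 vectors, uint32 neighbor ids, 2*M links per node on layer 0,
// and upper HNSW layers ignored (they hold only a small fraction of nodes).
function estimateIndexBytes(numVectors, dim, m) {
  const vectorBytes = numVectors * dim * 4;   // raw float32 embeddings
  const linkBytes = numVectors * (2 * m) * 4; // base-layer neighbor lists
  return { vectorBytes, linkBytes, totalBytes: vectorBytes + linkBytes };
}

const toMB = (bytes) => (bytes / 1e6).toFixed(1) + 'MB';

for (const n of [10000, 100000]) {
  const { vectorBytes, linkBytes, totalBytes } = estimateIndexBytes(n, 384, 16);
  console.log(`${n} vectors: ${toMB(vectorBytes)} vectors + ${toMB(linkBytes)} links = ${toMB(totalBytes)}`);
}
```

Under these assumptions the estimate comes to about 16.6MB for 10K vectors and about 166MB for 100K, in line with the approximate figures quoted above. The raw vectors dominate, so reducing dimensions shrinks the index far more than lowering M does.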

Running your own benchmark

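import init, { WasmSearchEngine } from 'altor-vec';
await init(); // load and instantiate the WASM module

const N = 10000; // number of vectors in the index
const DIM = 384; // dimensions per vector

// Build index. The random fill is a placeholder: replace it with your
// real embeddings, laid out as one flat array of N * DIM floats.
const vectors = new Float32Array(N * DIM).map(() => Math.random() - 0.5);
// The trailing values match the test configuration above
// (M = 16, ef_construction = 200, ef_search = 50).
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);

// Benchmark query latency
const query = new Float32Array(DIM).map(() => Math.random() - 0.5); // your query embedding
const iterations = 1000;
const times = [];

for (let i = 0; i < iterations; i++) {
  const start = performance.now();
  engine.search(query, 5); // k = 5 nearest neighbors
  times.push(performance.now() - start);
}

// Sort ascending, then read percentiles by index.
times.sort((a, b) => a - b);
console.log('p50:', times[Math.floor(iterations * 0.5)].toFixed(2) + 'ms');
console.log('p95:', times[Math.floor(iterations * 0.95)].toFixed(2) + 'ms');
console.log('p99:', times[Math.floor(iterations * 0.99)].toFixed(2) + 'ms');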

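Run this from a module script in your page (static import statements and top-level await need a module context, so pasting it into the DevTools console as-is will not work) against your own index to get accurate numbers for your specific hardware, vector dimensions, and index size.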
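One caveat with per-query timing: browsers deliberately coarsen performance.now(), in many configurations to roughly 0.1ms or coarser, which is a large fraction of a 0.4ms query. A hedged alternative is to time small batches and divide by the batch size, then take percentiles of the per-call averages. The helpers below are illustrative and not part of altor-vec. They accept any function, so `() => engine.search(query, 5)` can be passed in place of the stand-in workload.

```javascript
// Nearest-rank percentile of an ascending-sorted array (p in (0, 100]).
function percentile(sorted, p) {
  if (sorted.length === 0) throw new RangeError('no samples');
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

// Time `fn` in batches so each measurement spans many timer ticks.
// Returns per-call averages in milliseconds, sorted ascending.
function timeBatched(fn, { batches = 100, perBatch = 50, warmup = 100 } = {}) {
  for (let i = 0; i < warmup; i++) fn(); // let the JIT and caches settle
  const averages = [];
  for (let b = 0; b < batches; b++) {
    const start = performance.now();
    for (let i = 0; i < perBatch; i++) fn();
    averages.push((performance.now() - start) / perBatch);
  }
  return averages.sort((a, b) => a - b);
}

// Stand-in workload so the sketch runs anywhere; replace with
// () => engine.search(query, 5) in a real benchmark.
const data = Float32Array.from({ length: 384 }, (_, i) => i / 384);
const workload = () => data.reduce((sum, x) => sum + x * x, 0);

const samples = timeBatched(workload);
for (const p of [50, 95, 99]) {
  console.log(`p${p}: ${percentile(samples, p).toFixed(4)}ms per call`);
}
```

Batch averages smooth out single-query outliers, so tail percentiles from this approach understate the true p99. It works best as a cross-check on the median alongside per-query timing.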

FAQ

Can altor-vec replace ChromaDB in a backend RAG system?

Usually no. ChromaDB is better suited to server-managed mutable datasets and backend workflows. If your RAG system runs in Python and updates embeddings frequently, ChromaDB is the natural choice. altor-vec is designed for static or infrequently updated corpora that ship with JavaScript apps.

When is altor-vec clearly better?

When the corpus is safe to ship and the retrieval experience belongs directly in the frontend. Documentation sites, product help centers, and offline-first apps are ideal. The corpus is typically built once at deploy time and shipped as a JSON index file alongside your app bundle.

Why compare them at all?

Because teams sometimes conflate 'vector search tool' categories even though the operational models are very different. ChromaDB is a database; altor-vec is a retrieval primitive. Both do vector search, but one requires a server and the other runs in a browser tab.

How was this benchmark run?

All altor-vec measurements were run in Chrome 124 on an M2 MacBook Pro using a pre-built HNSW index with 10,000 vectors at 384 dimensions. Latency is reported as the p50, p95, and p99 of 1,000 consecutive queries timed with performance.now(). No embedding generation time is included.

Does altor-vec performance degrade with more vectors?

Yes, but gradually. HNSW search cost grows roughly logarithmically with index size, so latency rises much more slowly than the vector count. A 100K-vector index at 384 dimensions delivers approximately 1.2ms p50 versus 0.4ms for 10K vectors: ten times the data for about three times the latency.

How does browser WASM performance compare to native C++ for vector search?

Browser WASM is typically 3–5× slower than native C++ for HNSW search. However, it eliminates the network round-trip entirely. Even if a WASM query took 2ms, five times the 0.4ms p50 measured here, it would still beat 20–150ms of cloud API latency by 10–75× for read-only workloads on static corpora.

Get started: npm install altor-vec · GitHub