benchmark comparison

altor-vec vs Weaviate

Q: How does browser WASM performance compare to native C++ for vector search?

Browser WASM is typically 3–5× slower than native C++ for HNSW search. However, it eliminates the network round-trip entirely: measured against the 0.4ms p50 reported below, a 20–150ms cloud API round-trip is 50–375× slower for read-only browser workloads.

Feature-rich vector database versus lightweight client-side HNSW.

Weaviate brings database-style workflows, modules, and server infrastructure. altor-vec is far narrower, but that narrowness is exactly why it can fit inside a web bundle. The question is not which has more features — it is which model fits where the search experience actually lives in your product.

These numbers are representative, not universal. Bundle size, query latency, and memory usage all vary with vector dimensions, index parameters, browser runtime, hardware, and whether embeddings are generated on device or ahead of time.

Comparison table

Category	altor-vec	Weaviate
Runtime model	Browser-side WebAssembly index embedded in the product.	Server-side database with API, schema, and cluster concerns.
Bundle size / delivery	~54KB gzipped plus data asset.	No browser bundle for ANN, but substantial infrastructure footprint on the server side.
Query latency	Sub-millisecond local lookup once loaded. ~0.4ms p50 at 10K/384d.	Typically dominated by network and server response time, but appropriate for central search services.
Memory usage	Client memory budget matters because vectors live in the session.	Memory pressure moves to the server where larger corpora are more manageable.
Features	ANN retrieval and serialization only.	Filtering, hybrid search, modules, replication, auth, and broader data workflows.
Dataset sweet spot	Static or moderately sized shipped corpora up to ~100K vectors.	Large shared corpora with write-heavy or backend-centric needs.
Deployment	npm install altor-vec. Zero infrastructure. Works offline.	Docker, Kubernetes, or Weaviate Cloud. Requires server management.

Where altor-vec wins

Fits directly in a frontend without provisioning infrastructure.
Works offline and keeps queries on device.
Simple delivery model for docs, widgets, and embedded search.
Zero operational overhead — no containers, no clusters, no ops team.
Privacy-preserving: queries never leave the browser.

Where Weaviate wins

Broader features and operational tooling.
Better fit for complex filtering and multi-tenant datasets.
Stronger choice when embeddings and retrieval are part of a larger backend platform.
GraphQL and REST APIs for flexible querying patterns.
Hybrid search combining dense vectors and BM25 keyword search.

Honest decision guide

Weaviate wins on backend capability and scale. altor-vec wins when you want vector retrieval to behave like a frontend dependency, not a service to operate.

The honest pattern across all of these benchmark pages is simple: if the search corpus should stay on the server, choose server-oriented infrastructure. If the search corpus is intentionally shipped with the product and the UX benefit of local retrieval matters more than backend scale, altor-vec is usually the more natural fit.

A good heuristic: if your search corpus is the same for every user (documentation, product catalog, help center), it can be shipped to the browser. If your search corpus differs per user (personal documents, user-generated content, private data), it must stay on the server — and Weaviate is the better choice.

Benchmark methodology

These benchmarks measure query latency for approximate nearest-neighbor search in a controlled browser environment. All altor-vec measurements run in Chrome 124 on an M2 MacBook Pro, using a pre-built HNSW index loaded from JSON. No embedding generation time is included; we measure pure retrieval latency.

Test configuration

Parameter	Value
Index size	10,000 vectors
Vector dimensions	384 (all-MiniLM-L6-v2 output)
HNSW M	16
ef_construction	200
ef_search	50
k (neighbors returned)	5
Browser	Chrome 124, M2 MacBook Pro
Measurement	p50, p95, and p99 of 1,000 consecutive queries

altor-vec latency results

Metric	Result
p50 query latency	0.4ms
p95 query latency	0.8ms
p99 query latency	1.2ms
Index load time (10K vectors)	~35ms (JSON parse + WASM init)
Index memory footprint	~17MB (10K × 384d)
WASM bundle size	54KB gzipped

What these numbers mean for your app

A p50 of 0.4ms and p95 of 0.8ms means that for a typical 10,000-document index, search is effectively instant from the user's perspective. Responses that arrive within roughly 100ms are generally perceived as instantaneous, so at sub-millisecond latency the bottleneck is rendering results, not computing them.

For comparison, network-dependent search (any cloud API) adds a baseline of 20–150ms for the round-trip, before the server executes its own query. At 100ms total, a cloud search query takes 125× longer than an altor-vec local query at p95. Whether that matters depends entirely on your product — for autocomplete-as-you-type, the difference is significant; for triggered search (user presses Enter), it is less critical.
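The ratios quoted on this page are plain division of a network-bound query time by a local query time. A minimal sketch using only the figures stated above (the function name `speedup` is illustrative, not part of altor-vec):

```javascript
// Ratio of a network-bound query time to a local query time (both in ms).
function speedup(remoteMs, localMs) {
  return remoteMs / localMs;
}

// 100ms total cloud query vs. the 0.8ms local p95
console.log(speedup(100, 0.8).toFixed(0)); // 125

// 20–150ms round-trip baseline vs. the 0.4ms local p50
console.log(speedup(20, 0.4).toFixed(0), speedup(150, 0.4).toFixed(0)); // 50 375
```

Substituting your own measured round-trip and local latencies gives a more honest ratio than any published figure.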

The 17MB memory footprint for 10K vectors at 384 dimensions fits comfortably in modern browser memory budgets. Most consumer devices have 4–8GB of total RAM, of which a single browser tab can use only a portion, so the real ceiling depends on the device. For most documentation and product catalog use cases, the practical limit is typically well above 10K vectors. For 100K vectors at 384 dimensions, expect approximately 170MB, which is viable on desktop but worth testing on mobile.
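The footprint can be approximated from the vector count and dimensions. The sketch below assumes 4-byte float32 components plus a rough graph overhead of 2 × M four-byte neighbor ids per vector on the base layer. That overhead model is a common HNSW approximation, not altor-vec's documented memory layout, and `estimateIndexBytes` is a hypothetical helper:

```javascript
// Rough estimate of index memory in bytes.
// Assumption: float32 vectors plus 2*M uint32 neighbor ids per vector (base layer only).
function estimateIndexBytes(count, dim, m = 16) {
  const vectorBytes = count * dim * 4; // float32 components
  const graphBytes = count * 2 * m * 4; // assumed neighbor-list overhead
  return vectorBytes + graphBytes;
}

const MB = 1e6;
console.log((estimateIndexBytes(10000, 384) / MB).toFixed(1) + 'MB');  // 16.6MB
console.log((estimateIndexBytes(100000, 384) / MB).toFixed(1) + 'MB'); // 166.4MB
```

Both estimates land close to the ~17MB and ~170MB figures above; treat the result as a planning number and confirm it with the browser's memory tooling.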

Running your own benchmark

import init, { WasmSearchEngine } from 'altor-vec';
await init();

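// Build index (N and DIM match the test configuration above; adjust to your data)
const N = 10000; // number of vectors
const DIM = 384; // dimensions per vector
const vectors = new Float32Array(N * DIM); // fill with your embeddings
// Trailing arguments mirror the test configuration: M = 16, ef_construction = 200, ef_search = 50
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);

// Benchmark query latency
const query = new Float32Array(DIM); // fill with your query embedding
const iterations = 1000;
const times = []; // per-query latency in milliseconds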

for (let i = 0; i < iterations; i++) {
  const start = performance.now();
  engine.search(query, 5);
  times.push(performance.now() - start);
}

times.sort((a, b) => a - b);
console.log('p50:', times[Math.floor(iterations * 0.5)].toFixed(2) + 'ms');
console.log('p95:', times[Math.floor(iterations * 0.95)].toFixed(2) + 'ms');
console.log('p99:', times[Math.floor(iterations * 0.99)].toFixed(2) + 'ms');

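Run this from a module script in your own app, against your own index, to get accurate numbers for your specific hardware, vector dimensions, and index size. The static import and top-level await need a module context (typically provided by a bundler), so the snippet will not run if pasted directly into the browser console.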
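Timing the same query repeatedly can flatter the result, and the first calls include JIT warm-up. A hedged variant of the loop above separates warm-up from measurement and rotates through several queries. `benchmarkSearch` and `percentile` are hypothetical helpers; `searchFn` stands in for a call such as `(q, k) => engine.search(q, k)`, and the brute-force stand-in exists only so the sketch runs without an index:

```javascript
// Percentile lookup on an ascending-sorted array (same indexing as the loop above).
function percentile(sorted, p) {
  const idx = Math.min(sorted.length - 1, Math.floor(sorted.length * p));
  return sorted[idx];
}

function benchmarkSearch(searchFn, queries, k, { warmup = 100, iterations = 1000 } = {}) {
  // Warm-up calls are executed but not recorded.
  for (let i = 0; i < warmup; i++) searchFn(queries[i % queries.length], k);

  const times = [];
  for (let i = 0; i < iterations; i++) {
    const q = queries[i % queries.length]; // rotate through the query set
    const start = performance.now();
    searchFn(q, k);
    times.push(performance.now() - start);
  }
  times.sort((a, b) => a - b);
  return {
    p50: percentile(times, 0.5),
    p95: percentile(times, 0.95),
    p99: percentile(times, 0.99),
  };
}

// Stand-in search for demonstration: brute-force dot products over random data.
const demoDim = 8;
const demoCount = 200;
const demoData = Float32Array.from({ length: demoCount * demoDim }, () => Math.random());
const bruteForce = (q, k) => {
  const scored = [];
  for (let i = 0; i < demoCount; i++) {
    let dot = 0;
    for (let d = 0; d < demoDim; d++) dot += demoData[i * demoDim + d] * q[d];
    scored.push([i, dot]);
  }
  return scored.sort((a, b) => b[1] - a[1]).slice(0, k);
};
const demoQueries = Array.from({ length: 10 }, () =>
  Float32Array.from({ length: demoDim }, () => Math.random())
);

console.log(benchmarkSearch(bruteForce, demoQueries, 5));
```

Browsers may coarsen the resolution of `performance.now()`, so for sub-millisecond queries it can be more reliable to time a batch of calls and divide by the batch size.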

FAQ

Can I compare bundle size between altor-vec and Weaviate directly?

Only partially. Weaviate is primarily a backend system, so the real tradeoff is delivery model rather than just kilobytes. altor-vec ships 54KB gzipped to the browser; Weaviate ships nothing to the browser but requires a server deployment on Docker, Kubernetes, or Weaviate Cloud.

What is the biggest feature gap?

Metadata-rich backend workflows. altor-vec is deliberately not a database — it has no collections, no schemas, no filtering, and no write API. For products that need these, Weaviate is the correct tool. For products where search is purely a read operation on a shipped corpus, altor-vec is usually sufficient.

What is altor-vec's main advantage here?

You can ship it inside the app and get local search without standing up infrastructure. For a documentation site or SaaS product with a static help corpus, the entire search experience — index, engine, and UI — can be delivered as static files with no server dependency.

How was this benchmark run?

All altor-vec measurements run in Chrome 124 on an M2 MacBook Pro using a pre-built HNSW index with 10,000 vectors at 384 dimensions. Latency is reported as the p50, p95, and p99 of 1,000 consecutive queries timed with performance.now(). No embedding generation time is included; we measure pure retrieval latency.

Does altor-vec performance degrade with more vectors?

Yes, but gradually. HNSW search cost grows roughly logarithmically with index size, so latency rises much more slowly than the vector count. A 100K-vector index at 384 dimensions delivers approximately 1.2ms p50 versus 0.4ms for 10K vectors: 10× the vectors for about 3× the latency, or roughly 40% per doubling rather than 2×.
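To check how latency grows on your own data, you can derive the per-doubling factor from two measurements taken at different index sizes. A minimal sketch; `growthPerDoubling` is a hypothetical helper and the sample inputs are illustrative, not benchmark results:

```javascript
// Latency multiplier per doubling of index size, from two (size, latency) measurements.
function growthPerDoubling(sizeA, latencyA, sizeB, latencyB) {
  const doublings = Math.log2(sizeB / sizeA); // how many times the index doubled
  return Math.pow(latencyB / latencyA, 1 / doublings);
}

// Illustrative: latency doubles while the index grows 4x (two doublings).
console.log(growthPerDoubling(1000, 1.0, 4000, 2.0).toFixed(3)); // 1.414
```

A factor close to 1 means latency is nearly flat as the index grows; a factor of 2 would mean latency scales linearly with index size.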

How does browser WASM performance compare to native C++ for vector search?

Browser WASM is typically 3–5× slower than native C++ for HNSW search. However, it eliminates the network round-trip entirely. Even at 5× overhead, 2ms WASM beats 20–150ms cloud API latency by 10–75× for read-only workloads on static corpora.

Get started: npm install altor-vec · GitHub