benchmark comparison

altor-vec vs Milvus Lite

Embedded/server vector database tradeoffs versus fully in-browser retrieval.

Milvus Lite narrows the gap between heavyweight database infrastructure and lightweight local experimentation, but it still occupies a different place in the stack than a browser bundle you ship to every user.

These numbers are representative, not universal. Bundle size, query latency, and memory usage all vary with vector dimensions, index parameters, browser runtime, hardware, and whether embeddings are generated on device or ahead of time.

Comparison table

Category	altor-vec	Milvus Lite
Runtime model	Client-side WebAssembly ANN in the user's browser.	Embedded or lightweight database-style vector runtime outside the normal browser-delivery model.
Bundle size / delivery	~54KB gzipped plus vector asset.	Not typically framed as a tiny client bundle; deployment model is closer to an embedded or server-side service.
Query latency	Sub-millisecond local lookup in the benchmark on this page (p50 0.4ms, p95 0.8ms at 10K × 384d).	Fast in its own environment, but not aimed at the same in-browser interaction pattern.
Memory usage	Client memory scales with the shipped corpus (~17MB at 10K × 384d).	Memory managed by the embedded or server runtime instead of the browser tab.
Features	ANN retrieval and serialization only.	Database-style vector storage and operational patterns beyond a small browser primitive.
Dataset sweet spot	Safe-to-ship, moderate-size corpora.	Larger or more mutable datasets where database semantics matter.

Where altor-vec wins

True browser-native delivery.
Offline-capable local UX.
No service boundary between the UI and search.

Where Milvus Lite wins

Better fit for mutable and backend-managed data.
More database-like operational model.
Stronger when the dataset should not be bundled into the client.

Honest decision guide

Milvus Lite is closer to embedded database infrastructure. altor-vec is closer to a frontend dependency. The right choice depends on which role you actually need.

The honest pattern across all of these benchmark pages is simple: if the search corpus should stay on the server, choose server-oriented infrastructure. If the search corpus is intentionally shipped with the product and the UX benefit of local retrieval matters more than backend scale, altor-vec is usually the more natural fit.

FAQ

Why compare Milvus Lite and altor-vec?

Because both may appear 'lightweight' compared with large hosted systems, but they still target different runtime models.

When is Milvus Lite the better fit?

When you need database semantics or local/server-side storage that should not be bundled into the browser.

When is altor-vec the better fit?

When the end product is a browser app that should search locally without another runtime boundary.

Get started: npm install altor-vec · GitHub

Benchmark methodology

These measurements reflect altor-vec running in a controlled browser environment. All queries run against a pre-built HNSW index loaded from a JSON file. Embeddings are generated once at build time, so no embedding generation time is included in the results.

Parameter	Value
Index size	10,000 vectors
Dimensions	384 (all-MiniLM-L6-v2)
HNSW M	16
ef_construction / ef_search	200 / 50
k	5
Browser	Chrome 124, M2 MacBook Pro
Measurement	p50/p95 of 1,000 consecutive queries

altor-vec latency (10K × 384d)

Metric	Result
p50 query latency	0.4ms
p95 query latency	0.8ms
Index load time	~35ms
Memory footprint	~17MB
WASM bundle size	54KB gzipped

What these numbers mean

Sub-millisecond latency means search is effectively instant from the user's perspective. Responses under roughly 100ms are generally perceived as instantaneous; at p95 (0.8ms), altor-vec is 125× faster than a cloud search call with a 100ms total round trip.

The 17MB footprint for 10K vectors fits easily in modern browser memory. Assuming the footprint scales roughly linearly with vector count, 100K vectors at 384 dimensions would need ~170MB: viable on desktop, worth testing on mobile.
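As a sanity check on those figures, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions, not a description of altor-vec's actual memory layout: float32 vectors, 4-byte neighbor ids, and up to 2 × M links per node on the base HNSW layer, with upper layers and allocator overhead ignored. The helper name `estimateIndexBytes` is hypothetical.

```javascript
// Rough HNSW memory estimate. Assumptions (not taken from altor-vec's source):
// float32 vectors, 4-byte neighbor ids, up to 2*M links per node on the base
// layer. Upper layers and allocator overhead are ignored.
function estimateIndexBytes(count, dim, m) {
  const vectorBytes = count * dim * 4; // raw float32 storage
  const linkBytes = count * 2 * m * 4; // base-layer neighbor lists
  return vectorBytes + linkBytes;
}

const toMB = (bytes) => bytes / 1e6;

// Benchmark configuration from this page: 384 dimensions, M = 16.
console.log('10K vectors:', toMB(estimateIndexBytes(10000, 384, 16)).toFixed(1) + ' MB');
console.log('100K vectors:', toMB(estimateIndexBytes(100000, 384, 16)).toFixed(1) + ' MB');
```

Under these assumptions the raw vectors dominate (about 15.4MB of the 10K total), so dimension count and vector count matter far more than M when budgeting browser memory.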

Run your own benchmark

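import init, { WasmSearchEngine } from 'altor-vec';

const DIM = 384; // vector dimensions (all-MiniLM-L6-v2)
const N = 10000; // index size from the methodology table
// Placeholder data in a flat Float32Array (assumed layout); swap in your real embeddings.
const vectors = Float32Array.from({ length: N * DIM }, () => Math.random() - 0.5);
const query = Float32Array.from({ length: DIM }, () => Math.random() - 0.5);

await init();
// Arguments after DIM match the table: M = 16, ef_construction = 200, ef_search = 50.
const engine = WasmSearchEngine.from_vectors(vectors, DIM, 16, 200, 50);

for (let i = 0; i < 100; i++) engine.search(query, 5); // warm-up, not timed

const times = [];
for (let i = 0; i < 1000; i++) {
  const t = performance.now();
  engine.search(query, 5);
  times.push(performance.now() - t);
}
times.sort((a, b) => a - b);
console.log('p50:', times[500].toFixed(2) + 'ms');
console.log('p95:', times[950].toFixed(2) + 'ms');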
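
The hard-coded indices above only work for exactly 1,000 samples. If you change the number of queries, a small nearest-rank percentile helper keeps the reporting correct. This is an illustrative sketch; `percentile` is a hypothetical helper, not part of altor-vec, and the sample values are made up for the demo.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of values at or below it.
// `percentile` is an illustrative helper, not part of altor-vec.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b); // copy so the input stays untouched
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Made-up timings in milliseconds, standing in for the `times` array.
const sample = [0.9, 0.3, 0.5, 0.4, 0.6];
console.log('p50:', percentile(sample, 50).toFixed(2) + 'ms');
console.log('p95:', percentile(sample, 95).toFixed(2) + 'ms');
```

For 1,000 samples, nearest-rank picks positions at most one slot away from the hard-coded indices, so the reported values should be effectively the same.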