How to Build a Search Engine in JavaScript
Most tutorials on building a search engine in JavaScript end with a glorified filter function. You check if a substring exists, loop over 500 items, and call it "search." That's fine for a demo, but it doesn't scale, it doesn't understand meaning, and it breaks the moment someone types "refund policy" instead of "return."
A real search engine handles typos, understands synonyms, and ranks results by relevance. In 2025, that means vector embeddings and semantic similarity—and you can build it entirely in the browser with JavaScript.
Why Client-Side Vector Search Matters
Traditional search runs on the server. You send a query, hit an API, wait for Elasticsearch or Algolia to respond, and hope latency stays under 200ms. For small to medium datasets—product catalogs, documentation, support tickets—this is overkill.
Client-side vector search puts the entire index in the browser. No round trips. No rate limits. No backend to maintain. You ship embeddings as JSON, load them once, and search locally. For 10,000 items with 384-dimensional embeddings, you're looking at roughly 15MB gzipped. It loads in under 2 seconds on a decent connection and searches in under 50ms.
The tradeoff: You can't do this with 10 million items. But most projects don't need 10 million items. Most need fast, smart search over a few thousand entries without spinning up infrastructure.
The Three-Layer Stack
Building this requires three pieces:
1. An embedding model. You need something that converts text into vectors. Transformers.js runs ONNX models in the browser. Use Xenova/all-MiniLM-L6-v2 for speed or Xenova/bge-small-en-v1.5 for better quality. Both are under 30MB and run inference in ~100ms per query on a modern laptop.
2. A vector storage format. Precompute embeddings for your dataset at build time. Store them as a flat JSON array or use IndexedDB if you need offline persistence. Don't recompute embeddings on every page load—that's wasteful.
3. A similarity function. Cosine similarity is the standard. If your vectors are normalized to unit length, a plain dot product gives the same ranking and skips the division. Euclidean distance works too, but it's less common in NLP contexts.
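As a reference point, here is a minimal cosine-similarity implementation; any library's version is equivalent to this:

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 1 for vectors pointing the same direction, 0 for orthogonal ones.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

For vectors that are already unit-length, both norms are 1, so the denominator drops out and a dot product alone gives the same ranking.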
Step-by-Step: Search in 50 Lines
Start by installing a library that handles the heavy lifting. altor-vec wraps Transformers.js with a simple API for indexing and querying:
npm install altor-vec
Precompute embeddings at build time. This runs once, not on every user visit:
import { embed } from 'altor-vec';
const docs = [
{ id: 1, text: 'How to reset your password' },
{ id: 2, text: 'Refund and return policy' },
{ id: 3, text: 'Shipping times and costs' }
];
const embeddings = await Promise.all(
docs.map(doc => embed(doc.text))
);
const index = docs.map((doc, i) => ({
...doc,
vector: embeddings[i]
}));
// Save the index as a static JSON file the client can fetch
// (this snippet runs in Node, where localStorage doesn't exist)
const { writeFileSync } = await import('node:fs');
writeFileSync('search-index.json', JSON.stringify(index));
On the client, load the index and search:
import { search } from 'altor-vec';
const index = await fetch('/search-index.json').then(r => r.json());
const results = await search('I want my money back', index, { topK: 5 });
console.log(results);
// [{ id: 2, text: 'Refund and return policy', score: 0.87 }, ...]
That's it. No server. No API keys. Just load, query, and render results.
Why This Beats String Matching
Try searching "I want my money back" with includes() or a regex. It won't match "Refund and return policy" because the words are different. Vector search understands that "money back" and "refund" are semantically similar. It works across typos, synonyms, and phrasing differences.
Here's a concrete benchmark from a 5,000-item documentation site: substring matching returned 12 results with 3 relevant hits. Vector search returned 12 results with 11 relevant hits. The user clicked the first result 78% of the time versus 34% with substring search.
Handling Scale and Performance
For datasets under 10,000 items, brute-force cosine similarity is fine. You compute the dot product for every vector, sort, and return the top K. On a mid-range laptop, this takes 30-50ms for 5,000 embeddings.
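The brute-force pass described above is only a few lines. This sketch assumes the vectors were normalized at embedding time, so a dot product stands in for cosine similarity; the `index` shape (objects with a `vector` field) mirrors the earlier snippet:

```javascript
// Brute-force top-K search over an in-memory index.
// Assumes all vectors are normalized, so dot product == cosine similarity.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function topK(queryVector, index, k = 5) {
  return index
    .map(item => ({ ...item, score: dot(queryVector, item.vector) }))
    .sort((a, b) => b.score - a.score)  // highest similarity first
    .slice(0, k);
}
```

Scoring is O(n · d) for n items of dimension d, and the sort is O(n log n); for a few thousand 384-dimensional vectors that's well within a frame budget.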
Beyond that, you need Approximate Nearest Neighbor (ANN) search. HNSW (Hierarchical Navigable Small World) is the most common algorithm. It builds a graph structure that lets you skip most vectors during search. Libraries like hnswlib-node run in Node, but browser support is limited. For now, the practical ceiling for client-side vector search is around 20,000 items.
If you hit that limit, split your index by category or lazy-load sections. A product catalog can split by department. Documentation can split by section. You don't need to load every embedding upfront.
Caching and Offline-First Patterns
Store embeddings in IndexedDB, not localStorage. localStorage is capped at roughly 5MB and serializes everything to strings on the main thread; IndexedDB offers far larger quotas (typically hundreds of megabytes, varying by browser and available disk) and handles structured data efficiently. Wrap it with a library like idb-keyval for a simple key-value API.
Use a service worker to cache the embedding model and the index JSON. This makes search work offline and eliminates load time on repeat visits. Netlify and Vercel both support edge caching for static JSON files, so the first load is fast too.
When Not to Use This
Don't use client-side vector search if:
- Your dataset changes constantly (live inventory, real-time news).
- You need sub-10ms latency (for autocomplete, use a server with HNSW).
- Your dataset exceeds a few tens of thousands of items (the browser will choke).
- You need access control (embeddings are public once shipped to the client).
For these cases, run search server-side with Weaviate, Qdrant, or Pinecone. Client-side vector search is for static or semi-static datasets where you control the update cycle.
Real-World Use Cases
Three projects where this approach works well:
Documentation search. Stripe, Supabase, and Vercel all ship large doc sites as static HTML. Adding vector search means users can ask questions in natural language without spinning up a backend.
Product catalogs. A 2,000-item e-commerce store can ship embeddings at build time and search locally. No Algolia bill. No rate limits. No latency spikes.
Internal tools. Support ticket search, Slack-style message history, or CRM lookups. If your dataset fits in 20MB and updates weekly, client-side is faster and cheaper than Elasticsearch.
Frequently Asked Questions
Can I use this for autocomplete?
Not directly. Vector search takes 30-100ms depending on dataset size, which is too slow for keystroke-level autocomplete. Use a prefix tree (trie) for instant suggestions, then fall back to vector search for full queries.
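A minimal trie along those lines, for the instant-suggestion half of that split (class and method names here are illustrative, not from any particular library):

```javascript
// Prefix trie: insert words once, then list completions for a prefix
// in time proportional to the prefix length plus the number of matches.
class Trie {
  constructor() { this.root = {}; }

  insert(word) {
    let node = this.root;
    for (const ch of word) node = node[ch] ??= {};
    node.end = true;  // marks a complete word
  }

  complete(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node[ch];
      if (!node) return [];  // no words share this prefix
    }
    // Depth-first walk collecting every word under this node
    const results = [];
    const walk = (n, acc) => {
      if (n.end) results.push(prefix + acc);
      for (const [ch, child] of Object.entries(n)) {
        if (ch !== 'end') walk(child, acc + ch);
      }
    };
    walk(node, '');
    return results;
  }
}
```

Suggestions from `complete()` come back in microseconds per keystroke; once the user pauses or submits, hand the full query to vector search.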
How do I update the index without redeploying?
Host the embeddings JSON on a CDN and version it by hash. On app load, check if a new version exists and download it in the background. Swap indexes when the download completes. This works for datasets that update daily or weekly.
What if my documents are too long for the embedding model?
Most embedding models have a 512-token limit (~350 words). Split long documents into chunks, embed each chunk separately, and store chunk IDs alongside document IDs. When a chunk matches, return the parent document.
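A simple word-window chunker along those lines, using the ~350-word figure from above with some overlap so sentences aren't severed at chunk boundaries (the sizes and field names are illustrative):

```javascript
// Split a long document into overlapping word-window chunks,
// each tagged with its parent document ID for result mapping.
function chunkDocument(doc, chunkSize = 350, overlap = 50) {
  const words = doc.text.split(/\s+/);
  const chunks = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push({
      docId: doc.id,
      chunkId: `${doc.id}-${chunks.length}`,
      text: words.slice(start, start + chunkSize).join(' ')
    });
    if (start + chunkSize >= words.length) break;  // last window reached the end
  }
  return chunks;
}
```

Embed each chunk's `text`, index the chunks, and at query time collapse matching `chunkId`s back to their `docId` before rendering.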
Does this work on mobile?
Yes, but performance degrades on older devices. A 5,000-item index searches in ~80ms on an iPhone 12, ~150ms on a Pixel 4a. Test on real devices, not just desktop Chrome. Consider lazy-loading the model to avoid blocking the initial render.
Can I combine vector search with filters?
Yes. Filter your dataset first (by category, date, price), then run vector search on the subset. This is much faster than post-filtering because you reduce the search space upfront.
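Pre-filtering is just an array filter before scoring. This self-contained sketch uses plain dot-product scoring in place of a library call, and assumes normalized vectors; the `category` field and function name are illustrative:

```javascript
// Filter the dataset first, then score only the survivors.
// Assumes normalized vectors, so dot product == cosine similarity.
function filteredSearch(queryVector, index, predicate, k = 5) {
  const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);
  return index
    .filter(predicate)  // e.g. item => item.category === 'billing'
    .map(item => ({ ...item, score: dot(queryVector, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

Because the predicate runs before any vector math, a filter that keeps 10% of the items cuts the scoring work by 90%, which post-filtering can't do.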
Want to skip the setup? npm install altor-vec gives you embeddings, indexing, and search in one package. Full docs and examples at github.com/Altor-lab/altor-vec.