AI-Powered Search for VitePress — Replace the Default in 10 Minutes
VitePress ships with a local full-text search powered by MiniSearch. It's fast and zero-config, but it matches keywords — not intent. A user searching "how to configure rate limits" won't find your "request throttling" page unless those words overlap. This guide shows how to augment or replace VitePress search with semantic vector search using altor-vec, without ejecting from the default theme.
npm install altor-vec @huggingface/transformers tsx
What VitePress search does and doesn't do
VitePress's built-in search uses MiniSearch under the hood — a solid 22KB inverted-index library. At build time it crawls your .md files and produces a search index. At runtime, queries are scored by BM25-style term frequency against the index.
This works well when users type exact terms from your documentation. It breaks down when user vocabulary doesn't match your content vocabulary — which is common in technical documentation, where users describe problems in their own words while documentation uses precise API terminology.
| Approach | Handles typos | Understands intent | Bundle size | Setup |
|---|---|---|---|---|
| VitePress default (MiniSearch) | Partially (fuzzy term matching) | No | 22KB | Zero config |
| altor-vec (HNSW vector) | Yes (via embeddings) | Yes | 54KB WASM | Build script + theme extension |
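To make the vocabulary mismatch concrete, here is a minimal sketch of what a keyword index has to work with for the example query above. The tokenizer and stop-word list are simplified assumptions for illustration, not MiniSearch's actual pipeline.

```javascript
// Simplified keyword extraction: lowercase, split on non-alphanumerics,
// drop a few stop words. Real engines are more sophisticated than this.
const STOP_WORDS = new Set(['how', 'to', 'the', 'a', 'an', 'of', 'for', 'and']);

function keywords(text) {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9]+/).filter(w => w && !STOP_WORDS.has(w))
  );
}

// Returns the terms two strings share. An empty array means a pure
// keyword index has nothing to score for this pair.
function sharedKeywords(query, title) {
  const titleTerms = keywords(title);
  return [...keywords(query)].filter(term => titleTerms.has(term));
}

console.log(sharedKeywords('how to configure rate limits', 'Request throttling')); // []
console.log(sharedKeywords('configure request limits', 'Request throttling'));     // [ 'request' ]
```

The first query shares no terms with the page title, so term-based scoring cannot rank that page, even though a reader would consider it the right answer. An embedding model compares meaning instead of tokens, which is the gap the rest of this guide addresses.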
Overview: how this works
The implementation has three parts:
- Index build script: reads your .vitepress/dist output after vitepress build, extracts text content from the HTML files, generates embeddings, and writes a binary index plus a metadata file next to the built site
- Theme extension: adds a search component to VitePress's default theme via .vitepress/theme/index.ts without replacing the whole theme
- Search component: a Vue component (or vanilla JS) that loads the index, accepts queries, and renders results
Step 1: Build the search index from your compiled docs
Create scripts/build-search.mjs. This runs after vitepress build, reading the compiled HTML output to extract clean text.
// scripts/build-search.mjs
import fs from 'node:fs/promises';
import { glob } from 'glob';
import { JSDOM } from 'jsdom';
import { pipeline } from '@huggingface/transformers';
import init, { WasmSearchEngine } from 'altor-vec';

const dim = 384; // embedding size produced by all-MiniLM-L6-v2

await init();
const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Read compiled HTML from the VitePress output directory
const htmlFiles = await glob('.vitepress/dist/**/*.html');
const vectors = [];
const metadata = [];

for (let i = 0; i < htmlFiles.length; i++) {
  const file = htmlFiles[i];
  if (file.endsWith('404.html')) continue; // the generated 404 page is not a search target
  const html = await fs.readFile(file, 'utf8');
  const dom = new JSDOM(html);
  const doc = dom.window.document;

  // Extract title and main content; skip nav, sidebar, footer
  const title = doc.querySelector('h1')?.textContent?.trim() ?? 'Untitled';
  const mainContent = doc.querySelector('.vp-doc') ?? doc.querySelector('main') ?? doc.body;

  // Remove script and style tags, plus navigation chrome
  mainContent.querySelectorAll('script,style,nav,.aside,.sidebar').forEach(el => el.remove());
  const text = mainContent.textContent?.replace(/\s+/g, ' ').trim() ?? '';
  if (!text || text.length < 50) continue; // skip empty pages

  // Only the title and the first 1000 characters are embedded
  const textToEmbed = `${title}\n${text.slice(0, 1000)}`;
  const out = await embed(textToEmbed, { pooling: 'mean', normalize: true });
  vectors.push(...Array.from(out.data));

  // Build the URL from the file path; only a trailing index.html is stripped
  const url = '/' + file
    .replace('.vitepress/dist/', '')
    .replace(/(^|\/)index\.html$/, '$1')
    .replace(/\.html$/, '');

  metadata.push({
    id: metadata.length, // position of this page's vector in the flat array
    title,
    excerpt: text.slice(0, 200),
    url,
  });

  if (i % 5 === 0) process.stdout.write(`\rProcessing ${i + 1}/${htmlFiles.length}...`);
}

const engine = WasmSearchEngine.from_vectors(new Float32Array(vectors), dim, 16, 200, 50);
await fs.writeFile('.vitepress/dist/search-index.bin', Buffer.from(engine.to_bytes()));
await fs.writeFile('.vitepress/dist/search-metadata.json', JSON.stringify(metadata));
console.log(`\nIndexed ${metadata.length} pages`);
Install jsdom for HTML parsing:
npm install -D jsdom @types/jsdom glob
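The URL derivation in the build script is the part most likely to break on unusual file names or on Windows paths. A minimal sketch of a standalone helper is shown here; the name htmlPathToUrl and its outDir parameter are hypothetical and not part of VitePress or altor-vec.

```javascript
// Hypothetical helper: map a built HTML file path to the URL it is served at.
function htmlPathToUrl(file, outDir = '.vitepress/dist') {
  const normalized = file.replace(/\\/g, '/'); // tolerate Windows separators
  const prefix = outDir.replace(/\\/g, '/').replace(/\/+$/, '') + '/';
  const relative = normalized.startsWith(prefix)
    ? normalized.slice(prefix.length)
    : normalized;
  return '/' + relative
    .replace(/(^|\/)index\.html$/, '$1') // "guide/index.html" -> "guide/"
    .replace(/\.html$/, '');             // "guide/setup.html" -> "guide/setup"
}

console.log(htmlPathToUrl('.vitepress/dist/index.html'));         // "/"
console.log(htmlPathToUrl('.vitepress/dist/guide/index.html'));   // "/guide/"
console.log(htmlPathToUrl('.vitepress/dist/guide/setup.html'));   // "/guide/setup"
console.log(htmlPathToUrl('.vitepress\\dist\\api\\reindex.html')); // "/api/reindex"
```

Anchoring the index.html match to a path boundary matters for pages such as reindex.html, where a plain substring replacement would cut the file name in half.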
Step 2: Wire the build script into your package.json
// package.json
{
  "scripts": {
    "docs:dev": "vitepress dev",
    "docs:build": "vitepress build && node scripts/build-search.mjs",
    "docs:preview": "vitepress preview"
  }
}
Now every docs:build automatically generates the search index after VitePress finishes compiling.
Step 3: Create the search component
Create .vitepress/theme/SearchModal.vue:
<script setup lang="ts">
import { ref, onMounted, onUnmounted } from 'vue';
import init, { WasmSearchEngine } from 'altor-vec';
import { pipeline } from '@huggingface/transformers';

interface Result {
  id: number;
  title: string;
  excerpt: string;
  url: string;
  score: number;
}

const open = ref(false);
const query = ref('');
const results = ref<Result[]>([]);
const loading = ref(false);

let engine: WasmSearchEngine | null = null;
let metadata: Omit<Result, 'score'>[] = [];
let embedder: Awaited<ReturnType<typeof pipeline>> | null = null;
let debounceTimer: ReturnType<typeof setTimeout>;

// Load the WASM engine, the index, the metadata, and the embedding model
async function initEngine() {
  if (engine) return;
  await init();
  const [indexBuf, meta] = await Promise.all([
    fetch('/search-index.bin').then(r => r.arrayBuffer()),
    fetch('/search-metadata.json').then(r => r.json()),
  ]);
  engine = new WasmSearchEngine(new Uint8Array(indexBuf));
  metadata = meta;
  embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
}

// Embed the query and look up the six nearest pages
async function runSearch(q: string) {
  if (!engine || !embedder || !q.trim()) { results.value = []; return; }
  loading.value = true;
  const out = await embedder(q, { pooling: 'mean', normalize: true });
  const hits = JSON.parse(engine.search(new Float32Array(out.data as Float32Array), 6)) as [number, number][];
  results.value = hits.map(([id, dist]) => ({ ...metadata[id], score: 1 - dist }));
  loading.value = false;
}

// Debounce input so each keystroke does not trigger an embedding pass
function onInput(e: Event) {
  const val = (e.target as HTMLInputElement).value;
  query.value = val;
  clearTimeout(debounceTimer);
  debounceTimer = setTimeout(() => runSearch(val), 220);
}

function openModal() { open.value = true; initEngine(); }
function closeModal() { open.value = false; query.value = ''; results.value = []; }

// Cmd/Ctrl+K toggles the modal, Escape closes it
function onKeydown(e: KeyboardEvent) {
  if ((e.metaKey || e.ctrlKey) && e.key === 'k') { e.preventDefault(); open.value ? closeModal() : openModal(); }
  if (e.key === 'Escape') closeModal();
}

onMounted(() => window.addEventListener('keydown', onKeydown));
onUnmounted(() => window.removeEventListener('keydown', onKeydown));
</script>
<template>
  <button class="search-btn" @click="openModal" aria-label="Search (Cmd+K)">
    <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
      <circle cx="11" cy="11" r="8"/><path d="m21 21-4.35-4.35"/>
    </svg>
    Search <kbd>⌘K</kbd>
  </button>
  <Teleport to="body">
    <div v-if="open" class="search-overlay" @click.self="closeModal">
      <div class="search-modal">
        <input
          autofocus
          type="search"
          placeholder="Search docs..."
          :value="query"
          @input="onInput"
          class="search-input"
        />
        <p v-if="loading" class="search-hint">Searching…</p>
        <p v-else-if="query && !results.length" class="search-hint">No results for "{{ query }}"</p>
        <ul v-else class="search-results">
          <li v-for="r in results" :key="r.id">
            <a :href="r.url" @click="closeModal">
              <strong>{{ r.title }}</strong>
              <span>{{ r.excerpt }}</span>
              <small>{{ (r.score * 100).toFixed(0) }}% match</small>
            </a>
          </li>
        </ul>
      </div>
    </div>
  </Teleport>
</template>
<style scoped>
.search-btn { background: transparent; border: 1px solid var(--vp-c-border); border-radius: 8px; padding: 6px 12px; cursor: pointer; font-size: 14px; color: var(--vp-c-text-2); display: flex; align-items: center; gap: 6px; }
.search-overlay { position: fixed; inset: 0; background: rgba(0,0,0,.6); z-index: 9999; display: flex; align-items: flex-start; justify-content: center; padding-top: 80px; }
.search-modal { background: var(--vp-c-bg); border: 1px solid var(--vp-c-border); border-radius: 12px; width: min(640px, 92vw); overflow: hidden; }
.search-input { width: 100%; padding: 14px 18px; font-size: 16px; border: none; outline: none; background: transparent; color: var(--vp-c-text-1); border-bottom: 1px solid var(--vp-c-border); }
.search-hint { padding: 16px 18px; color: var(--vp-c-text-3); margin: 0; font-size: 14px; }
.search-results { list-style: none; margin: 0; padding: 8px; max-height: 400px; overflow-y: auto; }
.search-results li a { display: block; padding: 10px 12px; border-radius: 8px; text-decoration: none; }
.search-results li a:hover { background: var(--vp-c-bg-soft); }
.search-results li a strong { display: block; color: var(--vp-c-text-1); font-size: 14px; margin-bottom: 2px; }
.search-results li a span { display: block; color: var(--vp-c-text-3); font-size: 13px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
.search-results li a small { display: block; color: var(--vp-c-brand); font-size: 11px; margin-top: 2px; }
</style>
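The mapping from raw hits to displayed results is easier to test when it lives in a pure function. The sketch below assumes, as the component does, that engine.search() returns a JSON string of [id, distance] pairs; the function name hitsToResults is hypothetical, and the clamping and stale-id filtering are additions that the component above does not perform.

```javascript
// Hypothetical helper: convert the engine's JSON hits into display results.
// Assumes each hit is [id, distance] and that id indexes into metadata.
function hitsToResults(hitsJson, metadata) {
  const hits = JSON.parse(hitsJson);
  return hits
    // Drop ids with no metadata entry, e.g. when the index and the
    // metadata file come from different builds.
    .filter(([id]) => metadata[id] !== undefined)
    .map(([id, dist]) => ({
      ...metadata[id],
      // Clamp so the "% match" label never goes below 0 or above 100.
      score: Math.min(1, Math.max(0, 1 - dist)),
    }));
}

const sampleMetadata = [
  { id: 0, title: 'Request throttling', excerpt: 'Limit request rates', url: '/throttling' },
  { id: 1, title: 'Authentication', excerpt: 'API keys and tokens', url: '/auth' },
];
console.log(hitsToResults('[[1,0.25],[5,0.1],[0,1.4]]', sampleMetadata).map(r => [r.title, r.score]));
// [ [ 'Authentication', 0.75 ], [ 'Request throttling', 0 ] ]
```

Inside runSearch, the two lines that parse and map the hits could then become a single call to this helper, which keeps the component focused on state and rendering.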
Step 4: Register the component via theme extension
Create or update .vitepress/theme/index.ts:
// .vitepress/theme/index.ts
import { h } from 'vue'; // h() is used in Layout below
import DefaultTheme from 'vitepress/theme';
import SearchModal from './SearchModal.vue';
import type { Theme } from 'vitepress';

export default {
  extends: DefaultTheme,
  enhanceApp({ app }) {
    app.component('SearchModal', SearchModal);
  },
  Layout() {
    // Render the default layout and fill one of its named slots
    return h(DefaultTheme.Layout, null, {
      'nav-bar-content-before': () => h(SearchModal),
    });
  },
} satisfies Theme;
The nav-bar-content-before slot injects the search button into VitePress's navbar before the existing content. Other available slots are nav-bar-content-after, sidebar-nav-before, and aside-top. Pick whichever placement fits your design.
Note on disabling built-in search: VitePress's local search is only active when themeConfig.search is set in your config (for example search: { provider: 'local' }). To keep only the new component, remove that themeConfig.search entry. The built-in search and the custom component can coexist, but two search buttons in the nav are confusing for users.
Step 5: Configure VitePress to handle WASM
VitePress uses Vite under the hood. WASM imports from altor-vec need a small config addition:
// .vitepress/config.ts
import { defineConfig } from 'vitepress';

export default defineConfig({
  vite: {
    optimizeDeps: {
      exclude: ['altor-vec'],
    },
    assetsInclude: ['**/*.wasm'],
  },
  // ... rest of your config
});
Handling hot reload in dev mode
During vitepress dev, the .vitepress/dist directory doesn't exist yet — the dev server serves content directly from your Markdown files. The search index script reads from dist, so it can only run after a full build.
For development, you have two options:
- Run docs:build once to generate the index, copy search-index.bin and search-metadata.json into public/, then use docs:dev. The dev server serves files from public/ but not from .vitepress/dist, so the index has to live there to be reachable
- Guard the search component with a check: if /search-index.bin returns 404, fall back to showing VitePress's default search or a "search coming soon" message
// In SearchModal.vue: graceful fallback
async function initEngine() {
  const probe = await fetch('/search-index.bin', { method: 'HEAD' });
  if (!probe.ok) {
    console.info('Search index not built yet. Run npm run docs:build.');
    return;
  }
  // ... rest of init
}
Serving the index from VitePress's public directory
An alternative to writing only to .vitepress/dist is writing to public/ in your VitePress root. The dev server serves files from public/ directly, and vitepress build copies everything in public/ to the output directory. The index script still reads compiled HTML, so it has to run after a vitepress build:
// package.json, alternative approach
{
  "scripts": {
    "build:site": "vitepress build",
    "build:search": "node scripts/build-search-public.mjs",
    "docs:build": "npm run build:site && npm run build:search"
  }
}
Avoid naming the first script prebuild:search: npm treats pre<name> scripts as hooks, so it would run again automatically before build:search and build the site twice. In build-search-public.mjs, write search-index.bin and search-metadata.json to ./public/ as well as .vitepress/dist/. The copy in public/ makes the index available at /search-index.bin in the dev server, and the copy in dist ships with the current production build.
Performance: index size and loading
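Index size scales with page count. Each 384-dimension float32 vector takes 1,536 bytes, so the raw vectors for a 150-page site come to roughly 225KB. The HNSW graph and serialization add overhead on top of that, so check the size of the generated search-index.bin for your own site. The larger download is usually the embedding model, which the browser fetches on first use. To keep perceived performance high: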
- Initialize the engine only when the user opens the search modal, not on page load
- Show a loading indicator while the index fetches
- Cache the binary with a long-lived cache header — add a content hash to the filename if your content changes frequently
- Use the Xenova/all-MiniLM-L6-v2 model (23MB) rather than larger models for faster first-query times
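The first bullet, lazy initialization, has one subtlety: if the user opens the modal twice before loading finishes, the index and model should not be fetched twice. A minimal sketch of a single-flight wrapper follows; lazyOnce is a hypothetical helper name, not part of altor-vec or VitePress.

```javascript
// Hypothetical helper: run an async loader at most once, sharing the
// in-flight promise between callers. A failed load clears the cache so
// the next call can retry.
function lazyOnce(loader) {
  let pending = null;
  return function get() {
    if (!pending) {
      pending = Promise.resolve(loader()).catch(err => {
        pending = null; // allow a retry after a failure
        throw err;
      });
    }
    return pending;
  };
}

// Usage sketch with a stand-in loader; a real loader would fetch the
// index and create the embedding pipeline.
let demoLoads = 0;
const getDemoEngine = lazyOnce(async () => {
  demoLoads++;
  return { ready: true };
});
getDemoEngine();
getDemoEngine();
console.log(demoLoads); // 1
```

In the component, initEngine could be wrapped this way so that openModal always awaits the same promise, regardless of how many times the shortcut is pressed while the download is in progress.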
FAQ
Does this replace VitePress's built-in search entirely?
You can replace it or run both. The built-in search is only enabled when themeConfig.search is set in your VitePress config, so removing that entry disables it. The custom component handles all searching independently. Running both is possible but adds UI clutter, and most teams pick one.
Will this work with VitePress's default theme?
Yes. Theme extension via .vitepress/theme/index.ts is the standard VitePress pattern. You add files without ejecting from the default theme. All default theme features — sidebar, navigation, dark mode — continue to work.
How large does the index get for a typical docs site?
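The raw vectors take 1,536 bytes per page at 384 dimensions (384 float32 values), which works out to about 150KB per 100 pages. A 100-page docs site carries roughly 150KB of vectors and a 500-page site roughly 750KB, plus whatever the HNSW graph and serialization add. Cache the binary aggressively; it only changes when documentation changes, which is typically once per deployment.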
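As a sanity check on index size, the raw vector payload can be estimated from the page count and the embedding dimension. This sketch counts float32 storage only, one vector per page as in the build script; it deliberately ignores HNSW graph links and any framing in altor-vec's serialized format, so treat the result as a lower bound and compare it against the real search-index.bin.

```javascript
// Lower-bound estimate: one float32 vector per page, 4 bytes per value.
function rawVectorBytes(pageCount, dim = 384) {
  const BYTES_PER_FLOAT32 = 4;
  return pageCount * dim * BYTES_PER_FLOAT32;
}

console.log(rawVectorBytes(150));  // 230400
console.log(rawVectorBytes(1000)); // 1536000
```

If the file on disk is far larger than this estimate, the difference comes from the index structure or from indexing more than one vector per page.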