Does altor-vec need a backend in Next.js?

No. The library can load a serialized index or build one from vectors directly inside a Next.js application. You only need a backend when your corpus is private, very large, or updated continuously.

Can I add vectors after initialization in Next.js?

Yes. WasmSearchEngine exposes add_vectors(flat, dims), which is useful for appending a few new embeddings without rebuilding the whole index during a demo or a controlled sync step.

What is the biggest performance risk in Next.js?

Embedding generation and UI churn are the main risks. altor-vec search is fast; expensive models, repeated initialization, or too many renders usually dominate the user-visible delay in Next.js apps.

Vector Search in Next.js — Server & Client

Next.js developers usually hit the same search problem: keyword search is easy to ship, but it fails when the user's phrasing and the document's phrasing do not overlap. altor-vec solves the retrieval side by running HNSW vector search locally in WebAssembly. That means queries run in the browser with no network round trip, there is no per-query billing, and the search UI is still an ordinary React component. This guide focuses on implementation details rather than marketing claims.

Install altor-vec: npm install altor-vec

The example below uses tiny four-dimensional vectors so the code is runnable as-is and easy to understand. In production you would usually replace those manual vectors with embeddings from a model such as Xenova/all-MiniLM-L6-v2 or a build-time embedding job. The retrieval flow stays the same: install, import the WASM package, create an index, optionally add vectors, then query the engine and map result IDs back to metadata.

Step 1: install and understand the runtime boundary

Start with npm install altor-vec. The package exposes a default init() function that loads the WASM module and a WasmSearchEngine class that loads or builds an HNSW index. The important design question in Next.js is not whether vector retrieval is possible. It is where initialization should live so the index is created once, memory is released intentionally, and queries do not trigger unnecessary work on each re-render or navigation.
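One way to make sure initialization runs only once is to memoize the promise that builds the engine. The sketch below is generic TypeScript: once and loadEngine are hypothetical names, not part of altor-vec, and the factory body is a placeholder for the real await init() call followed by index construction.

```typescript
// Hypothetical helper: wraps an async factory so it runs at most once.
// Every caller receives the same promise, so concurrent renders or
// navigations share one initialization instead of starting several.
function once<T>(factory: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = factory().catch((err) => {
        // Reset on failure so a later call can retry instead of
        // receiving the same rejected promise forever.
        pending = null;
        throw err;
      });
    }
    return pending;
  };
}

// Placeholder factory. In a real component this is where `await init()`
// and the index construction would go.
let factoryRuns = 0;
const loadEngine = once(async () => {
  factoryRuns += 1;
  return { ready: true };
});
```

Calling loadEngine() from an effect, an event handler, or both then yields the same instance. Because the promise lives at module scope, it also survives re-renders and client-side navigation.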

Step 2: import the library and create the index

The sample builds an index from a flat Float32Array using WasmSearchEngine.from_vectors(flat, dims, m, ef_construction, ef_search), the signature given in the package README. The three HNSW parameters used here (m = 16, ef_construction = 200, ef_search = 50) are conservative defaults for a small browser index. If you precompute a production index offline, you can instead serialize it with to_bytes() and load it using new WasmSearchEngine(bytes).

'use client';

import { useEffect, useState } from 'react';
import init, { WasmSearchEngine } from 'altor-vec';

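// Seed documents used to build the index.
const seedDocs = [
  { slug: '/docs/cache', title: 'Caching guide', vector: [1, 0, 0, 0] },
  { slug: '/docs/routing', title: 'App Router patterns', vector: [0, 1, 0, 0] },
  { slug: '/docs/rag', title: 'Retrieval architecture', vector: [0, 0, 1, 0] },
];

// Appended after the index is built to demonstrate add_vectors().
const extraDoc = { slug: '/docs/edge', title: 'Edge caching', vector: [0.88, 0.12, 0, 0] };

// Lookup table in insertion order: seed documents first, then the extra one.
// Kept immutable so a repeated effect run (Strict Mode, remount) cannot duplicate entries.
const docs = [...seedDocs, extraDoc];

export default function SearchPanel() {
  const [engine, setEngine] = useState<WasmSearchEngine | null>(null);
  const [rows, setRows] = useState<typeof docs>([]);

  useEffect(() => {
    let cancelled = false;
    (async () => {
      await init();
      const dim = 4;
      const flat = new Float32Array(seedDocs.flatMap((doc) => doc.vector));
      const built = WasmSearchEngine.from_vectors(flat, dim, 16, 200, 50);
      built.add_vectors(new Float32Array(extraDoc.vector), dim);
      // Skip the state update if the component unmounted while init() was pending.
      if (!cancelled) setEngine(built);
    })();
    return () => {
      cancelled = true;
    };
  }, []);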

  function search(query: [number, number, number, number]) {
    if (!engine) return;
    const hits: [number, number][] = JSON.parse(engine.search(new Float32Array(query), 3));
    setRows(hits.map(([id]) => docs[id]));
  }

  return (
    <div>
      <button onClick={() => search([0.93, 0.07, 0, 0])}>Run search</button>
      <ul>{rows.map((row) => <li key={row.slug}>{row.title}</li>)}</ul>
    </div>
  );
}

Step 3: what the code is actually doing

Install: the project adds altor-vec from npm.
Import: the code imports init and WasmSearchEngine.
Create index: manual vectors are flattened into a single Float32Array and passed into from_vectors().
Add vectors: the example appends one more vector via add_vectors() so you can see incremental updates.
Query: it converts a query vector into a Float32Array, calls search(), parses the JSON response, and maps IDs back to the in-memory document array.
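The flatten and map-back steps above can be pulled into two small pure helpers, which makes them easy to unit test without loading WASM. This is a minimal sketch: flattenVectors and hitsToDocs are hypothetical names, not part of altor-vec. It assumes, as the sample does, that search() returns a JSON string of [id, number] pairs and that IDs follow insertion order.

```typescript
// Hypothetical helpers, not part of altor-vec.

// Flatten row vectors into one Float32Array, rejecting rows of the wrong length.
function flattenVectors(rows: number[][], dims: number): Float32Array {
  const flat = new Float32Array(rows.length * dims);
  rows.forEach((row, i) => {
    if (row.length !== dims) {
      throw new Error(`row ${i} has ${row.length} values, expected ${dims}`);
    }
    flat.set(row, i * dims);
  });
  return flat;
}

// Parse the JSON string returned by search() (assumed shape: an array of
// [id, number] pairs) and map each ID back to its document. IDs with no
// matching document are dropped; the second element of each pair is ignored.
function hitsToDocs<T>(json: string, docs: readonly T[]): T[] {
  const hits = JSON.parse(json) as [number, number][];
  return hits
    .map(([id]) => docs[id])
    .filter((doc): doc is T => doc !== undefined);
}
```

Dropping unknown IDs keeps the UI from rendering undefined rows if the index and the metadata array ever drift out of sync.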

That pattern is stable across browser frameworks because altor-vec is model-agnostic. The framework concerns are mostly lifecycle-related: where to hold the engine instance, how to debounce query creation, and whether embeddings run on the main thread or in a worker. If you keep those concerns separate, semantic retrieval feels surprisingly ordinary.
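Debouncing query creation can be sketched with a small timer wrapper. The debounce helper below is a hypothetical utility, not something altor-vec ships, and the 200 ms wait is an arbitrary starting point to tune against your embedding cost.

```typescript
// Hypothetical helper: delays `fn` until `waitMs` has passed without a new
// call, so fast typing triggers one embedding + search instead of one per key.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const run = (...args: A): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };

  // Call cancel() in an effect cleanup so nothing fires after unmount.
  const cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };

  return Object.assign(run, { cancel });
}

// Stand-in for "embed the text, then search the index".
const seen: string[] = [];
const runQuery = debounce((text: string) => {
  seen.push(text);
}, 200);
```

In a component, the debounced function would typically be created once (for example with useMemo) and cancelled in the effect cleanup.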

Performance notes specific to this framework

Keep altor-vec inside a client component, dynamic import, or worker because the browser-only path should not block server rendering.
Preload the serialized index from /public with immutable cache headers; App Router streaming and local HNSW complement each other well.
If you also render search results server-side, dedupe model downloads and avoid double-initializing search on navigation.

Published altor-vec baseline: Chrome p95 retrieval on 10K vectors / 384 dimensions is about 0.6ms, index load is about 19ms, the raw WASM binary is 117KB, and the gzipped WASM payload is 54KB. In real apps, embedding generation and rendering usually cost much more than the vector lookup itself.
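To compare your own setup against those published numbers, a rough measurement harness is enough. The sketch below uses a nearest-rank percentile; timeRuns and percentile are hypothetical helper names, and the callback passed to timeRuns stands in for a real engine.search() call.

```typescript
// Nearest-rank percentile over a list of timings in milliseconds.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Time `run` repeatedly and return the recorded durations in milliseconds.
function timeRuns(run: () => void, iterations: number): number[] {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    run();
    samples.push(performance.now() - start);
  }
  return samples;
}
```

Browsers may coarsen performance.now() resolution, so sub-millisecond searches are better measured in batches than one call at a time.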

When to use client-side vs server-side in Next.js

Client-side: Use the client path for docs search, command palettes, and public help content where sending vectors to the browser is acceptable.

Server-side: Use a Route Handler or Node service when you need protected documents, user-specific filters, or much larger corpora than a browser can cache comfortably.

A good rule is simple. If the data is already safe to send to every browser and you mostly care about fast semantic ranking, keep it local. If the search layer also needs to enforce business rules, security boundaries, or complex shared state, put retrieval on the server and let Next.js call it as a normal endpoint.
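To make the server-side case concrete, here is a minimal sketch of retrieval that enforces an access rule before ranking. It uses brute-force cosine similarity as a stand-in for the index, not altor-vec's API, and the ProtectedDoc shape, the role strings, and searchForRole are all hypothetical. In a Route Handler the role would come from the authenticated session, never from the request body.

```typescript
// Hypothetical document shape for a protected corpus.
interface ProtectedDoc {
  id: string;
  vector: number[];
  allowedRoles: string[];
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Filter by role first, then rank, so documents the caller cannot read
// never reach the result list.
function searchForRole(query: number[], docs: ProtectedDoc[], role: string, k: number): string[] {
  return docs
    .filter((doc) => doc.allowedRoles.indexOf(role) !== -1)
    .map((doc) => ({ id: doc.id, score: cosine(query, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((hit) => hit.id);
}
```

The filter has to run on the server because anything shipped to the browser, including vectors and metadata, is readable by the user.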

Production checklist

Cache the WASM and serialized index aggressively with versioned asset names.
Validate vector dimensions before every search to prevent subtle runtime errors.
Keep metadata outside the HNSW graph so result rendering stays flexible.
Measure cold start, repeated search latency, and memory on at least one mid-range mobile device.
Free the engine explicitly if you unload large indexes on navigation.
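The dimension check from the list above can be a small guard that runs before every search. This is a sketch under assumptions: toQueryVector is a hypothetical helper, not part of altor-vec, and dims is whatever dimensionality the index was built with.

```typescript
// Hypothetical guard: converts a query into a Float32Array, failing fast on
// a wrong length or on NaN/Infinity values instead of searching with bad input.
function toQueryVector(values: ArrayLike<number>, dims: number): Float32Array {
  if (values.length !== dims) {
    throw new Error(`query has ${values.length} dimensions, index expects ${dims}`);
  }
  const out = new Float32Array(dims);
  for (let i = 0; i < dims; i++) {
    const value = values[i];
    if (!Number.isFinite(value)) {
      throw new Error(`query value at position ${i} is not a finite number`);
    }
    out[i] = value;
  }
  return out;
}
```

A mismatch usually means the query was embedded with a different model than the index, so the error message is more useful than a silently wrong ranking.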

Conclusion

Next.js does not require a special semantic-search abstraction. It only needs a clean place to initialize the engine and a disciplined boundary between embedding, retrieval, and UI state. altor-vec gives you a small browser-native ANN core, while the framework handles rendering and ergonomics. If you want a developer-friendly starting point with no backend dependency, this is the shortest path: npm install altor-vec, build or load an index, and search locally.
