Vue 3 guide

Chat Memory in Vue 3 with altor-vec

Use altor-vec to add chat memory to your Vue 3 app entirely in the browser, with no server, no API keys, and zero per-query cost. Store conversation history as vector embeddings and retrieve the most semantically relevant past messages as context for each new turn, giving your chatbot long-term, topic-aware memory.

Install: npm install altor-vec @xenova/transformers

Implementation

The implementation uses the Composition API (setup plus onMounted) and holds the engine and the search results in ref().
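A minimal sketch of the memory logic such a component might wrap, written so it runs without WASM or Vue. The names VectorIndex, BruteForceIndex, and ChatMemory are hypothetical, and the add/search shapes are assumptions rather than altor-vec's confirmed signatures. Embeddings are assumed to be computed elsewhere (for example with @xenova/transformers) and passed in as Float32Array values.

```typescript
// Hypothetical shape of the vector index. altor-vec's real method
// signatures may differ, so adapt this interface to the actual engine.
interface VectorIndex {
  add(vector: Float32Array): number;                // returns the new vector's id
  search(query: Float32Array, k: number): number[]; // ids, most similar first
}

interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Stand-in index using exact cosine similarity, so the sketch is runnable.
class BruteForceIndex implements VectorIndex {
  private vectors: Float32Array[] = [];

  add(vector: Float32Array): number {
    this.vectors.push(vector);
    return this.vectors.length - 1;
  }

  search(query: Float32Array, k: number): number[] {
    const cosine = (a: Float32Array, b: Float32Array): number => {
      let dot = 0;
      let normA = 0;
      let normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
    };
    return this.vectors
      .map((v, id) => ({ id, score: cosine(query, v) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((hit) => hit.id);
  }
}

// Chat memory: messages[id] is the text behind vector `id` in the index.
class ChatMemory {
  readonly messages: ChatMessage[] = [];
  private index: VectorIndex;

  constructor(index: VectorIndex) {
    this.index = index;
  }

  // Store one message turn together with its precomputed embedding.
  remember(message: ChatMessage, embedding: Float32Array): number {
    const id = this.index.add(embedding);
    this.messages[id] = message;
    return id;
  }

  // Return the k stored messages most similar to the query embedding.
  recall(queryEmbedding: Float32Array, k: number): ChatMessage[] {
    return this.index.search(queryEmbedding, k).map((id) => this.messages[id]);
  }
}
```

In a Vue 3 component, an object like this could be created inside onMounted and kept in ref(), with the results of recall() assigned to a second ref for rendering. Swapping BruteForceIndex for a thin adapter around the real engine should leave ChatMemory unchanged.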

Performance

10K message turns at 384 dimensions take ~17MB with sub-millisecond retrieval (~0.4ms p50), which is sufficient for months of conversation history. Measured on an M2 MacBook Pro with Chrome 124. Mobile is typically 2–4× slower, so test on target devices before deploying.

Index size	Dimensions	Query p50	Memory
1,000 vectors	384	~0.1ms	~2MB
10,000 vectors	384	~0.4ms	~17MB
50,000 vectors	384	~0.9ms	~85MB
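As a rough sanity check on the Memory column, the raw vector payload can be estimated from the vector count and dimensionality. This sketch assumes each component is stored as a 4-byte float32, which is an assumption about altor-vec's storage rather than something stated above. The function name rawVectorMegabytes is hypothetical.

```typescript
// Raw storage for `count` vectors of `dims` float32 components,
// in megabytes (1 MB = 1,000,000 bytes). Index overhead is not included.
function rawVectorMegabytes(count: number, dims: number): number {
  const BYTES_PER_FLOAT32 = 4; // assumed component size
  return (count * dims * BYTES_PER_FLOAT32) / 1_000_000;
}

for (const count of [1_000, 10_000, 50_000]) {
  const mb = rawVectorMegabytes(count, 384).toFixed(1);
  console.log(`${count} vectors: ${mb} MB raw`);
}
// 1000 vectors: 1.5 MB raw
// 10000 vectors: 15.4 MB raw
// 50000 vectors: 76.8 MB raw
```

Under that assumption the table's figures sit about 10% above the raw payload, which would be consistent with additional index structures stored alongside the vectors.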

When this approach works best

Local-first AI assistants where conversation history must stay private on-device
Chatbots that need to recall specific past conversations by topic, not just recency
Apps where storing conversation history on a server raises compliance concerns

Limitations

Memory is session-scoped by default. For cross-session recall, persist it with to_json() to localStorage (small memory) or IndexedDB (large memory)
Each new message is added with its own engine.add() call, which is slower per vector than building the index in one batch with from_vectors()

Frequently asked questions

How do I persist chat memory across browser sessions?

Call engine.to_json() and store the result in localStorage (small memory) or IndexedDB (large memory). On next session, restore with WasmSearchEngine.from_json(). Also persist your messages array to reconstruct the full conversation.
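The save and restore steps could be sketched as follows. The helper names saveMemory and loadMemory, the MemorySnapshot shape, and the storage key are hypothetical, and the sketch assumes to_json() returns a string. The KeyValueStore interface is a subset of the Web Storage API, so window.localStorage satisfies it in a browser while a plain object can stand in during tests.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

interface MemorySnapshot {
  indexJson: string;       // assumed: the string produced by engine.to_json()
  messages: ChatMessage[]; // stored alongside so vector ids map back to text
}

const STORAGE_KEY = "chat-memory-v1"; // hypothetical key name

// Write the serialized index and the messages array as one entry.
function saveMemory(store: KeyValueStore, snapshot: MemorySnapshot): void {
  store.setItem(STORAGE_KEY, JSON.stringify(snapshot));
}

// Read the entry back; returns null when it is missing or malformed.
function loadMemory(store: KeyValueStore): MemorySnapshot | null {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw) as MemorySnapshot;
    if (typeof parsed.indexJson !== "string" || !Array.isArray(parsed.messages)) {
      return null;
    }
    return parsed;
  } catch {
    return null; // corrupted entry: start with an empty memory
  }
}
```

In the browser this might be called as saveMemory(localStorage, { indexJson: engine.to_json(), messages }), with the restored indexJson passed to WasmSearchEngine.from_json() on the next session. Saving both parts in one entry keeps the index and the messages array from drifting out of sync. localStorage is commonly limited to around 5 MB per origin, which is why larger memories belong in IndexedDB.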

How many turns of conversation history can I store?

altor-vec handles up to ~100K vectors. In chat memory each turn is one vector, so the ceiling is about 100K message turns before browser memory limits become a concern. In practice, 1,000–10,000 turns is sufficient for most applications.

Should I embed each message separately or chunk multiple messages together?

Embed each message turn separately for retrieval. Use a sliding window of recent turns as context for the LLM (last 5-10 turns by recency), plus the top-k semantically similar historical turns retrieved by altor-vec.
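Combining the recency window with the retrieved turns could look like the sketch below. The function name buildContext and its parameters are hypothetical. retrievedIds is assumed to hold indices into the history array, such as the ids returned by a top-k search when each message's vector id equals its position in the array.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Build LLM context from the last `recentCount` turns plus older turns
// that a semantic search retrieved, without duplicates, in chronological order.
function buildContext(
  history: ChatMessage[],
  retrievedIds: number[],
  recentCount: number,
): ChatMessage[] {
  const recentStart = Math.max(0, history.length - recentCount);
  const olderIds = retrievedIds
    // drop repeated ids
    .filter((id, position, all) => all.indexOf(id) === position)
    // keep valid ids that are not already covered by the recency window
    .filter((id) => id >= 0 && id < recentStart)
    // restore chronological order
    .sort((a, b) => a - b);
  return olderIds
    .map((id) => history[id])
    .concat(history.slice(recentStart));
}
```

Filtering out ids that fall inside the recency window avoids sending the same turn to the LLM twice, and sorting keeps the retrieved turns in the order they were originally said.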
