Everything you need to go from zero to running semantic search in your own Val Town project.
A vector database — a database that stores text and lets you search it by meaning, not just exact words.
Real example:
- You store: "The patient requires immediate surgery"
- You search: "medical emergency"
- It finds it — even though none of the words match ✅
This works because text is converted into embeddings — lists of numbers that capture meaning. Similar meanings → similar numbers → measurable similarity. SlimArmor stores those numbers, indexes them for fast retrieval, and gives you a simple API on top.
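That "similar numbers → measurable similarity" step is usually cosine distance. A minimal sketch with made-up 3-number vectors (real embeddings are thousands of numbers long; the values below are purely illustrative):

```typescript
// Cosine distance: 0 means "pointing the same way" (same meaning),
// values near 1 mean unrelated.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy "embeddings" (illustrative numbers, not real model output)
const surgery = [0.9, 0.8, 0.1];   // "The patient requires immediate surgery"
const emergency = [0.8, 0.9, 0.2]; // "medical emergency"
const sourdough = [0.1, 0.2, 0.9]; // "how to bake sourdough bread"

console.log(cosineDistance(surgery, emergency)); // small: similar meaning
console.log(cosineDistance(surgery, sourdough)); // large: unrelated
```

SlimArmor does this comparison for you at query time; the `distance` field in search results is exactly this kind of number.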
What can you build with this?
- 🤖 RAG chatbots — give an AI access to your docs, support articles, or knowledge base
- 🔍 Semantic site search — users find content even when they use different words
- 💬 Support ticket deduplication — auto-detect repeated questions
- 📚 Personal knowledge search — search your notes, bookmarks, and ideas by meaning
- 🛍️ Product recommendations — "similar items" without needing purchase history
- 🧑‍💼 Resume / candidate matching — match job descriptions to applicants
All of these use the same three-step pattern: store text → search by meaning → act on results. See Part 4 for full working code for each.
- Go to val.town/x/kamenxrider/slimarmor
- Click Fork (top right)
- Your own copy is now live!
SlimArmor works with any OpenAI-compatible embedding API. You just need an API key and to know what dimensions your chosen model outputs.
💡 What are dimensions? When text is converted to an embedding, it becomes a list of numbers — the "dimension" is how many numbers long that list is. SlimArmor bakes this number into the database schema when it first runs, so you need to pick a model and stick with it. Changing models later requires a full reset.
Recommended options:
| Provider | Model | Dimensions | Sign up |
|---|---|---|---|
| Nebius (default) | Qwen/Qwen3-Embedding-8B | 4096 | nebius.com — free tier |
| OpenAI | text-embedding-3-small | 1536 | platform.openai.com |
| OpenAI | text-embedding-3-large | 3072 | platform.openai.com |
| Any other | your choice | check docs | — |
Higher dimensions = better quality but more storage used. For most use cases, any of the above work great.
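For a rough sense of the storage side of that trade-off, here is some back-of-envelope math (this assumes vectors stored as 4-byte float32 values; SlimArmor's actual on-disk size also includes text, metadata, and index overhead):

```typescript
// Raw vector storage is roughly dimensions × 4 bytes (float32) per record
function vectorBytes(dimensions: number): number {
  return dimensions * 4;
}

const models: [string, number][] = [
  ["text-embedding-3-small", 1536],
  ["text-embedding-3-large", 3072],
  ["Qwen/Qwen3-Embedding-8B", 4096],
];

for (const [name, dims] of models) {
  const mbPer10k = (vectorBytes(dims) * 10_000) / (1024 * 1024);
  console.log(`${name}: ~${mbPer10k.toFixed(0)} MB of raw vectors per 10k records`);
}
// 1536 dims ≈ 59 MB, 4096 dims ≈ 156 MB per 10k records
```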
Pick one, get your API key, and move on.
In your forked val on Val Town:
- Click Settings (the gear icon)
- Go to Environment Variables
- Add these based on your chosen provider:
If using Nebius (default):
| Key | Value |
|---|---|
| NEBIUS_API_KEY | Your Nebius API key |
If using OpenAI:
| Key | Value |
|---|---|
| EMBEDDING_PROVIDER | openai |
| OPENAI_API_KEY | Your OpenAI API key |
If using any other OpenAI-compatible API:
| Key | Value |
|---|---|
| EMBEDDING_API_URL | Your provider's /v1/embeddings URL |
| EMBEDDING_API_KEY | Your API key |
| EMBEDDING_MODEL | Your model name |
| EMBEDDING_DIM | Your model's output dimensions (e.g. 768) |
Always recommended:
| Key | Value |
|---|---|
| ADMIN_TOKEN | Any secret string (e.g. my-secret-123) |
ADMIN_TOKEN protects your write endpoints. Without it, anyone can add or delete your data.
Click on api.ts in your val. At the top you'll see an endpoint URL like:
https://yourusername--abc123.web.val.run
Open that URL in a browser — you should see the API info page. That's your SlimArmor instance! 🎉
Visit https://YOUR_ENDPOINT/ui for a terminal-style interface.
Type help to see all commands. To add your first record:
auth your-admin-token
upsert my-first-note "The quick brown fox jumps over the lazy dog"
Then search:
search "animals jumping"
# Replace with your actual values
ENDPOINT="https://YOUR_ENDPOINT"
TOKEN="your-admin-token"

# Add a record
curl -X POST $ENDPOINT/upsert \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"id": "note-1", "text": "The quick brown fox jumps over the lazy dog"}'

# Search
curl -X POST $ENDPOINT/search \
  -H "Content-Type: application/json" \
  -d '{"query": "animals jumping", "k": 5}'
const ENDPOINT = "https://YOUR_ENDPOINT";
const TOKEN = "your-admin-token";
// Add a record
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${TOKEN}`,
},
body: JSON.stringify({
id: "note-1",
text: "The quick brown fox jumps over the lazy dog",
meta: { category: "example" },
}),
});
// Search
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ query: "animals jumping", k: 5 }),
});
const { results } = await res.json();
console.log(results);
A search result looks like this:
{ "id": "note-1", "text": "The quick brown fox jumps over the lazy dog", "meta": { "category": "example" }, "distance": 0.52 }
The key field is distance — it tells you how similar the result is to your query:
| Distance | What it means | Should you include it? |
|---|---|---|
| 0.0 – 0.3 | Near-identical meaning | Always ✅ |
| 0.3 – 0.5 | Very similar | Yes ✅ |
| 0.5 – 0.65 | Related | Usually ✅ |
| 0.65 – 0.75 | Loosely related | Maybe ⚠️ |
| 0.75+ | Probably unrelated | No ❌ |
Use maxDistance to filter out weak matches:
{ "query": "animals jumping", "k": 10, "maxDistance": 0.65 }
Not sure what threshold to use? Use the calibrate endpoint:
GET /calibrate?q=your+search+query
It analyzes your actual data and recommends tight/balanced/loose thresholds.
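The exact shape of the calibrate response isn't shown here, but the underlying idea is easy to replicate by hand: run a representative query with a generous maxDistance, then pick thresholds from the observed distance distribution. A sketch (the percentile cut-offs below are my own illustrative choices, not SlimArmor's):

```typescript
// Pick tight/balanced/loose thresholds from observed search distances,
// the same idea /calibrate automates against your real data.
function percentileValue(sorted: number[], p: number): number {
  const idx = Math.min(sorted.length - 1, Math.floor(p * sorted.length));
  return sorted[idx];
}

function suggestThresholds(distances: number[]) {
  const sorted = [...distances].sort((a, b) => a - b);
  return {
    tight: percentileValue(sorted, 0.25),   // only the closest quarter passes
    balanced: percentileValue(sorted, 0.5), // the median
    loose: percentileValue(sorted, 0.75),   // most results pass
  };
}

// Distances observed from one /search with maxDistance left high
const observed = [0.31, 0.42, 0.48, 0.55, 0.61, 0.68, 0.74, 0.82];
console.log(suggestThresholds(observed));
// → { tight: 0.48, balanced: 0.61, loose: 0.74 }
```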
A vector database is a general-purpose building block. The pattern is always the same:
Store text → Search by meaning → Do something with the results
What changes is what you store and what you do with the results. Here are the most common patterns.
The problem: LLMs like GPT-4 or Claude only know what's in their training data. They don't know your docs, your product, your internal knowledge.
The solution: Before sending a user's question to the LLM, search your vector DB for relevant context, then inject it into the prompt. This is called Retrieval-Augmented Generation (RAG).
User asks question
│
▼
Search SlimArmor for relevant chunks
│
▼
Inject top results into LLM prompt as context
│
▼
LLM answers using YOUR data
Step 1 — Ingest your knowledge base (do this once, or whenever docs change):
// Load your docs — could be from a CMS, markdown files, support articles, etc.
const docs = [
{ id: "doc-refund-policy", text: "Refunds are available within 30 days of purchase. To request a refund, contact support@example.com with your order ID. Digital products are non-refundable after download." },
{ id: "doc-shipping", text: "We ship to 50+ countries. Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available for an extra $12. Free shipping on orders over $75." },
{ id: "doc-account-setup", text: "To create an account, click Sign Up on the homepage. Enter your email and choose a password. You'll receive a verification email — click the link to activate your account." },
];
// For longer docs, use upsert_chunked to split automatically
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify(docs),
});
Step 2 — Answer questions with context (on every user message):
async function askWithContext(userQuestion: string): Promise<string> {
// 1. Find the most relevant docs for this question
const searchRes = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ query: userQuestion, k: 3, maxDistance: 0.65 }),
});
const { results } = await searchRes.json();
// 2. Build a context block from the top results
const context = results.length > 0
? results.map((r: any) => r.text).join("\n\n")
: "No relevant information found.";
// 3. Send to your LLM with the context injected
const llmRes = await fetch("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
},
body: JSON.stringify({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: `You are a helpful support assistant. Answer the user's question using ONLY the context below. If the answer isn't in the context, say so.\n\nContext:\n${context}`,
},
{ role: "user", content: userQuestion },
],
}),
});
const llmData = await llmRes.json();
return llmData.choices[0].message.content;
}
// Usage
const answer = await askWithContext("Can I get a refund on a digital product?");
// → "No, digital products are non-refundable after download according to our policy."
Tips for RAG:
- Use `upsert_chunked` for long documents (splits into ~800 char overlapping chunks)
- Store `source`, `url`, `section` in `meta` so you can cite your sources
- `k: 3` is usually enough — sending 10 chunks bloats the prompt unnecessarily
- Re-ingest docs whenever they change — dedup means unchanged chunks are skipped for free
The problem: Regular search (LIKE '%query%') only matches exact words. Users search for "pricing" but your page says "plans and billing". No match.
The solution: SlimArmor finds results by meaning. "Pricing" → finds "plans and billing".
Index your content (run whenever content changes):
// Crawl your pages / pull from your CMS
const pages = [
{ id: "page-home", text: "Welcome to Acme. We make project management tools for remote teams.", meta: { url: "/", title: "Home" } },
{ id: "page-pricing", text: "Plans start at $9/month for individuals. Team plans from $29/month. Enterprise pricing available.", meta: { url: "/pricing", title: "Pricing" } },
{ id: "page-blog-1", text: "How async work transformed our remote team's productivity...", meta: { url: "/blog/async-work", title: "Async Work Guide" } },
];
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify(pages),
});
Search endpoint (called on every user search):
export default async function(req: Request): Promise<Response> {
const url = new URL(req.url);
const query = url.searchParams.get("q");
if (!query) return Response.json({ results: [] });
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
// Hybrid mode boosts exact keyword matches too (e.g. product names)
body: JSON.stringify({ query, k: 8, maxDistance: 0.7, hybrid: { enabled: true, alpha: 0.2 } }),
});
const { results } = await res.json();
// Return just what the UI needs
return Response.json({
results: results.map((r: any) => ({
title: r.meta.title,
url: r.meta.url,
excerpt: r.text.slice(0, 150) + "...",
relevance: Math.round((1 - r.distance) * 100) + "%",
})),
});
}
The problem: Your support inbox has 1000 tickets. Half of them are the same question phrased differently. Agents waste time answering duplicates.
The solution: When a new ticket arrives, search for similar existing tickets. If one is found (below distance threshold), auto-suggest the previous answer.
async function handleNewTicket(ticketId: string, ticketText: string) {
// 1. Check if a similar ticket already exists
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
query: ticketText,
k: 3,
maxDistance: 0.45, // tight threshold — only very similar tickets
filters: { status: "resolved" }, // only look at solved tickets
}),
});
const { results } = await res.json();
if (results.length > 0) {
const similar = results[0];
console.log(`Similar resolved ticket found: ${similar.id}`);
console.log(`Suggested answer: ${similar.meta.resolution}`);
// → Auto-reply, tag the ticket, or route to a specific agent
} else {
// No match — store this ticket for future dedup
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify({
id: ticketId,
text: ticketText,
meta: { status: "open", created_at: Date.now() },
}),
});
}
}
// When a ticket is resolved, update its status + store the resolution
async function resolveTicket(ticketId: string, resolution: string) {
const existing = await fetch(`${ENDPOINT}/get?id=${ticketId}`).then(r => r.json());
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify({
id: ticketId,
text: existing.record.text, // same text = no re-embed (free dedup)
meta: { status: "resolved", resolution },
}),
});
}
The problem: You have hundreds of notes, bookmarks, and ideas scattered across Notion, Apple Notes, emails. You can't find anything.
The solution: Dump everything into SlimArmor. Search by what you remember about the content, not the exact words you used.
// Ingest notes from any source
const notes = [
{
id: "note-20240115",
text: "Interesting idea from the Lex podcast — compounding knowledge is like compounding interest. Small daily inputs create exponential output over years. Related to the 'second brain' concept.",
meta: { source: "podcast", date: "2024-01-15", tags: ["learning", "productivity"] },
},
{
id: "bookmark-stripe-docs",
text: "Stripe webhook best practices: always verify the signature, use idempotency keys, handle retries gracefully. Events can arrive out of order.",
meta: { source: "bookmark", url: "https://stripe.com/docs/webhooks", tags: ["stripe", "engineering"] },
},
];
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify(notes),
});
// Later — search by vague memory
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
query: "that thing about knowledge building up over time",
k: 5,
maxDistance: 0.7,
hybrid: { enabled: true, alpha: 0.15 },
}),
});
// → Finds the podcast note about compounding knowledge ✅
// Filter by tag
const engineeringNotes = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
query: "stripe payment processing",
k: 10,
filters: { source: "bookmark" },
}),
});
The problem: "You might also like..." — traditional recommendation engines need collaborative filtering data (lots of users, lots of purchase history). You just launched and have neither.
The solution: Use text embeddings on product descriptions. Products with similar descriptions will have similar embeddings → similar recommendations, zero training data required.
// Ingest your product catalog
const products = [
{ id: "prod-001", text: "Mechanical keyboard with Cherry MX Blue switches. Tactile feedback, clicky sound, ideal for typing enthusiasts and developers.", meta: { price: 129, category: "keyboards" } },
{ id: "prod-002", text: "Wireless mechanical keyboard, Cherry MX Red switches. Silent, linear actuation. Great for office use.", meta: { price: 149, category: "keyboards" } },
{ id: "prod-003", text: "Ergonomic split keyboard, low profile switches. Reduces wrist strain for long typing sessions.", meta: { price: 199, category: "keyboards" } },
{ id: "prod-004", text: "USB-C mechanical keyboard with RGB backlighting. Hot-swappable switches, aluminum chassis.", meta: { price: 179, category: "keyboards" } },
];
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify(products),
});
// When a user views prod-001, find similar products
async function getRecommendations(productId: string, currentProductText: string) {
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
query: currentProductText,
k: 4,
maxDistance: 0.6,
}),
});
const { results } = await res.json();
// Exclude the current product from results
return results.filter((r: any) => r.id !== productId);
}
const recs = await getRecommendations("prod-001", products[0].text);
// → Returns prod-002, prod-004 (similar keyboards) before prod-003
The problem: You have 500 resumes and 10 open roles. Manual matching takes days.
The solution: Embed both job descriptions and resumes. Search job descriptions against the resume pool to find the best candidates.
// Store resumes
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify([
{
id: "resume-alice",
text: "5 years TypeScript and React. Built large-scale SPAs. Led frontend team of 4. Experience with performance optimization and accessibility.",
meta: { name: "Alice", email: "alice@example.com", years_exp: 5 },
},
{
id: "resume-bob",
text: "Backend engineer, 7 years Python and Go. Designed distributed systems, Kafka, Kubernetes. Strong on reliability and observability.",
meta: { name: "Bob", email: "bob@example.com", years_exp: 7 },
},
]),
});
// Search by job description to find matching candidates
const jobDescription = "We need a senior frontend engineer with React expertise to lead our web performance initiatives.";
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ query: jobDescription, k: 10, maxDistance: 0.65 }),
});
const { results } = await res.json();
// → Alice surfaces first — her resume is semantically closest to the job description ✅
Every use case above follows this structure:
// 1. INGEST — store your text content with metadata
await fetch(`${ENDPOINT}/upsert`, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${TOKEN}` },
body: JSON.stringify([
{ id: "unique-id", text: "The content to search", meta: { any: "extra data" } },
]),
});
// 2. QUERY — find semantically similar content
const res = await fetch(`${ENDPOINT}/search`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ query: "what the user is looking for", k: 5, maxDistance: 0.65 }),
});
const { results } = await res.json();
// 3. ACT — do something with the results
// → Show them to a user, inject into an LLM prompt, trigger a workflow, etc.
for (const result of results) {
console.log(result.id, result.text, result.meta, result.distance);
}
The vector database doesn't care what you store. It just finds things that mean similar things. The creativity is in what you put in and what you do with what comes out.
Instead of using the HTTP API, you can import SlimArmor's core directly into another val:
import * as db from "https://esm.town/v/kamenxrider/slimarmor/vectordb.ts";
export default async function handler(req: Request) {
// Setup runs once per cold start (idempotent)
await db.setup();
const url = new URL(req.url);
if (req.method === "POST" && url.pathname === "/add") {
const { id, text } = await req.json();
await db.upsert(id, text);
return Response.json({ ok: true });
}
if (req.method === "POST" && url.pathname === "/find") {
const { query } = await req.json();
const results = await db.search(query, 5, 0.65);
return Response.json({ results });
}
return new Response("Not found", { status: 404 });
}
The library uses your val's own SQLite database — you don't need to run the API separately. Just import and use.
- Use meaningful IDs — `blog-post-2024-01` is better than `1`
- Keep text focused — shorter, topic-focused chunks search better than walls of text
- Use metadata — store category, date, author, tags, etc. so you can filter later
- Calibrate your threshold — use `/calibrate?q=...` before going to production
- Batch your upserts — send arrays of records instead of one at a time (much faster)
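"Keep text focused" can be automated: upsert_chunked does the splitting server-side, but if you'd rather pre-chunk client-side, here's a minimal sketch (the 800/100 size and overlap echo the chunking described earlier; `toRecords` and its ID scheme are illustrative):

```typescript
// Split long text into overlapping chunks so each stays topic-focused.
// The overlap keeps sentences from being cut in half at a boundary.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
    start += size - overlap;
  }
  return chunks;
}

// Derived IDs keep re-ingestion idempotent (same id + same text = no re-embed)
function toRecords(docId: string, text: string) {
  return chunkText(text).map((chunk, i) => ({
    id: `${docId}-chunk-${i}`,
    text: chunk,
    meta: { source: docId, chunk: i },
  }));
}
```

Send the resulting array to /upsert as a single batch.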
- Storing empty or near-duplicate text — SlimArmor deduplicates by content hash, so identical text won't re-embed, but similar-but-different text will generate redundant embeddings
- Deleting via raw SQL — always use `POST /clear` or `POST /delete` so the vector index stays in sync
- Switching models without clearing — embeddings from different models are completely incompatible. A vector from model A is meaningless when compared to a vector from model B. Always export first, then clear, then re-import with the new model.
- Deduplication is automatic — if you upsert the same `id` with the same text, it skips the embedding API call and only updates metadata. You can safely re-run upserts.
- Hybrid search helps with specific terms — if your data has product codes, names, or exact terms, enable `hybrid: { enabled: true }` to boost keyword matches.
- `/validate` is your friend — run it after setup to confirm everything is working before adding real data.
Make sure you're sending the header: `Authorization: Bearer YOUR_ADMIN_TOKEN`
In the browser CLI, type `auth your-token` first.
Your API key is wrong or expired. Go to your val's Settings → Environment Variables and update the relevant key (NEBIUS_API_KEY, OPENAI_API_KEY, or EMBEDDING_API_KEY depending on your provider).
- Run `/calibrate?q=your+query` to see distance distributions
- Try lowering `maxDistance`
- Try enabling hybrid search: `"hybrid": {"enabled": true}`
The DiskANN index got out of sync (happens if you manually deleted rows via SQL). Fix it with:
curl -X POST $ENDPOINT/reindex -H "Authorization: Bearer $TOKEN"
Normal — each batch of records requires one API call to the embedding provider (~460ms). For bulk imports, batch as many records as possible in each /upsert call (arrays of up to ~96 records per batch are ideal).
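That batching advice can be sketched as a small import loop (ENDPOINT, TOKEN, and the ~96-record batch size come from this guide; adjust to your values):

```typescript
const ENDPOINT = "https://YOUR_ENDPOINT";
const TOKEN = "your-admin-token";

// Split records into batches of at most `size`
function toBatches<T>(records: T[], size = 96): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

// One /upsert request (and one embedding API call) per batch,
// instead of one per record
async function bulkImport(records: { id: string; text: string }[]) {
  for (const batch of toBatches(records)) {
    await fetch(`${ENDPOINT}/upsert`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${TOKEN}`,
      },
      body: JSON.stringify(batch),
    });
  }
}
```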
ENDPOINT="https://YOUR_ENDPOINT" TOKEN="your-admin-token" # Health check curl $ENDPOINT/ping # View stats curl $ENDPOINT/stats # Add one record curl -X POST $ENDPOINT/upsert -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \ -d '{"id":"doc-1","text":"Your text here","meta":{"category":"notes"}}' # Add many records curl -X POST $ENDPOINT/upsert -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \ -d '[{"id":"a","text":"first"},{"id":"b","text":"second"}]' # Search curl -X POST $ENDPOINT/search -H "Content-Type: application/json" \ -d '{"query":"your query","k":10,"maxDistance":0.65}' # Search with filter curl -X POST $ENDPOINT/search -H "Content-Type: application/json" \ -d '{"query":"your query","k":10,"filters":{"category":"notes"}}' # Get a record curl "$ENDPOINT/get?id=doc-1" # List IDs curl "$ENDPOINT/list?limit=20" # Delete a record curl -X POST $ENDPOINT/delete -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \ -d '{"id":"doc-1"}' # Calibrate threshold curl "$ENDPOINT/calibrate?q=your+query" # Seed test data curl -H "Authorization: Bearer $TOKEN" "$ENDPOINT/seed?n=50" # Export curl -H "Authorization: Bearer $TOKEN" "$ENDPOINT/export?limit=500" # Clear all (careful!) curl -X POST "$ENDPOINT/clear?confirm=yes" -H "Authorization: Bearer $TOKEN" # Rebuild index curl -X POST $ENDPOINT/reindex -H "Authorization: Bearer $TOKEN"
Happy searching! 🔍 If you get stuck, open the /ui browser CLI and type help.