Last Updated: 2026-02-02
Status: Production-ready v4 (with Browser CLI)
SlimArmor is a mini vector database for Val Town that provides semantic search capabilities using SQLite with libSQL/Turso vector extensions.
- Stores text with AI-generated embeddings (4096 dimensions by default)
- Enables semantic search (search by meaning, not keywords)
- Returns distance scores for ranking results
- Supports any OpenAI-compatible embedding API
```
┌─────────────────────────────────────────────────────────────┐
│                      HTTP API (api.ts)                      │
│     /upsert, /search, /delete, /stats, /calibrate, etc.     │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                Vector DB Core (vectordb.ts)                 │
│  - setup(), upsert(), search(), remove(), stats()           │
│  - Content hash check (skip re-embedding unchanged text)    │
│  - Dimension assertion (fail fast on mismatch)              │
└─────────────────────────────────────────────────────────────┘
             │                                 │
             ▼                                 ▼
┌──────────────────────────┐   ┌────────────────────────────────┐
│   Embedding Provider     │   │   Val Town SQLite (Turso)      │
│   (OpenAI-compatible)    │   │   - F32_BLOB vector columns    │
│   - Nebius (default)     │   │   - libsql_vector_idx (DiskANN)│
│   - OpenAI               │   │   - vector_top_k queries       │
│   - OpenRouter           │   │   - vector_distance_cos        │
│   - Custom               │   └────────────────────────────────┘
└──────────────────────────┘
```
A terminal-style web interface for interacting with the vector database.
Features:
- Monospace terminal aesthetic (GitHub dark theme)
- Command history (↑/↓ arrows)
- Clickable IDs in results
- Color-coded distance scores (green=good, orange=medium, red=poor)
- Session-based auth token storage
- Mobile responsive
Access: GET /ui
The main library that can be imported into other vals.
Key exports:
- `setup()` - Creates table and index
- `upsert(id, text, meta?)` - Insert/update with smart re-embedding
- `search(query, k?, maxDistance?)` - Semantic search
- `remove(id)` - Delete record
- `stats()` - Count and storage estimate
- `get(id)` - Get single record
- `listIds(limit?)` - List all IDs
- `reindex()` - Recreate index
- `getProviderInfo()` - Current embedding config
Configuration via env vars:
- `EMBEDDING_PROVIDER` - Preset: `nebius`, `openai`, `openrouter`
- `EMBEDDING_API_URL` - Custom API URL
- `EMBEDDING_API_KEY` - API key
- `EMBEDDING_MODEL` - Model name
- `EMBEDDING_DIM` - Vector dimensions
RESTful API layer with admin/testing tools.
Core endpoints:
- `POST /upsert` - Insert/update record
- `POST /search` - Semantic search
- `POST /delete` - Delete record
- `GET /get?id=...` - Get record
- `GET /list` - List IDs
Admin endpoints:
- `GET /` - API info + provider config
- `GET /ping` - Health check
- `GET /stats` - Detailed storage stats
- `GET /seed?n=100` - Seed synthetic data
- `GET /calibrate?q=...` - Threshold suggestions
- `POST /reindex` - Recreate index
- `POST /clear?confirm=yes` - Delete all records
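A minimal client-side sketch of calling the search endpoint. The field names (`query`, `k`, `maxDistance`) are assumed from this document, and any auth header the deployment requires is omitted:

```typescript
// Hypothetical helper: build the POST /search request for the HTTP API.
// Field names (query, k, maxDistance) are assumptions from this doc.
function buildSearchRequest(
  baseUrl: string,
  query: string,
  k = 10,
  maxDistance?: number,
): Request {
  return new Request(new URL("/search", baseUrl).toString(), {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query, k, maxDistance }),
  });
}

const req = buildSearchRequest(
  "https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run",
  "machine learning",
  5,
  0.64,
);
// const hits = await (await fetch(req)).json();
```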
```sql
CREATE TABLE vectordb (
  id         TEXT PRIMARY KEY,
  text       TEXT NOT NULL,
  text_hash  TEXT NOT NULL,     -- SHA-256 for change detection
  embedding  F32_BLOB(4096),    -- Vector column (dimension varies by provider)
  meta_json  TEXT,              -- Optional JSON metadata
  updated_at INTEGER NOT NULL   -- Unix timestamp (ms)
);

CREATE INDEX vectordb_embedding_idx
  ON vectordb (libsql_vector_idx(embedding, 'metric=cosine', 'max_neighbors=64', 'compress_neighbors=float8'));
```
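Searches against this index follow Turso's documented `vector_top_k` pattern: the table-valued function walks the DiskANN index and returns row ids, which are joined back to the table and scored with `vector_distance_cos`. A sketch of that query shape (the actual SQL in `vectordb.ts` may differ in detail):

```typescript
// ANN search query shape per the Turso/libSQL vector docs:
// vector_top_k() yields candidate rowids; the join re-scores them.
// Bindings: [queryVecJson, queryVecJson, k], where queryVecJson is
// JSON.stringify(embedding), e.g. "[0.12, -0.03, ...]".
const SEARCH_SQL = `
  SELECT v.id, v.text, v.meta_json,
         vector_distance_cos(v.embedding, vector32(?)) AS distance
  FROM vector_top_k('vectordb_embedding_idx', vector32(?), ?) AS t
  JOIN vectordb AS v ON v.rowid = t.id
  ORDER BY distance ASC
`;
```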
We tested and applied these Turso-documented optimizations:
| Setting | Value | Why |
|---|---|---|
| `metric=cosine` | Cosine distance | Standard for text embeddings |
| `max_neighbors=64` | 64 neighbors | Down from default ~192; saves storage |
| `compress_neighbors=float8` | 1 byte/dim | ~75% less index storage |
Trade-off: Slightly lower recall accuracy, significantly lower storage.
| Metric | Value |
|---|---|
| Storage per record | ~22 KB |
| Estimated max records/GB | ~47,500 |
| Embedding latency (Nebius) | ~460ms |
| Search latency | <100ms |
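The storage numbers above follow from simple arithmetic: a 4096-dimension float32 vector alone is 16 KB, and text, hash, metadata, and the float8-compressed index node make up the rest of the ~22 KB per record:

```typescript
// Rough arithmetic behind the table above (22 KB/record is the observed
// figure, not derived here).
const embeddingBytes = 4096 * 4;           // 16384 bytes for the raw vector
const perRecordBytes = 22 * 1024;          // ~22 KB observed per record
const recordsPerGB = Math.floor(1024 ** 3 / perRecordBytes);
console.log(recordsPerGB); // ≈ 47,500 as quoted above
```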
From calibration with "machine learning" query:
- Min: 0.46 (highly relevant)
- Median: 0.64
- Max: 0.67 (least relevant in top 20)
Recommended thresholds:
- Tight: 0.5 (top 3 only)
- Balanced: 0.64 (top 10)
- Loose: 0.7 (include all)
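One way `/calibrate` can derive such thresholds is by taking percentiles over the distances of a sample query's top-k results. This is a hypothetical sketch of that idea, not the exact implementation; the percentile choices are illustrative:

```typescript
// Hypothetical calibration: derive threshold suggestions from the sorted
// distance distribution of a sample query's results.
function calibrate(distances: number[]) {
  const d = [...distances].sort((a, b) => a - b);
  const pct = (p: number) => d[Math.min(d.length - 1, Math.floor(p * d.length))];
  return {
    min: d[0],
    median: pct(0.5),
    max: d[d.length - 1],
    tight: pct(0.15),    // keep only the closest few hits
    balanced: pct(0.5),  // roughly the top half
  };
}

const c = calibrate([0.46, 0.5, 0.52, 0.6, 0.62, 0.64, 0.65, 0.66, 0.67, 0.67]);
// With this sample, min/median/max match the figures above (0.46 / 0.64 / 0.67).
```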
- **Single embedding dimension** - The table is created with a fixed dimension; changing providers requires clearing data.
- **No chunking** - Each record = one embedding; long documents must be pre-chunked by the user.
- **No hybrid search** - Pure vector search, no FTS fallback. Could be added later.
- **Sync embedding calls** - Each upsert calls the embedding API synchronously; batch support is not implemented.
- **No pagination** - Search returns up to `k` results, with no cursor-based pagination.
If continuing development, consider:
- **Chunking support** - Auto-split long documents, store as `docId::chunkN`
- **Hybrid search** - Add an FTS5 table, merge vector + keyword results
- **Batch embeddings** - Embed multiple texts in one API call
- **Background indexing** - Queue-based async embedding
- **Metadata filtering** - SQL WHERE clauses on `meta_json` fields
- **Multi-index** - Support different embedding models in the same DB
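The chunking idea above could start as simply as a fixed-size splitter that derives ids in the proposed `docId::chunkN` scheme. A naive sketch (real chunking would split on sentence or token boundaries, not raw character offsets):

```typescript
// Naive sketch of the proposed chunking scheme: split text into fixed-size
// pieces and derive ids as `docId::chunkN` for individual upserts.
function chunkDocument(
  docId: string,
  text: string,
  size = 800,
): Array<{ id: string; text: string }> {
  const out: Array<{ id: string; text: string }> = [];
  for (let i = 0, n = 0; i < text.length; i += size, n++) {
    out.push({ id: `${docId}::chunk${n}`, text: text.slice(i, i + size) });
  }
  return out;
}

const chunks = chunkDocument("doc-42", "x".repeat(2000), 800);
// -> ids doc-42::chunk0, doc-42::chunk1, doc-42::chunk2
```

Search results could then be grouped back to the parent document by splitting ids on `::`.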
- `GET /test` - Inserts 5 demo records, runs searches, shows results.
- `GET /seed?n=1000` - Seeds 1000 synthetic records (takes ~8 minutes).
- `GET /calibrate?q=your+query` - Analyzes the distance distribution and suggests thresholds.
| Variable | Required | Default | Description |
|---|---|---|---|
| `EMBEDDING_PROVIDER` | No | `nebius` | Preset: `nebius`, `openai`, `openrouter` |
| `NEBIUS_API_KEY` | If nebius | - | Nebius API key |
| `OPENAI_API_KEY` | If openai | - | OpenAI API key |
| `OPENROUTER_API_KEY` | If openrouter | - | OpenRouter API key |
| `EMBEDDING_API_URL` | No | (from preset) | Custom API URL |
| `EMBEDDING_API_KEY` | No | - | Generic API key fallback |
| `EMBEDDING_MODEL` | No | (from preset) | Override model name |
| `EMBEDDING_DIM` | No | (from preset) | Override dimensions |
- **Dimension mismatch error** - The provider returned a different dimension than expected. Check that the `EMBEDDING_DIM` env var matches your model.
- **Missing API key** - Set the appropriate env var for your provider.
- **Irrelevant search results** - Lower `maxDistance` (try 0.5 instead of 0.7).
- **Slow upserts** - Normal: each insert requires an embedding API call (~460 ms). Batch support is not implemented.
- **Switching providers or models** - Clear data with `POST /clear?confirm=yes` and re-insert.
- TypeScript throughout
- Proper error handling with typed errors
- Parameterized SQL (no injection risk)
- Content hash prevents unnecessary re-embedding
- Dimension assertion fails fast on mismatch
- 30s timeout on embedding API calls
- AbortController for cancellation
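The dimension assertion above is worth making concrete, since it is the main guard against silently storing vectors that do not fit the `F32_BLOB(4096)` column. A sketch (names hypothetical; the real check lives in `vectordb.ts`):

```typescript
// Sketch of the fail-fast dimension assertion: reject an embedding whose
// length differs from the table's F32_BLOB dimension before any SQL runs.
const EXPECTED_DIM = 4096; // from EMBEDDING_DIM / the provider preset

function assertDimension(embedding: number[], expected = EXPECTED_DIM): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding dimension mismatch: got ${embedding.length}, ` +
        `expected ${expected}. Check EMBEDDING_DIM against your model.`,
    );
  }
}

assertDimension(new Array(4096).fill(0)); // passes silently
```

Failing here, rather than on insert, yields a clear error message instead of an opaque SQLite type error.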
- Verified Val Town SQLite supports vectors - F32_BLOB, libsql_vector_idx, vector_top_k all work
- Tested Nebius embedding API - Qwen3-Embedding-8B returns 4096 dims
- Built core vectordb.ts - upsert, search, delete, stats
- Added optimizations - compress_neighbors=float8, max_neighbors=64
- Added distance scores - Returns cosine distance in results
- Added maxDistance filter - Filter out low-relevance results
- Added admin tools - /seed, /calibrate, /stats, /clear
- Made multi-provider - Nebius, OpenAI, OpenRouter, custom
- Documented everything - README, GUIDE, HANDOVER
- Val: https://www.val.town/x/kamenxrider/slimarmor
- Endpoint: https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run
- Module: https://esm.town/v/kamenxrider/slimarmor/vectordb.ts
This document is for the next developer/AI continuing work on SlimArmor.