Semantic vector DB on Val Town SQLite — DiskANN, hybrid search
2/3/2026
Viewing readonly version of main branch: v61

SlimArmor - Technical Handover Document

Last Updated: 2026-02-02
Status: Production-ready v4 (with Browser CLI)


Project Overview

SlimArmor is a mini vector database for Val Town that provides semantic search capabilities using SQLite with libSQL/Turso vector extensions.

What It Does

  • Stores text with AI-generated embeddings (4096 dimensions by default)
  • Enables semantic search (search by meaning, not keywords)
  • Returns distance scores for ranking results
  • Supports any OpenAI-compatible embedding API
  • NEW: Browser CLI for interactive management
  • NEW: Metadata filtering + hybrid keyword boost
  • NEW: Batch upsert, chunked upsert, export/import
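The distance scores mentioned above are cosine distances, where lower values mean closer matches. As a minimal sketch of what such a score represents (not the library's own code, which delegates to libSQL's `vector_distance_cos`), the helper names below are hypothetical:

```typescript
// Cosine distance between two equal-length vectors: 1 - cosine similarity.
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for a zero vector");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidates by ascending distance (closest first).
function rankByDistance(
  query: number[],
  candidates: { id: string; embedding: number[] }[],
): { id: string; distance: number }[] {
  return candidates
    .map((c) => ({ id: c.id, distance: cosineDistance(query, c.embedding) }))
    .sort((x, y) => x.distance - y.distance);
}
```

In the real system the ranking happens inside SQLite; this sketch only illustrates how to read the returned numbers.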

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Browser CLI (ui.ts)                       │
│  Terminal-style web UI at /ui                                │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      HTTP API (api.ts)                       │
│  /ui, /upsert, /search, /delete, /stats, /calibrate, etc.   │
│  + Admin token authentication for write operations           │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                   Vector DB Core (vectordb.ts)               │
│  - setup(), upsert(), search(), remove(), stats()           │
│  - Content hash check (skip re-embedding unchanged text)    │
│  - Dimension assertion (fail fast on mismatch)              │
└─────────────────────────────────────────────────────────────┘
                    │                       │
                    ▼                       ▼
┌──────────────────────────┐  ┌────────────────────────────────┐
│   Embedding Provider     │  │   Val Town SQLite (Turso)      │
│   (OpenAI-compatible)    │  │   - F32_BLOB vector columns    │
│   - Nebius (default)     │  │   - libsql_vector_idx (DiskANN)│
│   - OpenAI               │  │   - vector_top_k queries       │
│   - OpenRouter           │  │   - vector_distance_cos        │
│   - Custom               │  └────────────────────────────────┘
└──────────────────────────┘

Key Files

ui.ts - Browser CLI (NEW in v4)

A terminal-style web interface for interacting with the vector database.

Features:

  • Monospace terminal aesthetic (GitHub dark theme)
  • Command history (↑/↓ arrows)
  • Clickable IDs in results (auto-fills get <id>)
  • Color-coded distance scores (green <0.5, orange 0.5-0.65, red >0.65)
  • Session-based auth token storage
  • Mobile responsive
  • Pretty tables and stats boxes (not raw JSON)
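The color coding listed above can be sketched as a small pure function. This is an illustration rather than the code in ui.ts; the function name is hypothetical, and treating both 0.5 and 0.65 as orange follows the stated ranges.

```typescript
type DistanceColor = "green" | "orange" | "red";

// Maps a cosine distance to the color bands described for the Browser CLI:
// green below 0.5, orange from 0.5 through 0.65, red above 0.65.
function distanceColor(distance: number): DistanceColor {
  if (distance < 0.5) return "green";
  if (distance <= 0.65) return "orange";
  return "red";
}
```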

Access: GET /ui

CLI Commands:

| Command | Description |
| --- | --- |
| help | Show all available commands |
| stats | Storage & index statistics |
| search "query" [-k 10] [--mode balanced] [--max 0.64] [--filter key=value] [--hybrid] [--alpha 0.25] [--offset 0] | Semantic search |
| get <id> | Show record details |
| list [--limit 20] [--prefix seed-] | List record IDs |
| calibrate "query" | Suggest distance thresholds |
| upsert <id> "text..." | Create/update record [AUTH] |
| del <id> | Delete record [AUTH] |
| seed 100 | Seed synthetic data [AUTH] |
| clear --yes | Delete ALL records [AUTH] |
| reindex | Recreate vector index [AUTH] |
| auth <token> | Set admin token |
| logout | Clear auth token |
| ping | Health check |
| cls | Clear output |

Command aliases: ls → list, delete → del, ? → help

Search modes:

  • --mode tight → maxDistance 0.5 (top 3 only)
  • --mode balanced → maxDistance 0.64 (top 10)
  • --mode loose → maxDistance 0.7 (include all)
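The three modes above are presets for the maxDistance cutoff. A minimal sketch of that mapping follows; the names are hypothetical, and the inclusive comparison is an assumption, since the document does not say whether a hit exactly at the threshold is kept.

```typescript
type SearchMode = "tight" | "balanced" | "loose";

// Threshold per mode, as listed in this document.
const MODE_MAX_DISTANCE: Record<SearchMode, number> = {
  tight: 0.5,
  balanced: 0.64,
  loose: 0.7,
};

interface Hit {
  id: string;
  distance: number;
}

// Keeps only hits at or below the mode's cutoff (inclusive: an assumption).
function applyMode(hits: Hit[], mode: SearchMode): Hit[] {
  const maxDistance = MODE_MAX_DISTANCE[mode];
  return hits.filter((hit) => hit.distance <= maxDistance);
}
```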

vectordb.ts - Core Library

The main library that can be imported into other vals.

Key exports:

  • setup() - Creates table and index
  • upsert(id, text, meta?) - Insert/update with smart re-embedding
  • search(query, k?, maxDistance?) - Semantic search
  • remove(id) - Delete record
  • stats() - Count and storage estimate
  • get(id) - Get single record
  • listIds(limit?) - List all IDs
  • reindex() - Recreate index
  • getProviderInfo() - Current embedding config
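A typical call sequence for these exports might look like the sketch below. The function names and parameters come from the list above, but the return shapes (`SearchHit`, `Promise<unknown>`) and the `VectorDbLike` interface are assumptions made so the example type-checks on its own.

```typescript
// Assumed result shape: the document says results carry a distance score,
// but does not specify the exact fields.
interface SearchHit {
  id: string;
  text: string;
  distance: number;
}

// Hypothetical interface covering three of the listed exports.
interface VectorDbLike {
  setup(): Promise<void>;
  upsert(id: string, text: string, meta?: Record<string, unknown>): Promise<unknown>;
  search(query: string, k?: number, maxDistance?: number): Promise<SearchHit[]>;
}

async function indexAndSearch(
  db: VectorDbLike,
  docs: { id: string; text: string; meta?: Record<string, unknown> }[],
  query: string,
): Promise<SearchHit[]> {
  await db.setup(); // creates table and index
  for (const doc of docs) {
    // One embedding call per changed record; unchanged text is skipped by the library.
    await db.upsert(doc.id, doc.text, doc.meta);
  }
  // Top 5 results within the "balanced" threshold.
  return db.search(query, 5, 0.64);
}
```

In a val, the `db` argument would presumably be the namespace imported from https://esm.town/v/kamenxrider/slimarmor/vectordb.ts; whether that module matches this interface exactly should be checked against the source.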

Configuration via env vars:

  • EMBEDDING_PROVIDER - Preset: nebius, openai, openrouter
  • EMBEDDING_API_URL - Custom API URL
  • EMBEDDING_API_KEY - API key
  • EMBEDDING_MODEL - Model name
  • EMBEDDING_DIM - Vector dimensions

api.ts - HTTP Endpoints

RESTful API layer with Browser CLI and admin token authentication.

UI:

  • GET /ui - Browser CLI interface

Core endpoints:

  • POST /upsert - Insert/update record [AUTH]
  • POST /search - Semantic search
  • POST /delete - Delete record [AUTH]
  • GET /get?id=... - Get record
  • GET /list - List IDs (supports offset/prefix)
  • POST /upsert_chunked - Chunk + upsert [AUTH]
  • GET /export - Export records [AUTH]
  • POST /import - Import records (batch upsert) [AUTH]

Admin endpoints:

  • GET / - API info + provider config
  • GET /ping - Health check
  • GET /stats - Detailed storage stats
  • GET /test - Smoke test [AUTH]
  • GET /seed?n=100 - Seed synthetic data [AUTH]
  • GET /calibrate?q=... - Threshold suggestions
  • GET /validate - Self-checks (optional write tests)
  • POST /reindex - Recreate index [AUTH]
  • POST /clear?confirm=yes - Delete all [AUTH]
  • GET /auth - Auth status

Authentication (NEW in v4):

  • Set ADMIN_TOKEN env var to enable authentication
  • Pass token via Authorization: Bearer <token> header
  • Write operations ([AUTH]) require auth when ADMIN_TOKEN is set
  • If ADMIN_TOKEN is not set, all operations are open (current default)
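The rules above amount to a small check on the Authorization header for [AUTH] endpoints. The sketch below is an illustration under that reading, not the code in api.ts; the function name is hypothetical.

```typescript
// Returns true when a write request may proceed.
// adminToken is the value of the ADMIN_TOKEN env var (undefined when unset).
function isAuthorized(
  authorizationHeader: string | null,
  adminToken: string | undefined,
): boolean {
  if (!adminToken) return true; // auth disabled: all operations are open
  if (!authorizationHeader) return false;
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader.trim());
  return match !== null && match[1] === adminToken;
}
```

A hardened version would likely compare tokens in constant time; plain equality keeps the sketch short.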

Database Schema

CREATE TABLE vectordb (
  id TEXT PRIMARY KEY,
  text TEXT NOT NULL,
  text_hash TEXT NOT NULL,      -- SHA-256 for change detection
  embedding F32_BLOB(4096),     -- Vector column (dimension varies by provider)
  meta_json TEXT,               -- Optional JSON metadata
  updated_at INTEGER NOT NULL   -- Unix timestamp ms
);

CREATE INDEX vectordb_embedding_idx ON vectordb (
  libsql_vector_idx(embedding, 'metric=cosine', 'max_neighbors=64', 'compress_neighbors=float8')
);

CREATE VIRTUAL TABLE vectordb_fts USING fts5(id, text);

CREATE TABLE vectordb_meta (
  key TEXT PRIMARY KEY,
  value TEXT
);
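A nearest-neighbor query against this schema could be built as in the sketch below. It follows the `vector_top_k` join pattern from Turso's vector search documentation and is not necessarily the exact statement vectordb.ts issues; the function and type names are hypothetical.

```typescript
interface SqlStatement {
  sql: string;
  args: (string | number)[];
}

// Builds a parameterized top-k query: vector_top_k returns rowids of the
// nearest index entries, which are joined back to the main table.
function buildTopKQuery(queryEmbedding: number[], k: number): SqlStatement {
  if (!Number.isInteger(k) || k <= 0) {
    throw new Error("k must be a positive integer");
  }
  const vectorJson = JSON.stringify(queryEmbedding); // e.g. "[0.1,0.2]"
  return {
    sql: `SELECT v.id, v.text, v.meta_json,
       vector_distance_cos(v.embedding, vector32(?)) AS distance
FROM vector_top_k('vectordb_embedding_idx', vector32(?), ?) AS t
JOIN vectordb AS v ON v.rowid = t.id
ORDER BY distance`,
    args: [vectorJson, vectorJson, k],
  };
}
```

If the SQLite client accepts `{ sql, args }` objects, the returned statement can be passed to its execute call directly. A maxDistance filter would then be applied to the `distance` column of the rows.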

Index Optimizations

We tested and applied these Turso-documented optimizations:

| Setting | Value | Why |
| --- | --- | --- |
| metric=cosine | Cosine distance | Standard for text embeddings |
| max_neighbors=64 | 64 neighbors | Down from default ~192, saves storage |
| compress_neighbors=float8 | 1 byte/dim | 75% less index storage |

Trade-off: Slightly lower recall accuracy, significantly lower storage.


Verified Performance (105 records)

| Metric | Value |
| --- | --- |
| Storage per record | ~22 KB |
| Estimated max records/GB | ~47,500 |
| Embedding latency (Nebius) | ~460ms |
| Search latency | <100ms |
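The capacity figure can be reproduced with simple arithmetic. The sketch below assumes binary units (1 GB = 1024³ bytes, 22 KB = 22 × 1024 bytes), which is a guess about how the estimate was made; it also recomputes the ~8 minutes quoted for seeding 1000 records from the ~460ms embedding latency. Both are extrapolations from a 105-record test, so treat them as rough.

```typescript
const KIB = 1024;
const GIB = 1024 ** 3;

// How many records of a given size fit in a storage budget.
function estimateMaxRecords(bytesPerRecord: number, budgetBytes: number): number {
  return Math.floor(budgetBytes / bytesPerRecord);
}

// Wall-clock minutes for sequential embedding calls (ignores DB write time).
function estimateSeedMinutes(records: number, msPerEmbedding: number): number {
  return (records * msPerEmbedding) / 60_000;
}

const maxRecordsPerGiB = estimateMaxRecords(22 * KIB, GIB); // 47662
const seedMinutes = estimateSeedMinutes(1000, 460);         // about 7.67
console.log({ maxRecordsPerGiB, seedMinutes });
```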

Distance Distribution

From calibration with "machine learning" query:

  • Min: 0.46 (highly relevant)
  • Median: 0.64
  • Max: 0.67 (least relevant in top 20)

Recommended thresholds:

  • Tight: 0.5 (top 3 only)
  • Balanced: 0.64 (top 10)
  • Loose: 0.7 (include all)

Known Limitations

  1. Single embedding dimension - Table created with fixed dimension. Changing providers requires clearing data.

  2. Chunking is naive - Built-in chunker is character-based and may not respect sentences.

  3. Hybrid search is boost-only - Keyword scores re-rank vector results; it does not union keyword-only matches.

  4. FTS optional - If FTS5 is unavailable, hybrid search silently falls back to vector-only.

  5. Sync embedding calls - Each upsert calls embedding API synchronously. Batch support not implemented.

  6. No cursor pagination - Search returns up to k results. The CLI search command accepts --offset, but there is no cursor-based pagination.
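To make limitation 2 concrete, a character-based chunker can be sketched as below. This is an illustration of the general technique, not the built-in chunker: the function name and the `chunkSize` and `overlap` parameters are hypothetical, and the real implementation may differ.

```typescript
// Splits text into fixed-size character windows with optional overlap.
// Boundaries fall wherever the count lands, so sentences and words can be cut.
function chunkByChars(text: string, chunkSize: number, overlap = 0): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new Error("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be an integer in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Because `slice` counts UTF-16 code units, a chunk boundary can also split an emoji or other surrogate pair, which is one more reason a sentence-aware splitter may be worth adding.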


Future Improvements (Not Implemented)

If continuing development, consider:

  1. Hybrid union retrieval - Include keyword-only matches beyond vector candidates

  2. Background indexing - Queue-based async embedding

  3. Advanced filters - Range queries, nested JSON paths, boolean logic

  4. Cursor pagination - Stable pagination for large datasets

  5. Multi-index - Support different embedding models in same DB


Testing

Browser CLI

GET /ui

Interactive terminal interface - type help to see commands.

Smoke Test

GET /test

Inserts 5 demo records, runs searches, shows results. (Requires auth if ADMIN_TOKEN set)

Scale Test

GET /seed?n=1000

Seeds 1000 synthetic records (takes ~8 minutes). (Requires auth if ADMIN_TOKEN set)

Threshold Calibration

GET /calibrate?q=your+query

Analyzes distance distribution, suggests thresholds.


Environment Variables Reference

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| ADMIN_TOKEN | No | - | Enable auth for write operations |
| EMBEDDING_PROVIDER | No | nebius | Preset: nebius, openai, openrouter |
| NEBIUS_API_KEY | If nebius | - | Nebius API key |
| OPENAI_API_KEY | If openai | - | OpenAI API key |
| OPENROUTER_API_KEY | If openrouter | - | OpenRouter API key |
| EMBEDDING_API_URL | No | (from preset) | Custom API URL |
| EMBEDDING_API_KEY | No | - | Generic API key fallback |
| EMBEDDING_MODEL | No | (from preset) | Override model name |
| EMBEDDING_DIM | No | (from preset) | Override dimensions (set to auto to detect) |
| INDEX_METRIC | No | cosine | Vector distance metric (cosine or l2) |
| INDEX_MAX_NEIGHBORS | No | 64 | DiskANN neighbors (8-256) |
| INDEX_COMPRESS_NEIGHBORS | No | float8 | float8 or none |
| ALLOW_WRITE_TESTS | No | 0 | Set to 1 to allow write tests via /validate?write=yes |
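The three INDEX_* variables have documented defaults and allowed values, so they can be validated up front. The sketch below shows one way to do that and to render the matching index statement; the function names are hypothetical, and omitting the compression setting when the value is none is an assumption about how that option is applied.

```typescript
interface IndexConfig {
  metric: "cosine" | "l2";
  maxNeighbors: number;
  compressNeighbors: "float8" | "none";
}

// Returns value typed as one of the allowed literals, or throws.
function oneOf<T extends string>(name: string, value: string, allowed: readonly T[]): T {
  if ((allowed as readonly string[]).indexOf(value) !== -1) return value as T;
  throw new Error(`${name} must be one of ${allowed.join(", ")}, got "${value}"`);
}

// Reads and validates index settings, falling back to the documented defaults.
function readIndexConfig(env: Record<string, string | undefined>): IndexConfig {
  const maxNeighbors = Number(env["INDEX_MAX_NEIGHBORS"] ?? "64");
  if (!Number.isInteger(maxNeighbors) || maxNeighbors < 8 || maxNeighbors > 256) {
    throw new Error("INDEX_MAX_NEIGHBORS must be an integer between 8 and 256");
  }
  return {
    metric: oneOf("INDEX_METRIC", env["INDEX_METRIC"] ?? "cosine", ["cosine", "l2"] as const),
    maxNeighbors,
    compressNeighbors: oneOf(
      "INDEX_COMPRESS_NEIGHBORS",
      env["INDEX_COMPRESS_NEIGHBORS"] ?? "float8",
      ["float8", "none"] as const,
    ),
  };
}

// Renders the CREATE INDEX statement for a validated config.
function indexDdl(config: IndexConfig): string {
  const settings = [`'metric=${config.metric}'`, `'max_neighbors=${config.maxNeighbors}'`];
  if (config.compressNeighbors !== "none") {
    // Assumption: "none" means the compress_neighbors setting is left out.
    settings.push(`'compress_neighbors=${config.compressNeighbors}'`);
  }
  return `CREATE INDEX vectordb_embedding_idx ON vectordb (libsql_vector_idx(embedding, ${settings.join(", ")}))`;
}
```

With no variables set, `indexDdl(readIndexConfig({}))` reproduces the index definition shown in the Database Schema section.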

Troubleshooting

"Embedding dim mismatch"

The provider returned a vector with a different dimension than expected. Check that the EMBEDDING_DIM env var matches your model's output size (for example, 4096 for the default Nebius model).
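The fail-fast check behind this error can be sketched as follows. The function name is hypothetical and the message format is only modeled on the error title above; the real check lives in vectordb.ts.

```typescript
// Throws before any database write if the embedding length is wrong.
function assertEmbeddingDim(embedding: number[], expectedDim: number): void {
  if (embedding.length !== expectedDim) {
    throw new Error(
      `Embedding dim mismatch: expected ${expectedDim}, got ${embedding.length}`,
    );
  }
}
```

Checking before the insert matters because the F32_BLOB column is created with a fixed dimension, so a mismatched vector cannot be stored in the existing table.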

"Missing API key"

Set the appropriate env var for your provider.

Search returns irrelevant results

Lower maxDistance (try 0.5 instead of 0.7).

Slow inserts

Normal - each insert requires an API call (~460ms). Batch support not implemented.

Index errors after changing providers

Clear data with POST /clear?confirm=yes and re-insert.

"Unauthorized" errors

Set ADMIN_TOKEN env var and use auth <token> in CLI, or pass Authorization: Bearer <token> header in API calls.


Code Quality Notes

  • TypeScript throughout
  • Proper error handling with typed errors
  • Parameterized SQL (no injection risk)
  • Content hash prevents unnecessary re-embedding
  • Dimension assertion fails fast on mismatch
  • 30s timeout on embedding API calls
  • AbortController for cancellation
  • Admin token auth for write protection

Session History Summary

  1. Verified Val Town SQLite supports vectors - F32_BLOB, libsql_vector_idx, vector_top_k all work
  2. Tested Nebius embedding API - Qwen3-Embedding-8B returns 4096 dims
  3. Built core vectordb.ts - upsert, search, delete, stats
  4. Added optimizations - compress_neighbors=float8, max_neighbors=64
  5. Added distance scores - Returns cosine distance in results
  6. Added maxDistance filter - Filter out low-relevance results
  7. Added admin tools - /seed, /calibrate, /stats, /clear
  8. Made multi-provider - Nebius, OpenAI, OpenRouter, custom
  9. Documented everything - README, GUIDE, HANDOVER
  10. Added Browser CLI (v4) - Terminal-style web UI at /ui with auth support

Contact / Links

  • Val: https://www.val.town/x/kamenxrider/slimarmor
  • Browser CLI: https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run/ui
  • API Endpoint: https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run
  • Module: https://esm.town/v/kamenxrider/slimarmor/vectordb.ts

This document is for the next developer/AI continuing work on SlimArmor.
