slimarmor — Semantic vector DB on Val Town SQLite (DiskANN, hybrid search)
SlimArmor — Technical Handover

Version: 4.1
Status: Production-ready
Last Updated: 2026-03-02

This document is for developers (or AI assistants) continuing work on SlimArmor. It covers architecture, design decisions, known limitations, and what to do next.


What This Is

SlimArmor is a self-hosted vector database for Val Town. It runs entirely on Val Town's built-in SQLite (powered by Turso/libSQL), which has native vector extensions (DiskANN index, F32_BLOB columns, vector_top_k). No external database is required.

The system consists of:

  • vectordb.ts — the core library (importable by other vals)
  • api.ts — the HTTP API layer
  • ui.ts — the browser CLI served at /ui

Architecture

Browser CLI (ui.ts)
      │
      ▼
HTTP API (api.ts)
  - Route matching
  - Input validation
  - Auth (bearer token)
  - Error handling
      │
      ▼
Vector DB Core (vectordb.ts)
  - setup()         Creates tables + DiskANN index (idempotent, guarded)
  - upsert()        Single record: hash check → embed → write
  - upsertMany()    Batch: hash check → batch embed → sqlite.batch() write
  - search()        embed query → vector_top_k → optional filter → optional hybrid
  - remove()        DELETE + FTS sync
  - stats()         COUNT + storage estimate
  - listIds()       Paginated ID list
  - get()           Single record fetch
  - reindex()       DROP + CREATE INDEX
      │                          │
      ▼                          ▼
Embedding API          Val Town SQLite (libSQL)
(OpenAI-compat)         - vectordb table (F32_BLOB)
  batch size: 96         - vectordb_embedding_idx (DiskANN)
  timeout: 30s           - vectordb_fts (FTS5)
                         - vectordb_meta

Key Design Decisions

1. Content-hash deduplication

Every record stores a SHA-256 hash of its text. On upsert, we compare hashes — if unchanged, we skip the embedding API call and only update metadata. This saves real money at scale (embedding APIs charge per token).

2. sqlite.batch() for bulk writes (v4.1)

upsertMany previously looped with individual sqlite.execute() calls. Now it builds arrays of statements and calls sqlite.batch([...]) once. This reduces write latency by ~3× for large batches. The FTS DELETE + INSERT pairs are included in the same batch.
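The batched write can be sketched as a pure statement builder (shapes and SQL are illustrative assumptions — the real upsertMany also performs hash checks before embedding):

```typescript
// Sketch of building one sqlite.batch() payload for a bulk upsert.
// Record/statement shapes are assumptions, not the exact vectordb.ts
// internals; vector32() is libSQL's text-to-vector conversion function.
interface EmbeddedRecord {
  id: string;
  text: string;
  textHash: string;
  embedding: number[]; // already returned by the embedding API
}

interface Statement {
  sql: string;
  args: (string | number)[];
}

export function buildUpsertStatements(records: EmbeddedRecord[]): Statement[] {
  const now = Date.now();
  const stmts: Statement[] = [];
  for (const r of records) {
    stmts.push({
      sql: `INSERT INTO vectordb (id, text, text_hash, embedding, updated_at)
            VALUES (?, ?, ?, vector32(?), ?)
            ON CONFLICT(id) DO UPDATE SET
              text = excluded.text, text_hash = excluded.text_hash,
              embedding = excluded.embedding, updated_at = excluded.updated_at`,
      args: [r.id, r.text, r.textHash, JSON.stringify(r.embedding), now],
    });
    // Keep FTS in sync inside the same batch: delete stale row, insert fresh.
    stmts.push({ sql: "DELETE FROM vectordb_fts WHERE id = ?", args: [r.id] });
    stmts.push({ sql: "INSERT INTO vectordb_fts (id, text) VALUES (?, ?)", args: [r.id, r.text] });
  }
  return stmts; // hand the whole array to sqlite.batch() in one round trip
}
```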

3. _setupDone module-level guard (v4.1)

setup() was called on every non-UI request. Even though CREATE TABLE IF NOT EXISTS is idempotent, it still hits SQLite 3–5 times. The guard flag makes subsequent warm calls a no-op (instant return). Resets on cold start — which is fine, since setup only needs to run once per process lifetime.
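The guard is plain module-level state. A sketch with an injected executor (names are assumptions; in vectordb.ts the DDL runs directly against Val Town SQLite):

```typescript
// Sketch of the module-level setup guard described above.
let _setupDone = false;

export async function setup(execute: (sql: string) => Promise<void>): Promise<void> {
  if (_setupDone) return; // warm path: instant no-op
  await execute("CREATE TABLE IF NOT EXISTS vectordb_meta (key TEXT PRIMARY KEY, value TEXT)");
  // ...remaining CREATE TABLE / CREATE INDEX statements go here...
  _setupDone = true; // only set after all DDL succeeded, so a failed
                     // cold-start setup retries on the next call
}
```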

4. FTS5 as optional enhancement

FTS5 support is detected at runtime and cached in FTS_AVAILABLE. If unavailable, hybrid search gracefully falls back to pure vector. This keeps the core robust across environments.

5. DiskANN index configuration

The index is created with these non-default settings for storage efficiency:

  • max_neighbors=64 — reduces from ~192 default; saves index storage
  • compress_neighbors=float8 — 75% compression vs float32; small accuracy tradeoff
  • metric=cosine — appropriate for text embeddings

These can all be overridden via env vars and applied with POST /reindex.

6. EMBEDDING_DIM=auto

When set, the first request probes the embedding API with a test string to detect the dimension, then stores it in vectordb_meta. Subsequent requests read from there. This avoids schema mismatches when switching models.


Database Schema (as created)

CREATE TABLE vectordb_meta (
  key   TEXT PRIMARY KEY,
  value TEXT
);

CREATE TABLE vectordb (
  id         TEXT PRIMARY KEY,
  text       TEXT NOT NULL,
  text_hash  TEXT NOT NULL,     -- SHA-256 of text content
  embedding  F32_BLOB(4096),    -- actual dim varies by provider
  meta_json  TEXT,              -- JSON blob, nullable
  updated_at INTEGER NOT NULL   -- Unix ms
);

CREATE INDEX vectordb_embedding_idx ON vectordb (
  libsql_vector_idx(embedding,
    'metric=cosine',
    'max_neighbors=64',
    'compress_neighbors=float8'
  )
);

CREATE VIRTUAL TABLE vectordb_fts USING fts5(id, text);

Shadow tables created automatically by libSQL:

  • vectordb_embedding_idx_shadow
  • vectordb_embedding_idx_shadow_idx
  • libsql_vector_meta_shadow
  • vectordb_fts_config, vectordb_fts_content, vectordb_fts_data, vectordb_fts_docsize, vectordb_fts_idx

Critical Warning: Don't Manually Delete Rows

If you DELETE FROM vectordb via raw SQL, the DiskANN shadow tables go out of sync. Subsequent inserts will fail with:

SQLITE_UNKNOWN: SQLite error: vector index(insert): failed to insert shadow row

Fix: Call POST /reindex to drop and recreate the index.

Prevention: Always use POST /clear?confirm=yes, which handles both the main table and FTS atomically.


Embedding API Details

SlimArmor is provider-agnostic — any OpenAI-compatible embedding API works. The provider selection is purely a convenience layer for pre-filling API URLs and default model names. What actually matters:

  • Protocol: POST /v1/embeddings with body {"model": "...", "input": ["text1", "text2", ...]}
  • Response: standard OpenAI format with data[].embedding (array of floats) and data[].index
  • Dimensions: the length of each returned embedding array — this is the critical value. It's locked into the F32_BLOB(N) column type at table creation time and cannot be changed without a full DB reset
  • Batch size: 96 texts per API call (OpenAI's documented limit — conservative enough for all compatible providers)
  • Timeout: 30 seconds (AbortController)
  • Response parsing: indexed by item.index to handle out-of-order responses correctly
  • Dim assertion: all returned vectors are checked against the expected dimension and will throw on mismatch
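A minimal embed-batch call matching the list above might look like this (a sketch — env/URL handling is simplified and the function names are assumptions, not the actual vectordb.ts API):

```typescript
// Sketch of one batched call to an OpenAI-compatible /v1/embeddings
// endpoint: 30 s AbortController timeout, index-based reordering, and
// a dimension assertion.
interface EmbeddingItem {
  index: number;
  embedding: number[];
}

// Pure helper: items may arrive out of order, so sort by item.index.
export function orderByIndex(items: EmbeddingItem[]): number[][] {
  return [...items].sort((a, b) => a.index - b.index).map((i) => i.embedding);
}

export function assertDim(vectors: number[][], expected: number): void {
  for (const v of vectors) {
    if (v.length !== expected) {
      throw new Error(`embedding dim ${v.length} != expected ${expected}`);
    }
  }
}

export async function embedBatch(
  apiUrl: string,
  apiKey: string,
  model: string,
  texts: string[], // caller slices into chunks of <= 96
  expectedDim: number,
): Promise<number[][]> {
  const ctrl = new AbortController();
  const timer = setTimeout(() => ctrl.abort(), 30_000);
  try {
    const res = await fetch(apiUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
      body: JSON.stringify({ model, input: texts }),
      signal: ctrl.signal,
    });
    if (!res.ok) throw new Error(`embeddings API ${res.status}`);
    const { data } = await res.json() as { data: EmbeddingItem[] };
    const vectors = orderByIndex(data);
    assertDim(vectors, expectedDim);
    return vectors;
  } finally {
    clearTimeout(timer);
  }
}
```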

Dimension detection flow

  1. If EMBEDDING_DIM is set to a number → use that value directly
  2. If EMBEDDING_DIM=auto → probe the API with a test string, store result in vectordb_meta, reuse on subsequent calls
  3. If EMBEDDING_DIM is unset → use the provider preset default (e.g. 4096 for Nebius, 1536 for OpenAI)

Once the table is created, RESOLVED_DIM is cached in memory and also persisted in vectordb_meta so it survives cold starts.
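The three-step resolution can be sketched as a single function (dependencies are injected for illustration; the real code reads env vars and the vectordb_meta table):

```typescript
// Sketch of the EMBEDDING_DIM resolution order described above.
// `probe()` and `meta` are stand-ins for the embedding-API probe and
// the vectordb_meta table.
export async function resolveDim(
  envDim: string | undefined,   // EMBEDDING_DIM env var
  presetDefault: number,        // e.g. 4096 for Nebius, 1536 for OpenAI
  meta: Map<string, string>,    // stand-in for vectordb_meta
  probe: () => Promise<number>, // embeds a test string, returns vector length
): Promise<number> {
  if (envDim && envDim !== "auto") return Number(envDim); // 1. explicit value
  if (envDim === "auto") {                                // 2. auto-detect
    const cached = meta.get("embedding_dim");
    if (cached) return Number(cached);
    const dim = await probe();
    meta.set("embedding_dim", String(dim)); // persist so it survives cold starts
    return dim;
  }
  return presetDefault;                                   // 3. preset default
}
```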


Auth Model

  • If ADMIN_TOKEN env var is not set: open mode — all operations allowed
  • If ADMIN_TOKEN is set: write operations require Authorization: Bearer <token> header
  • Read operations (/search, /get, /list, /stats, /ping, /calibrate) are always public
  • The browser CLI stores the token in sessionStorage (cleared on tab close)
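The rules above condense into one predicate (a sketch; the route list and function name are illustrative, not the api.ts implementation):

```typescript
// Sketch of the auth decision. Read routes are always public; writes
// need a bearer token only when ADMIN_TOKEN is configured.
const READ_PATHS = new Set(["/search", "/get", "/list", "/stats", "/ping", "/calibrate"]);

export function isAuthorized(
  path: string,
  authHeader: string | null,      // value of the Authorization header
  adminToken: string | undefined, // ADMIN_TOKEN env var
): boolean {
  if (READ_PATHS.has(path)) return true; // reads are always public
  if (!adminToken) return true;          // open mode: no token configured
  return authHeader === `Bearer ${adminToken}`;
}
```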

Module-Level State

These variables live at module scope in vectordb.ts and reset on cold start:

| Variable | Purpose | Reset on cold start? |
| --- | --- | --- |
| `RESOLVED_DIM` | Cached embedding dimension | Yes — re-read from `vectordb_meta` |
| `FTS_AVAILABLE` | Whether FTS5 is usable | Yes — re-detected on first call |
| `_setupDone` | Setup guard flag | Yes — setup re-runs once per process |

Cold starts in Val Town happen frequently. These caches only help while a process stays warm: they avoid redundant work across requests served by the same instance, and everything is re-resolved after each cold start.


Known Limitations

1. Fixed embedding dimension per database

The table schema hardcodes the vector dimension at creation time (F32_BLOB(4096)). Changing providers or models requires:

  1. POST /export to save text + meta
  2. POST /clear?confirm=yes to wipe the DB
  3. Update env vars
  4. POST /import to re-embed everything

EMBEDDING_DIM=auto helps by detecting the dimension dynamically, but once the table exists you can't change it without a full reset.

2. Naive chunking

The built-in chunkText() function splits on character count with a whitespace-finding heuristic. It doesn't respect sentence boundaries, paragraphs, or semantic units. For production RAG use cases, consider pre-chunking text before inserting.
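One way to pre-chunk before inserting is a simple sentence-boundary packer (a sketch of the recommendation above — this is not part of SlimArmor, and the name `chunkBySentence` is hypothetical):

```typescript
// Sketch of sentence-aware pre-chunking: split on sentence-ending
// punctuation, then pack sentences greedily up to a character budget.
// Run something like this before calling /upsert for RAG use cases.
export function chunkBySentence(text: string, maxChars = 800): string[] {
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)|[^.!?]+$/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if (current && current.length + s.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```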

3. Hybrid search is re-ranking only

The hybrid mode re-ranks vector results using BM25 keyword scores. It does not perform a full union of vector + keyword candidates. Records that are keyword matches but outside the vector top-K are not surfaced.
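The re-rank-only behavior can be illustrated with a tiny score combiner (a sketch — the 0.7/0.3 blend and names are assumptions, not the actual weighting in vectordb.ts):

```typescript
// Sketch of hybrid re-ranking: only the vector top-K candidates are
// rescored with keyword (BM25) scores; keyword-only matches that fell
// outside the vector top-K never enter the pool.
interface Candidate {
  id: string;
  score: number; // e.g. cosine similarity, higher is better
}

export function rerankHybrid(
  vectorTopK: Candidate[],         // candidates from vector_top_k
  bm25Scores: Map<string, number>, // normalized keyword scores by id
  vectorWeight = 0.7,              // illustrative blend
): Candidate[] {
  return vectorTopK
    .map((c) => ({
      id: c.id,
      score: vectorWeight * c.score + (1 - vectorWeight) * (bm25Scores.get(c.id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```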

4. No cursor-based pagination

Pagination is offset-based (offset param on /list and /search). For large datasets with frequent inserts, results can shift between pages.

5. Single table / namespace

All records share one vectordb table. There's no multi-tenant or namespaced storage. If you need logical separation, use the prefix param on /list and metadata filters on /search, or fork and deploy separate instances.


Performance Benchmarks

Measured with Nebius Qwen/Qwen3-Embedding-8B (4096 dims), 105 records:

| Operation | Latency |
| --- | --- |
| Embed 10 records (1 batch) | ~1.2 s |
| upsertMany 10 records | ~1.4 s total |
| search (vector only) | <100 ms |
| search (hybrid) | ~150 ms |
| setup() cold (first call) | ~200 ms |
| setup() warm (guarded) | <1 ms |

Storage:

  • ~22 KB per record (16 KB raw vector at 4096 dims × 4 bytes, plus float8-compressed index data, text, and row overhead)
  • ~47,500 records per 1 GB
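The two figures are arithmetically consistent (assuming KiB and the stated ~22 KB per record):

```typescript
// Quick sanity check on the storage estimates above.
const vectorBytes = 4096 * 4;     // raw F32 vector: 16,384 B (16 KiB)
const perRecordBytes = 22 * 1024; // stated ~22 KB incl. index data + text
const recordsPerGiB = Math.floor(2 ** 30 / perRecordBytes);
// recordsPerGiB ≈ 47,662, matching the "~47,500 per 1 GB" figure
```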

Future Improvements

Prioritized by impact:

High value

  1. Namespace/collection support — partition records by collection name (table prefix or extra column + index)
  2. Hybrid union retrieval — merge vector candidates and keyword candidates before ranking
  3. Async embed queue — background embedding via interval val, for non-blocking imports

Medium value

  1. Sentence-aware chunking — use sentence boundaries for better chunk quality
  2. Cursor pagination — stable pagination using updated_at + id cursor
  3. Webhook on upsert — fire a webhook after batch upserts complete

Low value / nice to have

  1. Multi-index — different embedding models in same DB
  2. Range filters — meta.date > "2024-01-01" style filtering
  3. Delete by filter — delete all records matching metadata criteria
  4. Rate limiting — per-IP limits on search endpoint

Environment Variables (Complete Reference)

| Variable | Default | Description |
| --- | --- | --- |
| ADMIN_TOKEN | — | Enables auth for write ops |
| EMBEDDING_PROVIDER | nebius | Provider preset: nebius, openai, openrouter |
| NEBIUS_API_KEY | — | Nebius key (used when provider=nebius) |
| OPENAI_API_KEY | — | OpenAI key (used when provider=openai) |
| OPENROUTER_API_KEY | — | OpenRouter key (used when provider=openrouter) |
| EMBEDDING_API_URL | (preset) | Override API URL |
| EMBEDDING_API_KEY | — | Generic fallback key |
| EMBEDDING_MODEL | (preset) | Override model name |
| EMBEDDING_DIM | (preset) | Override dimensions, or `auto` |
| INDEX_METRIC | cosine | `cosine` or `l2` |
| INDEX_MAX_NEIGHBORS | 64 | Graph degree (8–256) |
| INDEX_COMPRESS_NEIGHBORS | float8 | float8, float16, floatb16, float32, float1bit, none |
| INDEX_ALPHA | 1.2 | DiskANN density (≥1) |
| INDEX_SEARCH_L | 200 | Query-time effort |
| INDEX_INSERT_L | 70 | Insert-time effort |
| ALLOW_WRITE_TESTS | 0 | Enable /validate?write=yes |
| ALLOW_WRITE_TESTS_NOAUTH | 0 | Skip auth for write tests |

Links

  • Val: https://www.val.town/x/kamenxrider/slimarmor
  • API: https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run
  • Browser CLI: https://kamenxrider--95fbe492ffe111f0bee942dde27851f2.web.val.run/ui
  • Module import: https://esm.town/v/kamenxrider/slimarmor/vectordb.ts