# Search Module

Pluggable semantic search system with multiple embedding strategies.

## Quick Start

### 1. Choose Your Strategy

The fastest option for production is the local ONNX model:

```sh
# Download the model (one-time setup, ~90MB)
cd models
./download-model.sh
```

### 2. Activate the Strategy

Edit `search/index.ts` and uncomment the desired strategy:

```ts
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```

### 3. Use the Search

```ts
import { searchPages } from "./search/index.ts";

const results = await searchPages("How to use Groq API?", pages, {
  limit: 10,
  minScore: 50,
  enableTiming: true,
});
```

## Available Strategies

| Strategy | Speed | Cost | Setup | Best For |
| --- | --- | --- | --- | --- |
| `transformers-local-onnx` ⭐ | ~60-80ms | Free | Download model | Production |
| `transformers-cosine` | ~160-180ms | Free | None (auto-download) | Development |
| `mixedbread-embeddings` | ~50-100ms | Free tier | API key | High accuracy |
| `openai-cosine` | ~100-200ms | Paid | API key | Reliability |
| `hf-inference-qwen3` | ~150-300ms | Free tier | API key | Best accuracy |
| `cloudflare-bge` | ~50-150ms | Free tier | API key | Cloudflare Workers |
| `jigsawstack-orama` | ~550ms | Free tier | API key | Managed solution |

⭐ = Recommended for production

## Files

- `index.ts` - Main entry point; switch strategies here
- `types.ts` - TypeScript interfaces for the search system
- `utils.ts` - Shared utilities (cosine similarity, snippet generation)
- `transformers-local-onnx.ts` - Local ONNX models (fastest, recommended)
- `transformers-cosine.ts` - Auto-download ONNX models
- `mixedbread-embeddings-cosine.ts` - Mixedbread API + local cosine
- `openai-cosine.ts` - OpenAI embeddings + local cosine
- `hf-inference-qwen3-cosine.ts` - HuggingFace Qwen3-8B embeddings
- `cloudflare-bge-cosine.ts` - Cloudflare Workers AI
- `jigsawstack-orama.ts` - JigsawStack managed search
- `mixedbread.ts` - Mixedbread Stores (managed)
- `placeholder.ts` - Fake embeddings for testing

## Documentation

- `models/README.md` - Model setup instructions
- `models/SETUP.md` - Detailed setup guide
- `STRATEGY-COMPARISON.md` - Detailed comparison of all strategies

## Testing

Test the local ONNX model:

```sh
cd models
deno run --allow-read --allow-env --allow-net test-local-model.ts
```

Run the full search harness:

```sh
cd ../testing
deno run --allow-read --allow-env --allow-net test-search.ts
```

## API

### Search Function

```ts
async function searchPages(
  query: string,
  pages: Page[],
  options?: SearchOptions
): Promise<SearchResult[]>
```

**Options:**

- `limit`: Maximum number of results to return (default: 10)
- `minScore`: Minimum similarity score, 0-100 (default: 0)
- `enableTiming`: Log a timing breakdown (default: false)

**Returns:** Array of search results sorted by relevance

### Generate Embeddings

```ts
async function generateEmbeddings(
  content: string
): Promise<number[] | null>
```

**Returns:** A 384-dimensional embedding vector (or the strategy's configured dimensions), or `null` on failure
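
Embeddings from this function can be compared directly with cosine similarity. A minimal sketch (the inline `cosine` helper is for illustration; the module's shared version lives in `utils.ts`):

```ts
import { generateEmbeddings } from "./search/index.ts";

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const a = await generateEmbeddings("How do I stream chat completions?");
const b = await generateEmbeddings("Streaming responses from the chat API");
if (a && b) console.log(`similarity: ${(cosine(a, b) * 100).toFixed(1)}/100`);
```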

## Architecture

```
Query
  ↓
Generate Query Embedding (10-30ms)
  ↓
Compare with Page Embeddings (cosine similarity, <1ms per page)
  ↓
Sort by Similarity
  ↓
Generate Snippets
  ↓
Return Results
```
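
In code, the pipeline reduces to a few steps. A simplified sketch, assuming pages already carry precomputed `embedding` vectors (the `EmbeddedPage` shape here is hypothetical; the real interfaces live in `types.ts`) and reusing `generateEmbeddings` and the `cosine` helper from the example above:

```ts
interface EmbeddedPage { url: string; content: string; embedding: number[]; }

async function searchSketch(query: string, pages: EmbeddedPage[], limit = 10) {
  const queryEmbedding = await generateEmbeddings(query); // 10-30ms
  if (!queryEmbedding) return [];
  return pages
    .map((page) => ({
      page,
      score: cosine(queryEmbedding, page.embedding) * 100, // <1ms per page
    }))
    .sort((a, b) => b.score - a.score) // sort by similarity
    .slice(0, limit)
    .map(({ page, score }) => ({
      url: page.url,
      score,
      // the real implementation generates query-aware snippets (utils.ts)
      snippet: page.content.slice(0, 200),
    }));
}
```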

## Performance Tips

1. Use local ONNX models for production (fastest, most reliable)
2. Pre-calculate embeddings during cache recalculation rather than at query time (see the sketch below)
3. Cache the pipeline (done automatically, but worth noting)
4. Use quantized models if memory is constrained (set `USE_QUANTIZED = true`)
5. Adjust `minScore` to filter low-quality results
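
A minimal sketch of tip 2, assuming a mutable `pages` array (the project itself does this through its cache-recalculation endpoint; see Troubleshooting below):

```ts
import { generateEmbeddings } from "./search/index.ts";

// Embed every page once at index/recalculation time, so a query
// only pays for a single embedding: its own.
async function precomputeEmbeddings(
  pages: { content: string; embedding?: number[] }[],
) {
  for (const page of pages) {
    const embedding = await generateEmbeddings(page.content);
    if (embedding) page.embedding = embedding;
  }
}
```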

## Deployment

### Deno Deploy / Render / Railway

✅ Use `transformers-local-onnx.ts`

- Include model files in the deployment
- Fast, reliable, no network calls

### Cloudflare Workers

✅ Use `cloudflare-bge-cosine.ts`

- Workers have size limits (can't fit local models)
- Cloudflare AI is optimized for Workers

### Val Town

✅ Use `transformers-cosine.ts`

- The isolate caches auto-downloaded models
- There is no persistent file system for pre-downloaded models

### Docker

✅ Use `transformers-local-onnx.ts`

- Include the models in the image:

  ```dockerfile
  COPY search/models/all-MiniLM-L6-v2 /app/search/models/all-MiniLM-L6-v2
  ```

## Troubleshooting

"Failed to load local ONNX model"

Download the model:

cd search/models ./download-model.sh

"Module not found"

Check the import in search/index.ts:

import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";

### Slow performance

1. Check which strategy is active
2. Ensure the model is cached (the first run is slower)
3. Try the quantized model (`USE_QUANTIZED = true`)
4. Check network latency (for API-based strategies)

### Different embedding dimensions

If you switch between strategies with different embedding dimensions, force a recalculation:

```
GET /cache/recalculate?force=true
```

This regenerates all embeddings with the new strategy.
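
For example, from code (the base URL is hypothetical; substitute your deployment's origin):

```ts
const res = await fetch(
  "https://your-deployment.example.com/cache/recalculate?force=true",
);
console.log(await res.text());
```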

## Contributing

To add a new search strategy:

1. Create `search/my-strategy.ts`
2. Implement the `SearchStrategy` interface:

   ```ts
   export const searchStrategy: SearchStrategy = {
     name: "my-strategy",
     description: "Description...",
     search: async (query, pages, options) => {
       // Implementation
     },
   };

   export const generateEmbeddings = async (content: string) => {
     // Generate embeddings
   };
   ```

3. Document it in `STRATEGY-COMPARISON.md`
4. Add it to `index.ts` as an option

## License

Part of the groq-docs project.
