
# Local ONNX Models

This directory contains pre-downloaded ONNX models for local embedding generation.

## Setup Instructions

### 1. Download the all-MiniLM-L6-v2 Model

Download the ONNX model files from Hugging Face:

```bash
# Create the model directory
mkdir -p all-MiniLM-L6-v2

# Download the ONNX model files.
# You can download these manually from:
#   https://huggingface.co/Xenova/all-MiniLM-L6-v2/tree/main/onnx
#
# Required files:
# - model.onnx
# - tokenizer.json
# - tokenizer_config.json
# - config.json
# - special_tokens_map.json (optional)
```

Or use this script to download automatically:

```bash
cd models
git clone https://huggingface.co/Xenova/all-MiniLM-L6-v2

# Clean up the .git directory to save space
rm -rf all-MiniLM-L6-v2/.git
```
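If you'd rather not clone the full repo, the same files can be fetched individually. The sketch below is not part of this repo; it assumes Hugging Face's standard `resolve/main` raw-file download URLs, and the filename `download-model.ts` is hypothetical:

```typescript
// download-model.ts - fetch only the files listed in step 1.
// Uses node:fs, which both Deno and Node.js support.
import { mkdirSync, writeFileSync } from "node:fs";

const BASE = "https://huggingface.co/Xenova/all-MiniLM-L6-v2/resolve/main";

const FILES = [
  "onnx/model.onnx",
  "tokenizer.json",
  "tokenizer_config.json",
  "config.json",
];

// Direct-download URL for one file within the Hugging Face repo.
export function fileUrl(path: string): string {
  return `${BASE}/${path}`;
}

// Fetch each required file into all-MiniLM-L6-v2/ (call from models/).
export async function downloadModel(): Promise<void> {
  for (const f of FILES) {
    const res = await fetch(fileUrl(f));
    if (!res.ok) throw new Error(`fetch ${f} failed: ${res.status}`);
    const dest = `all-MiniLM-L6-v2/${f}`;
    // Create the parent directory (e.g. all-MiniLM-L6-v2/onnx) if needed.
    mkdirSync(dest.slice(0, dest.lastIndexOf("/")), { recursive: true });
    writeFileSync(dest, new Uint8Array(await res.arrayBuffer()));
    console.log("downloaded", dest);
  }
}
```

With Deno you would call `downloadModel()` in a script run with `--allow-net --allow-write`.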

### 2. Verify the Files

Your directory structure should look like:

```
models/
  all-MiniLM-L6-v2/
    onnx/
      model.onnx           # ~23MB - the main ONNX model
      model_quantized.onnx # ~6MB - quantized version (optional, faster)
    tokenizer.json         # Tokenizer configuration
    tokenizer_config.json  # Additional tokenizer settings
    config.json            # Model configuration
```
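A quick way to verify the layout is a small script (a sketch, not part of this repo; the file list mirrors the tree above, and the `exists` parameter is injectable so the helper is easy to test):

```typescript
// check-models.ts - report which expected model files are absent.
// Uses node:fs, which both Deno and Node.js support.
import { existsSync } from "node:fs";

// Expected files, mirroring the directory tree above.
const REQUIRED = [
  "all-MiniLM-L6-v2/onnx/model.onnx",
  "all-MiniLM-L6-v2/tokenizer.json",
  "all-MiniLM-L6-v2/tokenizer_config.json",
  "all-MiniLM-L6-v2/config.json",
];

// Return the subset of `paths` for which `exists` reports false.
export function missingFiles(
  paths: string[],
  exists: (path: string) => boolean = existsSync,
): string[] {
  return paths.filter((p) => !exists(p));
}

// Run from the models/ directory:
const missing = missingFiles(REQUIRED);
if (missing.length > 0) {
  console.error("Missing model files:", missing);
} else {
  console.log("All model files present.");
}
```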

### 3. Switch to Local Model Strategy

In `search/index.ts`, uncomment the local model strategy:

```ts
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```

## Model Details

- **Model:** all-MiniLM-L6-v2 (Sentence Transformers)
- **Dimensions:** 384
- **Size:** ~23MB (full) or ~6MB (quantized)
- **Speed:** ~10-30ms per embedding (after initial load)
- **Advantage:** No network calls, works offline, completely local
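The 384-dimensional vectors this model produces are typically compared with cosine similarity. A minimal helper (a sketch, not part of this repo):

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns a value in [-1, 1]; higher means more semantically similar.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Search then reduces to embedding the query once and ranking documents by their similarity score.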

## Notes

- The model files are binary and should be kept in this directory
- The quantized model (`model_quantized.onnx`) is smaller and faster but slightly less accurate
- This strategy requires the model files to be present before running
- For deployment, ensure these files are included in your build/deployment package

About the "mutex lock failed" Error

You may see this error when running test scripts:

```
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
```

This is harmless and expected! Here's why:

- The error happens in the ONNX runtime's C++ cleanup code when the process exits
- It occurs after all work is completed successfully
- It does not affect functionality, data, or results
- It only appears when scripts exit immediately
- Long-running servers (like your main app) typically don't exit, so you won't see this in production

**What to do:** Just ignore the error message. All tests pass successfully before it appears.
