Search

174 results found for embeddings (2206ms)

Code (165)

// Mixedbread Stores Strategy: Managed AI Search Service
// Uses Mixedbread's Store API for document storage and semantic search
// No local embeddings needed - Mixedbread handles everything
import type { SearchStrategy, SearchResult, Page, SearchOptions } from "./types.ts";
// This function is required for compatibility with the recalculation system
// For Mixedbread managed Store, embeddings are handled by Mixedbread internally
// Return a dummy array to satisfy the recalculation script
// The actual embeddings are generated and stored by Mixedbread when documents are uploaded
export const generateEmbeddings = async (_content: string): Promise<number[] | null> => {
  // Return a dummy embedding array to satisfy recalculation
  // The recalculation script requires this, but for Mixedbread Store,
  // you should use `deno task recalc-mxbai` instead, which uploads docs to Mixedbread
  return [0]; // Dummy value - actual embeddings are handled by Mixedbread Store
};
export const searchStrategy: SearchStrategy = {
  name: "mixedbread",
  description: "Managed AI search using Mixedbread Stores (handles storage, embeddings, and search)",
  search: async (query: string, _pages: Page[], options: SearchOptions = {}): Promise<SearchResult[]> => {
    const limit = options.limit || 10;
```typescript
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
| Strategy | Speed | Cost | Setup | Best for |
| --- | --- | --- | --- | --- |
| **transformers-local-onnx** ⭐ | ~60-80ms | Free | Download model | Production |
| **transformers-cosine** | ~160-180ms | Free | None (auto-download) | Development |
| **mixedbread-embeddings** | ~50-100ms | Free tier | API key | High accuracy |
| **openai-cosine** | ~100-200ms | Paid | API key | Reliability |
| **hf-inference-qwen3** | ~150-300ms | Free tier | API key | Best accuracy |
- **`transformers-local-onnx.ts`** - Local ONNX models (fastest, recommended)
- **`transformers-cosine.ts`** - Auto-download ONNX models
- **`mixedbread-embeddings-cosine.ts`** - Mixedbread API + local cosine
- **`openai-cosine.ts`** - OpenAI embeddings + local cosine
- **`hf-inference-qwen3-cosine.ts`** - HuggingFace Qwen3-8B embeddings
- **`cloudflare-bge-cosine.ts`** - Cloudflare Workers AI
- **`jigsawstack-orama.ts`** - JigsawStack managed search
- **`mixedbread.ts`** - Mixedbread Stores (managed)
- **`placeholder.ts`** - Fake embeddings for testing
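
All of these files export the same two symbols, so `search/index.ts` can swap strategies by changing a single import. Below is a rough sketch of what the shared contract in `types.ts` likely looks like; only the type names are confirmed by the imports in the strategy files, the individual fields are assumptions.

```typescript
// Hypothetical shape of search/types.ts; fields are illustrative assumptions.
export interface Page {
  title: string;
  content: string;
  embeddings?: number[] | null; // pre-computed during recalculation
}

export interface SearchResult {
  title: string;
  content: string;
  score: number; // relevance, higher is better
}

export interface SearchOptions {
  limit?: number; // strategies shown here default to 10
}

export interface SearchStrategy {
  name: string;
  description: string;
  search: (query: string, pages: Page[], options?: SearchOptions) => Promise<SearchResult[]>;
}
```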
## Documentation
**Returns**: Array of search results sorted by relevance
### Generate Embeddings
```typescript
async function generateEmbeddings(
  content: string
): Promise<number[] | null>
```
1. Generate Query Embedding (10-30ms)
2. Compare with Page Embeddings (cosine similarity, <1ms per page)
3. Sort by Similarity
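
The whole flow is small enough to sketch. This is a hedged illustration assuming pages carry pre-computed `embeddings` vectors, not the exact code in the strategy files:

```typescript
import type { Page } from "./types.ts";
import { generateEmbeddings } from "./transformers-local-onnx.ts";

// Plain cosine similarity between two equal-length vectors.
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

// 1. embed the query, 2. score every page, 3. sort and trim.
export const searchPages = async (query: string, pages: Page[], limit = 10) => {
  const queryEmbedding = await generateEmbeddings(query); // ~10-30ms
  if (!queryEmbedding) return [];
  return pages
    .filter((page) => Array.isArray(page.embeddings))
    .map((page) => ({
      title: page.title,
      content: page.content,
      score: cosineSimilarity(queryEmbedding, page.embeddings as number[]), // <1ms per page
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
};
```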
1. **Use local ONNX models** for production (fastest, most reliable)
2. **Pre-calculate embeddings** during recalculation (don't generate at query time)
3. **Cache the pipeline** (automatically done, but worth noting)
4. **Use quantized models** if memory is constrained (set `USE_QUANTIZED = true`)
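
Points 3 and 4 are cheap to implement. Here is a minimal sketch assuming the `@xenova/transformers` package and the `Xenova/all-MiniLM-L6-v2` model referenced elsewhere in these results; the caching variable and error handling are illustrative, not the repo's exact code.

```typescript
import { pipeline } from "npm:@xenova/transformers";

const USE_QUANTIZED = true; // smaller ONNX weights, slightly lower accuracy

// Cache the feature-extraction pipeline at module scope so the model is
// loaded once per process, not once per query.
let extractorPromise: Promise<any> | null = null;
const getExtractor = () => {
  extractorPromise ??= pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2", {
    quantized: USE_QUANTIZED,
  });
  return extractorPromise;
};

export const generateEmbeddings = async (content: string): Promise<number[] | null> => {
  try {
    const extractor = await getExtractor();
    const output = await extractor(content, { pooling: "mean", normalize: true });
    return Array.from(output.data as Float32Array);
  } catch (err) {
    console.error("Embedding generation failed:", err);
    return null;
  }
};
```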
Check the import in `search/index.ts`:
```typescript
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
This regenerates all embeddings with the new strategy.
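
What the recalculation step boils down to, as a hypothetical sketch (the task name and page storage are not shown in these results, so the file path here is an assumption):

```typescript
import { generateEmbeddings } from "./transformers-local-onnx.ts";

// Re-embed every page with the currently imported strategy so stored vectors
// and query-time vectors come from the same model.
const pages = JSON.parse(await Deno.readTextFile("./data/pages.json"));

for (const page of pages) {
  page.embeddings = await generateEmbeddings(page.content);
}

await Deno.writeTextFile("./data/pages.json", JSON.stringify(pages, null, 2));
console.log(`Recalculated embeddings for ${pages.length} pages`);
```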
## Contributing
};
export const generateEmbeddings = async (content: string) => {
  // Generate embeddings
};
```
You can then use the model to compute embeddings like this:
```js
import { pipeline } from '@xenova/transformers';

const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Compute sentence embeddings
const sentences = ['This is an example sentence', 'Each sentence is converted'];
const output = await extractor(sentences, { pooling: 'mean', normalize: true });
```
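
The result is a `[2, 384]` Tensor; to store or compare the vectors you usually want plain arrays. A small continuation of the snippet above, using the Tensor helpers transformers.js documents:

```js
// Convert the Tensor to nested JavaScript arrays: one 384-dim vector per sentence.
const vectors = output.tolist();
console.log(vectors.length, vectors[0].length); // 2 384
```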
"intermediate_size": 1536,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
echo "Next steps:"
echo "1. Update search/index.ts to use the local ONNX strategy:"
echo " import { searchStrategy, generateEmbeddings } from \"./transformers-local-onnx.ts\"
echo ""
echo "2. Run your application - the model will load from local files!"
```typescript
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
// Run with: deno run --allow-read --allow-env --allow-net --allow-ffi test-server-mode.ts
import { searchStrategy, generateEmbeddings } from "../transformers-local-onnx.ts";
console.log("🧪 Testing ONNX in Server Mode (long-running process)\n");
const start = performance.now();
const embedding = await generateEmbeddings(query);
const elapsed = performance.now() - start;
// Run with: deno run --allow-read --allow-env --allow-net test-local-model.ts
import { searchStrategy, generateEmbeddings } from "../transformers-local-onnx.ts";
console.log("🧪 Testing Local ONNX Model Strategy\n");
// Test 1: Generate embeddings for a simple query
console.log("Test 1: Generate embeddings for a query");
console.log("Query: 'What is Groq?'\n");
const start = performance.now();
const embeddings = await generateEmbeddings("What is Groq?");
const elapsed = performance.now() - start;
if (embeddings) {
  console.log(`✅ Generated embeddings successfully!`);
  console.log(`   Dimensions: ${embeddings.length}`);
  console.log(`   First 5 values: [${embeddings.slice(0, 5).map(v => v.toFixed(4)).join(", ")}...]`);
  console.log(`   Time: ${elapsed.toFixed(2)}ms`);
} else {
  console.log(`❌ Failed to generate embeddings`);
}
console.log("\n" + "=".repeat(60) + "\n");
// Test 2: Generate embeddings for multiple queries (to test caching)
console.log("Test 2: Generate embeddings for multiple queries (testing cache)");
const queries = [
for (const query of queries) {
  const queryStart = performance.now();
  const queryEmbedding = await generateEmbeddings(query);
  const queryElapsed = performance.now() - queryStart;
title: "Introduction to Groq",
content: "Groq is a fast AI inference platform that provides APIs for various language model
embeddings: await generateEmbeddings("Groq is a fast AI inference platform that provides API
},
{
title: "API Keys",
content: "Learn how to create and manage your Groq API keys for authentication.",
embeddings: await generateEmbeddings("Learn how to create and manage your Groq API keys for
},
{
title: "Available Models",
content: "Groq supports various language models including Llama, Mixtral, and Gemma.",
embeddings: await generateEmbeddings("Groq supports various language models including Llama,
},
];
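
From here the test can exercise the full strategy. A hedged continuation, assuming the array above is assigned to a `pages` variable and that results expose a `score` field:

```typescript
const results = await searchStrategy.search("How do I authenticate with Groq?", pages, { limit: 2 });

for (const result of results) {
  // Expect the "API Keys" page to rank first for this query.
  console.log(`${result.score.toFixed(3)}  ${result.title}`);
}
```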
```typescript
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
```typescript
// import { searchStrategy, generateEmbeddings } from "./transformers-cosine.ts";
```
```typescript
// Comment out the current strategy
// import { searchStrategy, generateEmbeddings } from "./transformers-cosine.ts";
// Uncomment the local ONNX strategy
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
total model load: 239ms
✅ Generated embeddings successfully!
Dimensions: 384
First 5 values: [-0.0457, -0.0109, -0.0935, ...]
Time: 247ms
Test 2: Generate embeddings for multiple queries (testing cache)
✅ "How to use Groq API?"
Time: 3.87ms (cached pipeline)
**Fix**: Check `search/index.ts` has the correct import:
```typescript
import { searchStrategy, generateEmbeddings } from "./transformers-local-onnx.ts";
```
- Slightly less accurate: ~57 vs ~58 MTEB score
### Optional: Recalculate Embeddings
If you were using a different strategy before, regenerate embeddings:
```bash
```
This ensures all page embeddings use the same model.
## Performance Comparison
- tmcw/surprisingEmbeddings: Visualizing embedding distances (Public)
- maxm/emojiVectorEmbeddings (Public)
- janpaul123/blogPostEmbeddingsDimensionalityReduction (Public)
- janpaul123/compareEmbeddings (Public)
- yawnxyz/embeddingsSearchExample (Public)

Users

No users found

Docs

No docs found