A Hono API server that fetches, caches, and processes Groq documentation pages with token counting and AI-generated metadata.
- Fetches documentation pages from Groq's console
 - Caches page content, metadata, token counts, and embeddings in SQLite
 - Token counting using tiktoken (GPT-4 encoding)
 - AI-generated metadata (categories, tags, use cases, sample questions)
 - Content embeddings generation (currently fake, ready for Groq API integration)
 - Hash-based change detection to skip unchanged pages during recalculation
 - Rate limiting with async-sema to avoid WAF blocking
 - RESTful API endpoints for accessing pages and managing cache
 - Modular code structure (utils.ts for utilities, groq.ts for Groq API functions)
 
On first run, the cache will be empty. You should populate it by running:
GET /cache/recalculate
This will:
- Fetch all pages from the URLs list
 - Calculate token counts for each page
 - Generate AI metadata (categories, tags, use cases, questions)
 - Generate embeddings for each page
 - Calculate content hashes for change detection
 - Store everything in the SQLite cache
 - Return a summary of what was cached
 
Important: This will take some time as it processes all pages, generates metadata, and calculates tokens for each. Be patient!
Note: On subsequent runs, unchanged pages (detected by content hash) will be automatically skipped unless you use force mode.
Check that the cache was populated:
GET /cache/stats
This returns:
{ "cachedPages": 121, "totalTokens": 1234567 }
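For example, a small client script can trigger population and then verify the stats. This is a sketch; the base URL is a placeholder for wherever you deploy the server.

// Sketch: populate the cache, then verify it was filled.
const BASE_URL = "https://your-deployment.example.com"; // placeholder

const recalc = await fetch(`${BASE_URL}/cache/recalculate`);
console.log(await recalc.json()); // summary of processed/skipped pages

const stats = await fetch(`${BASE_URL}/cache/stats`);
console.log(await stats.json()); // e.g. { cachedPages: 121, totalTokens: 1234567 }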
You should run /cache/recalculate in these scenarios:
- First time setup - Cache is empty
 - URL list changes - You've added or removed URLs from the `urls` array
 - Content updates - Documentation pages have been updated and you want fresh data
 - Token count needed - You need accurate token counts for new content
 - Metadata refresh - You want to regenerate AI metadata or embeddings
 
By default, /cache/recalculate uses hash-based change detection:
GET /cache/recalculate
Behavior:
- Fetches each page and calculates its content hash (SHA-256)
 - Compares hash with cached version
 - Skips pages with unchanged content (saves time and API calls)
 - Only processes pages that have changed
 - Still generates embeddings and metadata for changed pages
 
Response includes:
- `processed` - Number of pages actually processed
- `skipped` - Number of pages skipped (unchanged)
- `force` - Always `false` in default mode
To force recalculation of all pages (ignoring hash checks):
GET /cache/recalculate?force=true
Use cases:
- Regenerating all metadata/embeddings even if content unchanged
 - After updating metadata generation prompts
 - When you want to ensure everything is fresh
 
For single page updates, you can use:
GET /cache/clear/:path
This clears the cache for a specific page. The next time that page is requested via /page/:path, it will be fetched fresh and recached.
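As a sketch (base URL again a placeholder), a single-page refresh looks like this:

// Clear one page's cache entry, then re-request it so it is fetched fresh and recached.
const BASE_URL = "https://your-deployment.example.com"; // placeholder
await fetch(`${BASE_URL}/cache/clear/api-reference`);
const page = await fetch(`${BASE_URL}/page/api-reference`);
console.log((await page.json()).tokenCount);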
- Weekly: Run recalculate (default mode) to catch any documentation updates efficiently
 - After major docs changes: Use force mode to regenerate everything
 - When adding new pages: Update the `urls` array, then run recalculate
Get the root docs page (cached if available).
Get a specific page by path. Examples:
/page/api-reference
/page/agentic-tooling/compound-beta
/page/model/llama-3.1-8b-instant
Response includes:
- `url` - The source URL
- `content` - Full page content with frontmatter
- `charCount` - Character count
- `tokenCount` - Token count (calculated with tiktoken)
- All frontmatter fields flattened (title, description, image, etc.)
 
Caching: Responses are cached. First request fetches and caches, subsequent requests are instant.
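For TypeScript consumers, the response shape can be modeled roughly as below. This is a sketch based on the fields listed above; because frontmatter fields are flattened into the object, additional keys may appear.

// Rough shape of a /page/:path response; frontmatter fields (title, description, image, ...)
// are merged into the top level, so the index signature covers them.
interface PageResponse {
  url: string;
  content: string;
  charCount: number;
  tokenCount: number;
  title?: string;
  description?: string;
  [frontmatterField: string]: unknown;
}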
Get a list of all available page paths.
Response:
[ "docs", "agentic-tooling", "api-reference", ... ]
Search pages by query string.
Query Parameters:
- `q` (required) - Search query string
- `limit` (optional) - Maximum number of results (default: 10)
- `minScore` (optional) - Minimum score threshold (default: 0)
Example:
GET /search?q=authentication&limit=5
Response:
{
  "query": "authentication",
  "results": [
    {
      "path": "api-reference",
      "url": "https://console.groq.com/docs/api-reference.md",
      "title": "API Reference",
      "score": 45,
      "snippet": "...authentication tokens are required for all API requests..."
    },
    {
      "path": "quickstart",
      "url": "https://console.groq.com/docs/quickstart.md",
      "title": "Quick Start",
      "score": 32,
      "snippet": "...get your API key for authentication..."
    }
  ],
  "totalResults": 2,
  "totalPages": 121
}
Search Features:
- Keyword matching in titles and content
 - Metadata boost (tags, categories, use cases)
 - Score-based ranking
 - Content snippets around matches
 - Uses cached pages when available for faster results
 
Note: Currently uses keyword-based search. Future versions will use embeddings for semantic search.
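The scoring approach can be pictured with a sketch like the following. It is illustrative only; the actual weights and matching logic in search.ts may differ.

// Illustrative keyword scoring: title matches weigh most, metadata (tags/categories) adds a
// boost, and raw content matches contribute the rest, capped so long pages don't dominate.
function scorePage(
  query: string,
  page: { title: string; content: string; tags: string[]; categories: string[] },
): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  let score = 0;
  for (const term of terms) {
    if (page.title.toLowerCase().includes(term)) score += 10;
    if (page.tags.some((t) => t.toLowerCase().includes(term))) score += 5;
    if (page.categories.some((c) => c.toLowerCase().includes(term))) score += 5;
    const contentMatches = page.content.toLowerCase().split(term).length - 1;
    score += Math.min(contentMatches, 20);
  }
  return score;
}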
Get metadata for all pages (does not use cache - fetches fresh).
Response:
{
  "pages": [
    { "url": "...", "charCount": 1234, "frontmatter": {...} }
  ],
  "contents": [...],
  "totalPages": 121,
  "totalChars": 1234567
}
Get cache statistics.
Response:
{ "cachedPages": 121, "totalTokens": 1234567 }
Clear the entire cache.
Response:
{ "message": "Cache cleared", "success": true }
Clear cache for a specific page.
Example:
GET /cache/clear/api-reference
Response:
{ "message": "Cache cleared for api-reference", "success": true }
Recalculate pages with AI metadata and embeddings generation.
Query Parameters:
`force` (optional): Set to `true` to force recalculation of all pages, ignoring hash checks
Default Mode (no query params):
GET /cache/recalculate
Force Mode:
GET /cache/recalculate?force=true
Response (Default Mode):
{
  "message": "Recalculated 5 pages, skipped 116 unchanged pages",
  "results": [
    {
      "path": "api-reference",
      "url": "https://console.groq.com/docs/api-reference.md",
      "charCount": 1234,
      "tokenCount": 567,
      "title": "API Reference",
      "metadata": {
        "categories": ["API", "Reference"],
        "tags": ["api", "endpoints", "rest"],
        "useCases": ["Integrating with Groq API"],
        "questions": ["How do I authenticate?", "What endpoints are available?"]
      }
    },
    {
      "path": "docs",
      "skipped": true,
      "reason": "Content unchanged (hash matches)"
    }
  ],
  "totalPages": 121,
  "processed": 5,
  "skipped": 116,
  "withMetadata": 5,
  "withoutMetadata": 0,
  "cached": true,
  "force": false
}
Response (Force Mode):
{
  "message": "Recalculated 121 pages with AI metadata (force mode)",
  "results": [...],
  "totalPages": 121,
  "processed": 121,
  "skipped": 0,
  "force": true
}
What it does:
- Fetches all pages (or skips unchanged ones in default mode)
 - Calculates token counts
 - Generates AI metadata (categories, tags, use cases, questions)
 - Generates embeddings (currently fake, ready for Groq API)
 - Calculates content hashes for change detection
 - Stores everything in cache
 
Important: This can take several minutes depending on:
- Number of pages to process (skipped pages are fast)
 - Network speed
 - Token calculation time
 - AI metadata generation time (uses Groq API)
 
First Request:
- Check cache → Not found
- Fetch from URL
- Calculate tokens
- Store in cache
- Return data

Subsequent Requests:
- Check cache → Found
- Return cached data immediately
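In code, this is the classic cache-aside pattern. The sketch below uses the helper names mentioned later in this README (getFromCache, setCache, getTextFromUrl, calculateTokenCount); their signatures here are assumptions.

// Helper signatures are assumptions; the real ones live in utils.ts.
declare function getFromCache(url: string): Promise<CacheEntry | null>;
declare function setCache(url: string, entry: CacheEntry): Promise<void>;
declare function getTextFromUrl(url: string): Promise<string>;
declare function calculateTokenCount(text: string): number;

interface CacheEntry {
  url: string;
  content: string;
  charCount: number;
  tokenCount: number;
  cachedAt: number;
}

// Cache-aside: return the cached entry if present; otherwise fetch, compute, store, return.
async function getPage(url: string): Promise<CacheEntry> {
  const cached = await getFromCache(url);
  if (cached) return cached; // subsequent requests: instant

  const content = await getTextFromUrl(url);       // rate-limited fetch of the .md page
  const tokenCount = calculateTokenCount(content); // tiktoken-based count
  const entry: CacheEntry = {
    url,
    content,
    charCount: content.length,
    tokenCount,
    cachedAt: Date.now(),
  };
  await setCache(url, entry);
  return entry;
}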
 
 
Cache is stored in SQLite with the following schema:
CREATE TABLE groq_docs_cache_v3 (
  url TEXT PRIMARY KEY,
  content TEXT NOT NULL,
  charCount INTEGER NOT NULL,
  tokenCount INTEGER,
  frontmatter TEXT NOT NULL,
  metadata TEXT,
  contentHash TEXT,
  embeddings TEXT,
  cachedAt INTEGER NOT NULL
)
Fields:
- `url` - Source URL (primary key)
- `content` - Full page content with frontmatter
- `charCount` - Character count
- `tokenCount` - Token count (calculated with tiktoken)
- `frontmatter` - Parsed frontmatter (JSON)
- `metadata` - AI-generated metadata (categories, tags, use cases, questions)
- `contentHash` - SHA-256 hash of content (for change detection)
- `embeddings` - Content embeddings vector (JSON array)
- `cachedAt` - Timestamp when cached
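A sketch of a lookup and upsert against this table, assuming Val Town's std sqlite client with an execute({ sql, args }) method; the import path and result shape are assumptions.

import { sqlite } from "https://esm.town/v/std/sqlite"; // assumed import path

// Look up a cached page by URL.
async function lookupCached(url: string) {
  const result = await sqlite.execute({
    sql: "SELECT content, tokenCount FROM groq_docs_cache_v3 WHERE url = ?",
    args: [url],
  });
  return result.rows.length > 0 ? result.rows[0] : null;
}

// Insert or update a cache row (upsert keyed on the url primary key).
async function upsertCached(url: string, content: string, tokenCount: number, frontmatter: unknown) {
  await sqlite.execute({
    sql: `INSERT INTO groq_docs_cache_v3 (url, content, charCount, tokenCount, frontmatter, cachedAt)
          VALUES (?, ?, ?, ?, ?, ?)
          ON CONFLICT(url) DO UPDATE SET
            content = excluded.content,
            charCount = excluded.charCount,
            tokenCount = excluded.tokenCount,
            frontmatter = excluded.frontmatter,
            cachedAt = excluded.cachedAt`,
    args: [url, content, content.length, tokenCount, JSON.stringify(frontmatter), Date.now()],
  });
}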
Cache is invalidated when:
- You manually clear it via /cache/clear
- You recalculate via /cache/recalculate
- Cache is cleared for a specific page via /cache/clear/:path
Note: Cache does NOT automatically expire. If documentation changes, you must manually recalculate.
1. Add the URL to the `urls` array in main.tsx:
   const urls = [
     // ... existing URLs
     "https://console.groq.com/docs/new-page.md",
   ];
2. Run recalculate:
   GET /cache/recalculate
3. Verify:
   GET /cache/stats
   GET /list   # Should include your new page
Token counts are calculated using tiktoken with the gpt-4 encoding (cl100k_base). This is the same encoding used by:
- GPT-4
 - GPT-3.5-turbo
 - Many other OpenAI models
 
Token counts are:
- Calculated on first fetch
 - Stored in cache
 - Returned in API responses
 - Expensive to compute (which is why caching is important)
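A minimal counting sketch, assuming the npm tiktoken package (the exact import the project uses may differ):

import { get_encoding } from "npm:tiktoken"; // WASM-backed tokenizer

// Count tokens with the cl100k_base encoding (the GPT-4 / GPT-3.5-turbo encoding).
function countTokens(text: string): number {
  const encoding = get_encoding("cl100k_base");
  try {
    return encoding.encode(text).length;
  } finally {
    encoding.free(); // release the WASM-side encoder
  }
}

console.log(countTokens("Hello from the Groq docs cache"));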
 
Each page can have AI-generated metadata using Groq's chat completions API:
- Categories: 2-4 broad categories (e.g., "API", "Authentication", "Models")
 - Tags: 5-10 specific tags/keywords
 - Use Cases: 2-4 practical use cases or scenarios
 - Questions: 5-10 questions users might ask
 
Metadata is generated during /cache/recalculate and stored in the cache.
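A sketch of what that generation step might look like; the prompt, model choice, and response handling here are assumptions, and the real logic lives in groq.ts.

// Ask Groq's OpenAI-compatible chat completions endpoint for page metadata as JSON.
async function generateMetadataSketch(pageContent: string) {
  const res = await fetch("https://api.groq.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${Deno.env.get("GROQ_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "llama-3.1-8b-instant", // assumed model choice
      messages: [
        {
          role: "user",
          content:
            "Return JSON with keys categories, tags, useCases, questions for this documentation page:\n\n" +
            pageContent.slice(0, 8000), // keep the prompt a modest size
        },
      ],
      temperature: 0.2,
    }),
  });
  const data = await res.json();
  return JSON.parse(data.choices[0].message.content); // assumes the model returned valid JSON
}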
The API includes a search endpoint (/search) that allows you to search across all documentation pages.
Currently uses keyword matching:
- Searches in page titles and content
 - Boosts results matching metadata (tags, categories, use cases)
 - Returns ranked results with relevance scores
 - Includes content snippets around matches
 
The search system is designed to support embeddings-based semantic search:
- `generateEmbeddings()` - Generates embeddings (currently fake, ready for real API)
- `vectorSearch()` - Vector similarity search function (ready to use when embeddings are real)
- Will enable semantic understanding of queries (not just keyword matching)
 
Content embeddings are generated for each page. Currently using a fake implementation (deterministic 384-dimensional vectors) that's ready to be replaced with actual embeddings API when available.
Embeddings are:
- Generated during recalculation
 - Stored in cache
 - Will be used for semantic search and similarity matching (currently using keyword search)
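One way a deterministic fake embedding (and the cosine similarity that vectorSearch would rely on) might be sketched; this mirrors the idea described above, not the exact implementation.

// Deterministic 384-dimensional "embedding": seed a simple PRNG from the text so the same
// content always yields the same vector. Placeholder until a real embeddings API is wired in.
function fakeEmbedding(text: string, dims = 384): number[] {
  let seed = 0;
  for (let i = 0; i < text.length; i++) seed = (seed * 31 + text.charCodeAt(i)) >>> 0;
  const vec: number[] = [];
  for (let i = 0; i < dims; i++) {
    seed = (seed * 1664525 + 1013904223) >>> 0; // LCG step
    vec.push(seed / 0xffffffff);
  }
  return vec;
}

// Cosine similarity between two vectors, for embeddings-based ranking later on.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}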
 
Content hashes (SHA-256) are calculated and stored for each page. This enables:
- Smart recalculation: Skip unchanged pages automatically
 - Efficient updates: Only process pages that have actually changed
 - Performance: Significantly faster recalculation when most content is unchanged
 
Hashes are compared during /cache/recalculate (default mode) to determine if a page needs reprocessing.
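A sketch of the hash-and-compare step using the Web Crypto API available in Deno; the cached-hash argument stands in for the real cache lookup.

// Hash page content with SHA-256 and compare against the stored hash to decide whether to skip.
async function contentHash(content: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(content));
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function shouldReprocess(content: string, cachedHash: string | null, force: boolean) {
  if (force) return true; // force mode ignores hash checks
  const hash = await contentHash(content);
  return cachedHash === null || hash !== cachedHash; // skip only when hashes match
}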
Run /cache/recalculate to refresh everything.
- Check /list to see if the path exists
- Verify the URL is in the `urls` array
- Ensure the path matches the URL structure (e.g., api-reference for /docs/api-reference.md)
- Clear cache for that page: GET /cache/clear/:path
- Request the page again: GET /page/:path
- Or recalculate everything: GET /cache/recalculate
- Use /page/:path endpoints (cached) instead of /data (uncached)
- Check cache stats: GET /cache/stats
- Ensure cache is populated before production use
 
The codebase is organized into modular files:
- `main.tsx` - Main Hono app, routes, and URL definitions
- `utils.ts` - Utility functions:
  - Cache management (getFromCache, setCache, clearCache, getCacheStats)
  - Content fetching (getTextFromUrl)
  - Frontmatter parsing (parseFrontmatter, addUrlSourceToFrontmatter)
  - Token counting (calculateTokenCount)
  - Hash calculation (calculateContentHash)
  - Rate limiting for fetches
 
- `groq.ts` - Groq API functions:
  - Chat completions (groqChatCompletion)
  - Metadata generation (generatePageMetadata)
 
- `search.ts` - Search and embeddings utilities:
  - Embeddings generation (generateEmbeddings) - fake implementation ready for real API
  - Search functions (searchPages) - keyword-based search (will use embeddings later)
  - Vector similarity search (vectorSearch) - ready for embeddings-based search
 
deno run --allow-net --allow-env main.tsx
Note: SQLite caching is automatically disabled when running locally (detected via valtown environment variable). The app will work without caching, but cache-related endpoints will return appropriate messages.
The app is configured to work with Val Town. Export uses:
export default (typeof Deno !== "undefined" && Deno.env.get("valtown")) ? app.fetch : app;
SQLite caching is automatically enabled when running in Val Town (detected via valtown environment variable).
- `GROQ_API_KEY` - Optional; required for AI metadata generation (metadata generation is disabled if not set)
- `valtown` - Automatically set by Val Town (used to detect the environment)
- Use default recalculate mode - Automatically skips unchanged pages
 - Cache is your friend - Always populate cache before production use
 - Rate limiting - Built-in rate limiting prevents WAF blocking (1 request per 3 seconds for docs, 2 requests per second for Groq API)
 - Hash checking - Default recalculation mode is much faster when most content is unchanged
 
