Groq Docs API

A Hono API server that fetches, caches, and processes Groq documentation pages with token counting.

Features

Fetches documentation pages from Groq's console
Caches page content, metadata, and token counts in SQLite
Counts tokens with tiktoken (GPT-4 encoding, cl100k_base)
RESTful API endpoints for accessing pages and managing cache

First-Time Setup

1. Initial Cache Population

On first run, the cache is empty. Populate it by calling:

POST /cache/recalculate

This will:

Fetch all pages from the URLs list
Calculate token counts for each page
Store everything in the SQLite cache
Return a summary of what was cached

Important: This can take several minutes, because it fetches all 121+ pages and calculates tokens for each.

2. Verify Cache

Check that the cache was populated:

GET /cache/stats

This returns:

{
  "cachedPages": 121,
  "totalTokens": 1234567
}
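The two setup steps can also be scripted from any HTTP client. The following is a minimal sketch, not part of the project: `BASE_URL`, `isPopulated`, and `populateAndVerify` are hypothetical names, and the stats shape is the one shown above.

```typescript
// Hypothetical base URL; replace with wherever the server is running.
const BASE_URL = "http://localhost:8000";

// Shape of the GET /cache/stats response shown above.
interface CacheStats {
  cachedPages: number;
  totalTokens: number;
}

// True when the cache holds at least `expectedPages` pages with counted tokens.
function isPopulated(stats: CacheStats, expectedPages: number): boolean {
  return stats.cachedPages >= expectedPages && stats.totalTokens > 0;
}

// Step 1: populate the cache. Step 2: read the stats back.
async function populateAndVerify(baseUrl: string = BASE_URL): Promise<CacheStats> {
  const recalc = await fetch(`${baseUrl}/cache/recalculate`, { method: "POST" });
  if (!recalc.ok) throw new Error(`recalculate failed: ${recalc.status}`);

  const res = await fetch(`${baseUrl}/cache/stats`);
  if (!res.ok) throw new Error(`stats failed: ${res.status}`);
  return (await res.json()) as CacheStats;
}
```

Calling `populateAndVerify()` and passing the result to `isPopulated(stats, 121)` gives a quick pass/fail check after first-time setup.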

When to Recalculate

You should run /cache/recalculate in these scenarios:

✅ Required Recalculations

First time setup - Cache is empty
URL list changes - You've added or removed URLs from the urls array
Content updates - Documentation pages have been updated and you want fresh data
Token count needed - You need accurate token counts for new content

⚠️ Partial Updates

For single page updates, you can use:

POST /cache/clear/:path

This clears the cache for a specific page. The next time that page is requested via /page/:path, it will be fetched fresh and recached.
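A single-page refresh is therefore two calls: clear, then request. A minimal client-side sketch, assuming a hypothetical base URL and helper names (`refreshUrls`, `refreshPage`) that are not part of the project:

```typescript
// Builds the two URLs involved in refreshing one page.
// `baseUrl` is a hypothetical deployment URL; `path` is a value from GET /list.
function refreshUrls(baseUrl: string, path: string): { clear: string; page: string } {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    clear: `${root}/cache/clear/${path}`,
    page: `${root}/page/${path}`,
  };
}

// Clears one cached page, then requests it so the server refetches and recaches it.
async function refreshPage(baseUrl: string, path: string): Promise<unknown> {
  const urls = refreshUrls(baseUrl, path);
  const cleared = await fetch(urls.clear, { method: "POST" });
  if (!cleared.ok) throw new Error(`clear failed: ${cleared.status}`);

  const page = await fetch(urls.page);
  if (!page.ok) throw new Error(`fetch failed: ${page.status}`);
  return page.json();
}
```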

🔄 Routine Maintenance

Weekly: Run recalculate to catch any documentation updates
After major docs changes: Clear and recalculate
When adding new pages: Update the urls array, then run recalculate

API Endpoints

Page Endpoints

`GET /page/docs`

Get the root docs page (cached if available).

`GET /page/:path`

Get a specific page by path. Examples:

/page/api-reference
/page/agentic-tooling/compound-beta
/page/model/llama-3.1-8b-instant

Response includes:

url - The source URL
content - Full page content with frontmatter
charCount - Character count
tokenCount - Token count (calculated with tiktoken)
All frontmatter fields flattened (title, description, image, etc.)

Caching: Responses are cached. The first request fetches the page and caches it; subsequent requests are served from the cache without refetching.
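For typed clients, the response fields listed above can be described roughly as follows. This is a sketch under assumptions: the type and guard names are hypothetical, and `tokenCount` is assumed to always be a number in the response.

```typescript
// The four documented fields; any other key is a flattened frontmatter field.
interface PageResponse {
  url: string;
  content: string;
  charCount: number;
  tokenCount: number;
  [frontmatterField: string]: unknown;
}

// Runtime check that a parsed JSON value has the four documented fields.
function isPageResponse(value: unknown): value is PageResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.url === "string" &&
    typeof v.content === "string" &&
    typeof v.charCount === "number" &&
    typeof v.tokenCount === "number"
  );
}
```

Frontmatter fields such as `title` or `description` stay typed as `unknown`, since which fields are present depends on the page.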

`GET /list`

Get a list of all available page paths.

Response:

[
  "docs",
  "agentic-tooling",
  "api-reference",
  ...
]

`GET /data`

Get metadata for all pages. This endpoint does not use the cache; it fetches every page fresh on each call.

Response:

{
  "pages": [
    {
      "url": "...",
      "charCount": 1234,
      "frontmatter": {...}
    }
  ],
  "contents": [...],
  "totalPages": 121,
  "totalChars": 1234567
}

Cache Management Endpoints

`GET /cache/stats`

Get cache statistics.

Response:

{
  "cachedPages": 121,
  "totalTokens": 1234567
}

`POST /cache/clear`

Clear the entire cache.

Response:

{
  "message": "Cache cleared"
}

`POST /cache/clear/:path`

Clear cache for a specific page.

Example:

POST /cache/clear/api-reference

Response:

{
  "message": "Cache cleared for api-reference"
}

`POST /cache/recalculate`

Recalculate all pages (bypasses cache, fetches fresh, recaches everything).

Response:

{
  "message": "Recalculated 121 pages",
  "results": [
    {
      "path": "api-reference",
      "url": "https://console.groq.com/docs/api-reference.md",
      "charCount": 1234,
      "tokenCount": 567
    },
    ...
  ]
}

Important: This can take several minutes depending on:

Number of pages (currently 121)
Network speed
Token calculation time
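The `results` array in the response is convenient for a quick sanity check after a recalculation. A minimal sketch, with a hypothetical `summarize` helper that totals tokens and lists the largest pages:

```typescript
// One entry of the `results` array in the POST /cache/recalculate response.
interface RecalcResult {
  path: string;
  url: string;
  charCount: number;
  tokenCount: number;
}

// Totals the token counts and returns the paths of the `topN` largest pages by tokens.
function summarize(results: RecalcResult[], topN = 3) {
  const totalTokens = results.reduce((sum, r) => sum + r.tokenCount, 0);
  const largest = [...results] // copy so the caller's array is not reordered
    .sort((a, b) => b.tokenCount - a.tokenCount)
    .slice(0, topN)
    .map((r) => r.path);
  return { pages: results.length, totalTokens, largest };
}
```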

Cache Behavior

How Caching Works

First Request:
- Check cache → Not found
- Fetch from URL
- Calculate tokens
- Store in cache
- Return data

Subsequent Requests:
- Check cache → Found
- Return cached data immediately
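The flow above is a read-through cache. The sketch below illustrates the pattern only; it is not the server's code. A `Map` stands in for the SQLite table, and `load` and `countTokens` are hypothetical callbacks for the fetch and tiktoken steps.

```typescript
interface CachedPage {
  content: string;
  charCount: number;
  tokenCount: number;
}

// In-memory stand-in for the SQLite table, keyed by URL.
const cache = new Map<string, CachedPage>();

// Read-through lookup: return the cached entry if present; otherwise load the
// content, count tokens, store the result, and return it.
async function getPage(
  url: string,
  load: (url: string) => Promise<string>, // hypothetical fetcher
  countTokens: (text: string) => number, // hypothetical tokenizer
): Promise<CachedPage> {
  const hit = cache.get(url);
  if (hit) return hit; // subsequent requests: no fetch, no token counting

  const content = await load(url);
  const page: CachedPage = {
    content,
    charCount: content.length, // UTF-16 code units, as reported by JS strings
    tokenCount: countTokens(content),
  };
  cache.set(url, page);
  return page;
}
```

Because the expensive steps run only on a miss, a second call for the same URL returns the stored object without invoking `load` or `countTokens`.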

Cache Storage

Cache is stored in SQLite with the following schema:

CREATE TABLE groq_docs_cache (
  url TEXT PRIMARY KEY,
  content TEXT NOT NULL,
  charCount INTEGER NOT NULL,
  tokenCount INTEGER,
  frontmatter TEXT NOT NULL,
  cachedAt INTEGER NOT NULL
);
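A row of this table can be modelled in TypeScript roughly as below. Two details are assumptions not stated by the schema: that `frontmatter` holds JSON text and that `cachedAt` is epoch milliseconds. The helper names are hypothetical.

```typescript
// Row shape matching the columns of groq_docs_cache.
interface CacheRow {
  url: string;
  content: string;
  charCount: number;
  tokenCount: number | null; // the column is nullable in the schema
  frontmatter: string; // assumed to be JSON text
  cachedAt: number; // assumed to be epoch milliseconds
}

// Builds a row from page data, serializing frontmatter to JSON.
function toRow(
  url: string,
  content: string,
  frontmatter: Record<string, unknown>,
  tokenCount: number | null,
  now: number = Date.now(),
): CacheRow {
  return {
    url,
    content,
    charCount: content.length,
    tokenCount,
    frontmatter: JSON.stringify(frontmatter),
    cachedAt: now,
  };
}

// Reverses the serialization done in toRow.
function parseFrontmatter(row: CacheRow): Record<string, unknown> {
  return JSON.parse(row.frontmatter) as Record<string, unknown>;
}
```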

Cache Invalidation

The cache is invalidated only when you:

Clear it entirely via POST /cache/clear
Recalculate via POST /cache/recalculate
Clear a specific page via POST /cache/clear/:path

Note: Cache does NOT automatically expire. If documentation changes, you must manually recalculate.
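Since nothing expires on its own, a client or scheduled job has to decide when a recalculation is due. A minimal age check, assuming `cachedAt` is epoch milliseconds (the schema stores an INTEGER but does not state the unit) and using the weekly cadence suggested under Routine Maintenance; `isStale` is a hypothetical helper:

```typescript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// True when an entry cached at `cachedAt` (assumed epoch milliseconds)
// is older than `maxAgeMs` at time `now`.
function isStale(
  cachedAt: number,
  maxAgeMs: number = WEEK_MS,
  now: number = Date.now(),
): boolean {
  return now - cachedAt > maxAgeMs;
}
```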

Adding New Pages

Add the URL to the urls array in main.tsx:

const urls = [
  // ... existing URLs
  "https://console.groq.com/docs/new-page.md",
];

Run recalculate:
```
POST /cache/recalculate
```

Verify:

GET /cache/stats
GET /list  # Should include your new page

Token Counting

Token counts are calculated using tiktoken with the gpt-4 encoding (cl100k_base). This is the same encoding used by:

GPT-4
GPT-3.5-turbo
Some other OpenAI models, such as text-embedding-ada-002

Token counts are:

Calculated on first fetch
Stored in cache
Returned in API responses
Expensive to compute (which is why caching is important)
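The two size fields stored per page can be computed along these lines. This is a sketch with an injected encoder so it runs without dependencies; in practice the encoder would come from tiktoken (for example `get_encoding("cl100k_base")` in the `tiktoken` npm package, though which package this server uses is an assumption). `Encoder`, `measure`, and `wordEncoder` are hypothetical names.

```typescript
// Minimal interface satisfied by tiktoken-style encoders: encode() returns
// one numeric id per token.
interface Encoder {
  encode(text: string): ArrayLike<number>;
}

// Computes the character and token counts for a page's content.
function measure(
  content: string,
  encoder: Encoder,
): { charCount: number; tokenCount: number } {
  return {
    charCount: content.length,
    tokenCount: encoder.encode(content).length,
  };
}

// Stand-in encoder: one "token" per whitespace-separated word.
// Real cl100k_base counts will differ.
const wordEncoder: Encoder = {
  encode: (text) => text.split(/\s+/).filter(Boolean).map((_, i) => i),
};
```

Swapping `wordEncoder` for a real cl100k_base encoder changes only the `tokenCount` values, not the shape of the result.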

Troubleshooting

Cache seems stale

Run /cache/recalculate to refresh everything.

Page not found

Check /list to see if the path exists
Verify the URL is in the urls array
Ensure the path matches the URL structure (e.g., api-reference for /docs/api-reference.md)
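The path-to-URL correspondence in the last item can be checked mechanically. The sketch below assumes every non-root page follows the `https://console.groq.com/docs/<path>.md` pattern seen in this README's examples; it does not cover the root `docs` page, and the helper names are hypothetical.

```typescript
// Assumed prefix and suffix, taken from the example URLs in this README.
const DOCS_PREFIX = "https://console.groq.com/docs/";
const SUFFIX = ".md";

// Returns the page path for a docs URL, or null if the URL does not match
// the assumed pattern.
function urlToPath(url: string): string | null {
  if (!url.startsWith(DOCS_PREFIX) || !url.endsWith(SUFFIX)) return null;
  const path = url.slice(DOCS_PREFIX.length, url.length - SUFFIX.length);
  return path.length > 0 ? path : null;
}

// Inverse mapping: builds the source URL for a page path.
function pathToUrl(path: string): string {
  return `${DOCS_PREFIX}${path}${SUFFIX}`;
}
```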

Token counts seem wrong

Clear cache for that page: POST /cache/clear/:path
Request the page again: GET /page/:path
Or recalculate everything: POST /cache/recalculate

Performance issues

Use /page/:path endpoints (cached) instead of /data (uncached)
Check cache stats: GET /cache/stats
Ensure cache is populated before production use

Development

Local Development

deno run --allow-net --allow-env main.tsx

Val Town

The app is configured to work with Val Town. The default export is:

export default (typeof Deno !== "undefined" && Deno.env.get("valtown")) ? app.fetch : app;
