IndexNow Automation Val

This val automates IndexNow submission by:

Crawling sitemap URLs.
Tracking submission history in project-scoped SQLite.
Submitting only changed URLs.

Quick Start

Read the IndexNow docs (indexnow.org/documentation) and make sure your key file is already hosted on your domain at a public URL like https://www.example.org/<key>.txt.
Add at least one site file in config/sites/*.json (see below for details).
Set required env vars (see below for details).
Open / to confirm the dashboard loads.
Run a safe dry run first: POST /run?dryRun=1 with the x-indexnow-token header.
After the dry run looks correct, enable cron via indexnow.cron.ts for ongoing automation.

Trigger Model

indexnow.http.ts: primary HTTP entrypoint.
indexnow.cron.ts: scheduled trigger that POSTs to the HTTP handler at /run.
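
As a rough illustration of this split, the cron side only needs to describe one authenticated POST to /run. The sketch below is not the val's actual code: `buildCronRunRequest` and `RunRequestInit` are hypothetical names, and where the cron file gets its base URL and token from is an assumption. Only the /run path and the two header names come from this README.

```typescript
// Hypothetical helper: describes the POST a scheduled trigger would send to /run.
interface RunRequestInit {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

function buildCronRunRequest(baseUrl: string, runToken: string): RunRequestInit {
  // Resolve /run against the val's public base URL.
  const url = new URL("/run", baseUrl).toString();
  return {
    url,
    method: "POST",
    headers: {
      // Same token that manual runs use (INDEXNOW_RUN_TOKEN).
      "x-indexnow-token": runToken,
      // Marks the run as cron-initiated so it is logged with cron provenance.
      "x-indexnow-reason": "cron_entrypoint",
    },
  };
}

// Possible use inside a scheduled val (not executed here):
// export default async function () {
//   const req = buildCronRunRequest(baseUrl, runToken);
//   const res = await fetch(req.url, { method: req.method, headers: req.headers });
//   if (!res.ok) console.error(`IndexNow run returned HTTP ${res.status}`);
// }
```

Keeping the cron file this thin means all run logic stays behind the HTTP handler, so manual and scheduled runs share one code path.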

Environment Variables

INDEXNOW_RUN_TOKEN (required): auth token required for /run (manual and cron) and /backfill.
INDEXNOW_FORCE_TOKEN (optional but recommended): separate auth token required when using force=1.
INDEXNOW_ENDPOINT (optional): defaults to https://api.indexnow.org/indexnow.
INDEXNOW_PER_SITE_RUN_URL_LIMIT (optional): defaults to 5000; caps the number of URLs queued per site in each run, applied before the queue is batched into IndexNow POSTs.
INDEXNOW_ALERTS_ENABLED (optional): defaults to enabled; set to 0/false to disable alert emails.
INDEXNOW_PUBLIC_BASE_URL (optional but recommended): absolute public val URL used for alert-email deep links (for example https://<your-val>.web.val.run).
INDEXNOW_KEY_LOCATION_<SITE> (recommended): per-site key file URL consumed via keyLocationEnvVar in each site config (e.g. INDEXNOW_KEY_LOCATION_HANAYOU).
INDEXNOW_ALERT_TO_<SITE> (optional): per-site alert recipient email (e.g. INDEXNOW_ALERT_TO_HANAYOU).
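
The defaults above could be parsed along these lines. This is a minimal sketch, not the val's implementation: the function names are hypothetical, and rejecting an invalid limit with an error (rather than silently falling back to the default) is an assumption.

```typescript
const DEFAULT_RUN_URL_LIMIT = 5000;

// Alerts are enabled unless the variable is explicitly "0" or "false".
function parseAlertsEnabled(raw: string | undefined): boolean {
  if (raw === undefined) return true;
  const value = raw.trim().toLowerCase();
  return !(value === "0" || value === "false");
}

// Unset or empty falls back to the documented default of 5000.
// Assumption: anything that is not a positive integer is a configuration error.
function parseRunUrlLimit(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_RUN_URL_LIMIT;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`INDEXNOW_PER_SITE_RUN_URL_LIMIT must be a positive integer, got "${raw}"`);
  }
  return parsed;
}
```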

Site Config Files

Site configs live in config/sites/*.json with one site per file. At least one file must exist or the val will throw on startup.

Example: config/sites/hanayou.studio.json

{
  "host": "www.hanayou.studio",
  "sitemapUrl": "https://www.hanayou.studio/sitemap.xml",
  "keyLocationEnvVar": "INDEXNOW_KEY_LOCATION_HANAYOU",
  "alertToEnvVar": "INDEXNOW_ALERT_TO_HANAYOU",
  "includePrefixes": ["/journal/", "/guides/"],
  "excludePrefixes": ["/tag/", "/author/"]
}
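
To make includePrefixes and excludePrefixes concrete, here is one way the path filter could work. It is a sketch under stated assumptions: `isPathEligible` is a hypothetical name, exclusions are assumed to win over inclusions, and an empty or missing includePrefixes is assumed to allow every path.

```typescript
interface PrefixFilter {
  includePrefixes?: string[];
  excludePrefixes?: string[];
}

function isPathEligible(url: string, filter: PrefixFilter): boolean {
  // Only the path is compared; host and query string are ignored here.
  const path = new URL(url).pathname;

  // Assumption: an exclude match always wins.
  const excluded = (filter.excludePrefixes ?? []).some((prefix) => path.startsWith(prefix));
  if (excluded) return false;

  // Assumption: no include list means "include everything".
  const includes = filter.includePrefixes ?? [];
  return includes.length === 0 || includes.some((prefix) => path.startsWith(prefix));
}
```

With the example config above, https://www.hanayou.studio/journal/post-1 would pass this filter, while /tag/flowers and /about would not.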

Notes:

Keep raw email addresses out of config/sites/*.json; set them as env vars instead.
Each site must set exactly one key-location source: keyLocation or keyLocationEnvVar.
Prefer keyLocationEnvVar over inline keyLocation so key-bearing URLs are not committed.
keyLocationEnvVar values may be a full URL, a bare key (for example f9dbe...), or key.txt. Bare values are normalized to https://<site-host>/<key>.txt.
URLs with noindex directives are skipped automatically (supports both X-Robots-Tag and <meta name="robots"> / <meta name="googlebot">).
Use excludePrefixes for archive/utility paths you never want submitted (for example /author/, /tag/, /page/).
Queue candidates are semantically compared against prior snapshots; cache-busting query params like ?v=... are normalized, and header metadata changes (ETag, Last-Modified) alone do not trigger submission.
Dry runs are read-only for URL state metrics (upToDate / needsSubmission) and do not update submission-tracking state.
If both keyLocation and keyLocationEnvVar are set, startup fails (choose one).
If alertToEnvVar is omitted (or env var is empty), the val sends alerts to the default owner email.
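
The key-location rules in these notes can be sketched as follows. The function names and the `env` lookup object are hypothetical. The sketch normalizes only env-var values, because that is the only case the notes describe; an inline keyLocation is returned unchanged.

```typescript
interface KeySource {
  host: string;
  keyLocation?: string;
  keyLocationEnvVar?: string;
}

// Full URLs pass through; a bare key or "<key>.txt" becomes https://<host>/<key>.txt.
function normalizeKeyLocation(host: string, value: string): string {
  const trimmed = value.trim();
  if (/^https?:\/\//i.test(trimmed)) return trimmed;
  const fileName = trimmed.endsWith(".txt") ? trimmed : `${trimmed}.txt`;
  return `https://${host}/${fileName}`;
}

function resolveKeyLocation(site: KeySource, env: Record<string, string | undefined>): string {
  if (site.keyLocation !== undefined && site.keyLocationEnvVar !== undefined) {
    throw new Error(`${site.host}: set keyLocation or keyLocationEnvVar, not both`);
  }
  if (site.keyLocation !== undefined) return site.keyLocation;
  if (site.keyLocationEnvVar === undefined) {
    throw new Error(`${site.host}: one of keyLocation or keyLocationEnvVar is required`);
  }
  const raw = env[site.keyLocationEnvVar];
  // Assumption: a referenced but unset env var is treated as a startup error.
  if (!raw) throw new Error(`${site.host}: env var ${site.keyLocationEnvVar} is not set`);
  return normalizeKeyLocation(site.host, raw);
}
```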

API Reference

Base URL pattern: https://<your-val>.web.val.run

Auth model:

POST /run requires INDEXNOW_RUN_TOKEN.
POST /run?force=1 also requires INDEXNOW_FORCE_TOKEN.
POST /backfill requires INDEXNOW_RUN_TOKEN.
GET / and GET /monitor are read-only and public.

GET `/`

Purpose: Render the React operator dashboard app shell.

Auth: none.

Query parameters:

Name	Required	Type	Description
`site`	no	string	Filter the dashboard to a single domain.
`runs`	no	`50` or `500`	Run-history depth shown in tables.
`issues`	no	`1`/`true`/`yes`	Issue-focused mode (forces run history depth to 500).
`updates`	no	`1`/`true`/`yes`	Updates-focused mode (forces run history depth to 500).
`run`	no	integer	Pre-expand one run row in the UI.
`diffUrl`	no	full URL	With `run`, open the Raw HTML Diff panel for one URL.

Response:

200 text/html
Runtime/config errors bubble as server errors.

Notes:

/ is the only dashboard UI route.
Dashboard interactions fetch data from GET /monitor and update URL query params without full page reloads.
With no scope query params (issues, updates, runs), the dashboard defaults to Runs with Updates (updates=1, runs=500).

Example:

https://<your-val>.web.val.run/?site=www.hanayou.studio&runs=50

POST `/run`

Purpose: Execute sitemap discovery + IndexNow submission workflow.

Auth headers:

Header	Required	Description
`Authorization: Bearer <INDEXNOW_RUN_TOKEN>`	yes (or `x-indexnow-token`)	Primary run token.
`x-indexnow-token: <INDEXNOW_RUN_TOKEN>`	yes (or Bearer)	Alternate primary run token header.
`x-indexnow-force-token: <INDEXNOW_FORCE_TOKEN>`	conditional	Required only when `force=1`.
`x-indexnow-reason: cron_entrypoint`	reserved	Used by `indexnow.cron.ts` for cron provenance.

Query parameters:

Name	Required	Type	Description
`dryRun`	no	`1` or omitted	Simulate run with no submission side effects.
`force`	no	`1` or omitted	Intentionally resubmit currently eligible unchanged URLs.
`site`	no	string	Restrict run to one configured host.

Status codes:

Code	Meaning
`200`	Run completed and `hasErrors=false`.
`500`	Run completed but `hasErrors=true` for at least one site.
`401`	Missing/invalid run token.
`403`	`force=1` without valid/configured force token.
`405`	Method not allowed (`POST` required).
`503`	Server misconfigured (`INDEXNOW_RUN_TOKEN` missing).
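
The auth-related status codes in this table can be read as a short decision chain. The sketch below is illustrative only: `authorizeRun` and its input shape are hypothetical, and the order of the checks is an assumption, since the README lists the codes but not their precedence.

```typescript
interface RunAuthInput {
  method: string;
  force: boolean;            // true when force=1 is present
  authorization?: string;    // Authorization header value
  runTokenHeader?: string;   // x-indexnow-token header value
  forceTokenHeader?: string; // x-indexnow-force-token header value
}

interface TokenConfig {
  runToken?: string;   // INDEXNOW_RUN_TOKEN
  forceToken?: string; // INDEXNOW_FORCE_TOKEN
}

// Returns an error status, or null when the request may proceed.
function authorizeRun(input: RunAuthInput, config: TokenConfig): 401 | 403 | 405 | 503 | null {
  if (input.method !== "POST") return 405;
  if (!config.runToken) return 503;

  // Either header form may carry the run token.
  const auth = input.authorization ?? "";
  const bearer = auth.startsWith("Bearer ") ? auth.slice("Bearer ".length) : undefined;
  const presented = bearer ?? input.runTokenHeader;
  // Note: real code should prefer a constant-time comparison for secrets.
  if (presented !== config.runToken) return 401;

  // force=1 needs a configured force token and a matching header.
  if (input.force && (!config.forceToken || input.forceTokenHeader !== config.forceToken)) {
    return 403;
  }
  return null;
}
```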

Pragmatic response fields:

Top-level: reason (manual_run_endpoint or cron_entrypoint), dryRun, force, runUrlLimit, hasErrors, startedAt, finishedAt.
Per site (siteRuns[]): runId, host, filteredCount, changedCount, queuedCount, submittedCount, pendingCount, deferredCount, errors.
Per site URL arrays: filteredUrls, changedUrls, queuedUrls.
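
For callers that script against /run, the fields above might be typed and summarized like this. The field names come from this README, but every type is an assumption (for example that runId is numeric and errors is an array of strings), and `sitesNeedingAttention` is a hypothetical helper.

```typescript
// Assumed shapes; only the field names are documented.
interface SiteRun {
  runId: number;
  host: string;
  filteredCount: number;
  changedCount: number;
  queuedCount: number;
  submittedCount: number;
  pendingCount: number;
  deferredCount: number;
  errors: string[];
}

interface RunResponse {
  reason: "manual_run_endpoint" | "cron_entrypoint";
  dryRun: boolean;
  force: boolean;
  runUrlLimit: number;
  hasErrors: boolean;
  startedAt: string;
  finishedAt: string;
  siteRuns: SiteRun[];
}

// Lists sites whose run reported errors, pending URLs, or deferred URLs.
function sitesNeedingAttention(response: RunResponse): string[] {
  return response.siteRuns
    .filter((site) => site.errors.length > 0 || site.pendingCount > 0 || site.deferredCount > 0)
    .map(
      (site) =>
        `${site.host}: errors=${site.errors.length} pending=${site.pendingCount} deferred=${site.deferredCount}`,
    );
}
```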

Examples:

Dry run:

curl -X POST "https://<your-val>.web.val.run/run?dryRun=1&site=www.hanayou.studio" \
  -H "x-indexnow-token: <INDEXNOW_RUN_TOKEN>"

Real run:

curl -X POST "https://<your-val>.web.val.run/run?site=www.hanayou.studio" \
  -H "x-indexnow-token: <INDEXNOW_RUN_TOKEN>"

Force run:

curl -X POST "https://<your-val>.web.val.run/run?force=1&site=www.hanayou.studio" \
  -H "x-indexnow-token: <INDEXNOW_RUN_TOKEN>" \
  -H "x-indexnow-force-token: <INDEXNOW_FORCE_TOKEN>"

POST `/backfill`

Purpose: Backfill one baseline raw-HTML snapshot per previously submitted URL that does not already have snapshot content. It captures current live HTML only and does not reconstruct historical page HTML.

Auth headers:

Header	Required	Description
`Authorization: Bearer <INDEXNOW_RUN_TOKEN>`	yes (or `x-indexnow-token`)	Primary run token.
`x-indexnow-token: <INDEXNOW_RUN_TOKEN>`	yes (or Bearer)	Alternate primary run token header.

Query parameters:

Name	Required	Type	Description
`dryRun`	no	`1`, `0`, or omitted	Defaults to `1` when omitted; preview candidate URLs without writing snapshots (`dryRun` is camelCase).
`site`	no	string	Restrict backfill to one configured host.
`limit`	no	integer	Max candidate URLs to process per site in one request.

Status codes:

Code	Meaning
`200`	Backfill completed and `hasErrors=false`.
`500`	Backfill completed with fetch/write issues (`hasErrors=true`).
`400`	Invalid `dryRun` or `limit` query values.
`401`	Missing/invalid run token.
`405`	Method not allowed (`POST` required).
`503`	Server misconfigured (`INDEXNOW_RUN_TOKEN` missing).
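
The safe default for dryRun and the 400 case can be sketched as a small query parser. This is not the val's code: `parseBackfillQuery` is a hypothetical name, and requiring limit to be a positive integer is an assumption.

```typescript
type BackfillQuery =
  | { ok: true; dryRun: boolean; limit?: number }
  | { ok: false; status: 400; message: string };

function parseBackfillQuery(params: URLSearchParams): BackfillQuery {
  // Omitted dryRun means dry run: real writes need an explicit dryRun=0.
  // The key is read case-sensitively, so a lowercase "dryrun" is not seen here.
  let dryRun = true;
  const rawDryRun = params.get("dryRun");
  if (rawDryRun !== null) {
    if (rawDryRun === "1") dryRun = true;
    else if (rawDryRun === "0") dryRun = false;
    else return { ok: false, status: 400, message: "dryRun must be 1 or 0" };
  }

  const rawLimit = params.get("limit");
  if (rawLimit === null) return { ok: true, dryRun };
  // Assumption: limit must be a positive integer with no sign or decimals.
  if (!/^[1-9]\d*$/.test(rawLimit)) {
    return { ok: false, status: 400, message: "limit must be a positive integer" };
  }
  return { ok: true, dryRun, limit: Number(rawLimit) };
}
```

Defaulting to a dry run means a mistyped or forgotten parameter previews candidates instead of writing snapshots.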

Pragmatic response fields:

Top-level: dryRun, limitPerSite, siteBackfills[], hasErrors, startedAt, finishedAt.
Per site (siteBackfills[]): candidatesInBatch, insertedCount, fetchErrorCount, remainingAfterBatch, sampledUrls, errors.
Inserted snapshot rows use run_id = NULL and snapshot_source = "backfill" (one baseline per URL).

Examples:

Dry-run preview (safe default):

curl -X POST "https://<your-val>.web.val.run/backfill?site=www.hanayou.studio" \
  -H "x-indexnow-token: <INDEXNOW_RUN_TOKEN>"

Real backfill batch:

curl -X POST "https://<your-val>.web.val.run/backfill?dryRun=0&site=www.hanayou.studio&limit=50" \
  -H "x-indexnow-token: <INDEXNOW_RUN_TOKEN>"

GET `/monitor`

Purpose: Return JSON health/status data for automation and debugging.

Auth: none.

Notes:

/monitor is JSON-only and does not render the dashboard UI.
With no scope query params (issues, updates, runs), /monitor defaults to all-runs view with runs=50.

Query parameters:

Name	Required	Type	Description
`site`	no	string	Limit output to one configured host.
`runs`	no	`50` or `500`	Number of recent run records per site.
`issues`	no	`1`/`true`/`yes`	Issue-focused mode (forces run history depth to 500).
`updates`	no	`1`/`true`/`yes`	Updates-focused mode (forces run history depth to 500).
`run`	no	integer	Include per-URL statuses for one selected run when available.
`diffUrl`	no	full URL	With `run`, include raw-HTML snapshot metadata + unified diff for that URL.

Status codes:

Code	Meaning
`200`	Monitor snapshot returned.
`500`	Runtime/config failure while building snapshot (error bubbles).

Pragmatic response fields:

Top-level: generatedAt, runHistoryLimit, issuesOnly, updatesOnly, selectedHost, selectedRunId, selectedDiffUrl, siteHosts.
Aggregate status: totals (upToDateUrls, needsSubmissionUrls, retryPendingUrls).
Run history: sites[].latestRun, sites[].recentRuns.
Drill-down: sites[].selectedRunUrlStatuses (when run is provided and data exists).
- URL outcomes include: submitted, pending, failed, deferred, dry_run, ignored, suppressed.
- ignored = intentionally ineligible by policy/config/robots.
- suppressed = intentionally withheld by optimization logic (currently only when content is semantically unchanged).
Diff drill-down: sites[].selectedRunDiff (when run + diffUrl are provided and snapshots exist).
If a URL has only one snapshot, metadata is shown but no diff body is available yet.
If the selected run has no run-linked snapshot for that URL, the UI may fall back to the latest available snapshot for inspection.

Example:

curl "https://<your-val>.web.val.run/monitor?site=www.hanayou.studio&runs=500&issues=1"
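
The scope defaults differ between / and /monitor, which is easy to miss. One way to express the documented rules is sketched below. `resolveScope` is a hypothetical name, the output field names mirror the /monitor response, and matching the truthy values exactly (case-sensitive) is an assumption.

```typescript
type Route = "dashboard" | "monitor";

interface Scope {
  issuesOnly: boolean;
  updatesOnly: boolean;
  runHistoryLimit: 50 | 500;
}

const isTruthyFlag = (value: string | null): boolean =>
  value !== null && ["1", "true", "yes"].includes(value);

function resolveScope(params: URLSearchParams, route: Route): Scope {
  const hasScopeParam = ["issues", "updates", "runs"].some((key) => params.has(key));
  if (!hasScopeParam) {
    // Dashboard default: Runs with Updates at depth 500. Monitor default: all runs at depth 50.
    return route === "dashboard"
      ? { issuesOnly: false, updatesOnly: true, runHistoryLimit: 500 }
      : { issuesOnly: false, updatesOnly: false, runHistoryLimit: 50 };
  }
  const issuesOnly = isTruthyFlag(params.get("issues"));
  const updatesOnly = isTruthyFlag(params.get("updates"));
  // Either focused mode forces the deeper history, regardless of runs.
  const deep = issuesOnly || updatesOnly || params.get("runs") === "500";
  return { issuesOnly, updatesOnly, runHistoryLimit: deep ? 500 : 50 };
}
```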

Interpretation Notes

submittedCount counts successful IndexNow HTTP 200 responses.
pendingCount tracks IndexNow HTTP 202 responses.
deferredCount means changed URLs were capped by INDEXNOW_PER_SITE_RUN_URL_LIMIT.
An unchanged sitemap lastmod is normal no-op behavior and is intentionally not emitted as ignored or suppressed.
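
The mapping from IndexNow response codes to counters might look like the sketch below. The 200 and 202 cases follow the notes above. Treating every other status as failed, and counting per URL in a batch rather than per HTTP response, are assumptions, and the names are hypothetical.

```typescript
type SubmissionOutcome = "submitted" | "pending" | "failed";

function classifyIndexNowStatus(status: number): SubmissionOutcome {
  if (status === 200) return "submitted"; // accepted
  if (status === 202) return "pending";   // received, not yet confirmed
  return "failed";                        // assumption: anything else is a failure
}

interface BatchResult {
  status: number;   // HTTP status returned for one IndexNow POST
  urlCount: number; // URLs carried by that POST
}

// Assumption: counters are incremented by the number of URLs in each batch.
function tallyOutcomes(batches: BatchResult[]): Record<SubmissionOutcome, number> {
  const totals: Record<SubmissionOutcome, number> = { submitted: 0, pending: 0, failed: 0 };
  for (const batch of batches) {
    totals[classifyIndexNowStatus(batch.status)] += batch.urlCount;
  }
  return totals;
}
```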

History Tracking

SQLite tables:

indexnow_url_state_v1: per-URL state (seen/submitted timestamps, lastmod, status).
indexnow_run_log_v1: per-run summaries for observability and debugging.
indexnow_url_snapshot_v1: queued-URL raw HTML snapshots + metadata for diff inspection.

Snapshot retention policy (hard-coded):

Keep snapshots for up to 14 days.
Always keep at least 3 snapshots per URL (even if older than 14 days).
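
The two retention rules interact: age alone never removes a URL's three newest snapshots. A minimal sketch of that selection follows. The `Snapshot` shape and `snapshotsToDelete` are hypothetical, and the val may well implement this in SQL instead.

```typescript
interface Snapshot {
  id: number;
  url: string;
  capturedAt: Date;
}

const RETENTION_DAYS = 14;
const MIN_SNAPSHOTS_PER_URL = 3;

function snapshotsToDelete(snapshots: Snapshot[], now: Date): Snapshot[] {
  const cutoff = now.getTime() - RETENTION_DAYS * 24 * 60 * 60 * 1000;

  // Group snapshots by URL.
  const byUrl = new Map<string, Snapshot[]>();
  for (const snapshot of snapshots) {
    const group = byUrl.get(snapshot.url) ?? [];
    group.push(snapshot);
    byUrl.set(snapshot.url, group);
  }

  const expired: Snapshot[] = [];
  byUrl.forEach((group) => {
    // Newest first, so the first MIN_SNAPSHOTS_PER_URL entries are always kept.
    const sorted = [...group].sort((a, b) => b.capturedAt.getTime() - a.capturedAt.getTime());
    sorted.forEach((snapshot, index) => {
      if (index >= MIN_SNAPSHOTS_PER_URL && snapshot.capturedAt.getTime() < cutoff) {
        expired.push(snapshot);
      }
    });
  });
  return expired;
}
```

For a URL with snapshots aged 1, 5, 20, 30, and 40 days, this keeps the first three (including the 20-day-old one) and marks the last two for deletion.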

This val uses https://esm.town/v/std/sqlite/main.ts so data is scoped to this val project.

Run URL Limit

INDEXNOW_PER_SITE_RUN_URL_LIMIT is applied per site for each run.
If more changed URLs are discovered than the limit, the remainder is deferred to future runs.
To intentionally resubmit unchanged URLs (still within limit), use force=1 with x-indexnow-force-token.
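
The cap and the later batching step are separate concerns, which the following sketch tries to show. The function names are hypothetical and the batch size is left as a parameter. The IndexNow documentation describes a maximum of 10,000 URLs per POST; whether the val uses that value is not stated here.

```typescript
interface LimitedQueue {
  queued: string[];   // submitted in this run
  deferred: string[]; // left for future runs (reported as deferredCount)
}

// Applies the per-site cap: the first `limit` changed URLs are queued, the rest deferred.
function applyRunUrlLimit(changedUrls: string[], limit: number): LimitedQueue {
  return {
    queued: changedUrls.slice(0, limit),
    deferred: changedUrls.slice(limit),
  };
}

// Splits the queued URLs into request-sized batches.
function chunkUrls(urls: string[], batchSize: number): string[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: string[][] = [];
  for (let start = 0; start < urls.length; start += batchSize) {
    batches.push(urls.slice(start, start + batchSize));
  }
  return batches;
}
```

Because deferral happens before batching, raising the batch size never changes how many URLs are deferred; only the limit does.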

Ops Runbook

Daily/incident check:
- Open / dashboard and confirm site health cards are green.
- Review recent runs for pending, deferred, or Error statuses.
Manual safe run flow:
- Run dry run first: POST /run?dryRun=1.
- Review changedCount and queuedCount.
- If expected, run real submission: POST /run.
If deferred URLs accumulate:
- Confirm this is expected (large content release, migration, etc.).
- Tune INDEXNOW_PER_SITE_RUN_URL_LIMIT and/or schedule frequency.
- Use force=1 only when you intentionally want to resubmit unchanged URLs.
Alert triage:
- Alert with pending: retry/watch next run; investigate endpoint throttling.
- Alert with deferred: consider run URL-limit tuning or additional runs.
- Alert with errors: requires action before trusting indexing progress.
External verification:
- Use Bing Webmaster Tools to verify ingestion trends.
