Backend API

Hono-based API server for the Sitemap Crawler application.

Endpoints

`POST /api/crawl`

Crawls a main sitemap, looks for a posts sitemap inside it, and searches the listed pages for the string "extendedrecipe".

Request Body:

{
  "sitemapUrl": "https://example.com/sitemap.xml"
}

Response:

{
  "found": true,
  "foundUrl": "https://example.com/page-with-string",
  "totalCrawled": 25,
  "errors": [],
  "crawledUrls": ["url1", "url2", ...],
  "postsSitemapUrl": "https://example.com/post-sitemap.xml",
  "postsSitemapFound": true
}
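
As a rough client-side sketch of the request and response shapes above, the following TypeScript builds the POST options and checks a parsed response before using it. The names `CrawlResponse`, `buildCrawlRequest`, and `isCrawlResponse` are hypothetical and not part of the server. The element type of `errors`, and the idea that `foundUrl` and `postsSitemapUrl` may be absent or null when nothing is found, are assumptions.

```typescript
// Response shape, transcribed from the example above.
// Assumption: errors are plain strings, and foundUrl / postsSitemapUrl
// may be missing or null when nothing was found.
interface CrawlResponse {
  found: boolean;
  foundUrl?: string | null;
  totalCrawled: number;
  errors: string[];
  crawledUrls: string[];
  postsSitemapUrl?: string | null;
  postsSitemapFound: boolean;
}

// Build the options object to pass to fetch() for POST /api/crawl.
function buildCrawlRequest(sitemapUrl: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sitemapUrl }),
  };
}

// Runtime check on the fields the example response always contains.
function isCrawlResponse(value: unknown): value is CrawlResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.found === "boolean" &&
    typeof v.totalCrawled === "number" &&
    typeof v.postsSitemapFound === "boolean" &&
    Array.isArray(v.errors) &&
    Array.isArray(v.crawledUrls)
  );
}
```

A caller would pass the result to `fetch`, for example `fetch(baseUrl + "/api/crawl", buildCrawlRequest(url))`, where `baseUrl` is wherever the server happens to be running, and then run `isCrawlResponse` on the parsed JSON.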

Features:

Looks for a posts sitemap in the main sitemap first
Falls back to the main sitemap if no posts sitemap is found
Crawls at most 50 URLs
Stops as soon as the target string is found
Returns any errors encountered in the `errors` array
Lists every crawled URL in `crawledUrls`
Reports which sitemap was crawled via `postsSitemapUrl` and `postsSitemapFound`
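
The 50-URL limit, early stop, and error collection listed above could be sketched roughly as follows. This is not the server's actual code: `crawlUrls`, `CrawlResult`, and the injected `fetchText` helper are hypothetical names, and counting failed URLs toward `totalCrawled` is an assumption.

```typescript
const MAX_URLS = 50;             // crawl limit stated in the feature list
const TARGET = "extendedrecipe"; // string searched for on each page

interface CrawlResult {
  found: boolean;
  foundUrl: string | null;
  totalCrawled: number;
  errors: string[];
  crawledUrls: string[];
}

// fetchText is passed in (hypothetical helper) so the loop can run
// against real HTTP or against canned pages in a test.
async function crawlUrls(
  urls: string[],
  fetchText: (url: string) => Promise<string>,
): Promise<CrawlResult> {
  const result: CrawlResult = {
    found: false,
    foundUrl: null,
    totalCrawled: 0,
    errors: [],
    crawledUrls: [],
  };

  for (const url of urls.slice(0, MAX_URLS)) {
    result.crawledUrls.push(url);
    result.totalCrawled++; // assumption: failed fetches also count

    try {
      const body = await fetchText(url);
      if (body.toLowerCase().includes(TARGET)) {
        result.found = true;
        result.foundUrl = url;
        break; // stop at the first match
      }
    } catch (err) {
      // Record the failure and keep going with the next URL.
      const message = err instanceof Error ? err.message : String(err);
      result.errors.push(`${url}: ${message}`);
    }
  }
  return result;
}
```

Injecting `fetchText` keeps the loop independent of any particular HTTP client, which also makes the limit and early-stop behavior easy to exercise without network access.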

`GET /`

Serves the frontend application.

Static File Serving

`/frontend/*` - Frontend assets
`/shared/*` - Shared utilities

Implementation Details

Looks for a posts sitemap in the main sitemap index first
Treats a URL that contains "post" and ends with `.xml` as the posts sitemap
Falls back to crawling the main sitemap if no posts sitemap is found
Parses sitemap XML with simple regular expressions rather than an XML parser
Sends a User-Agent header for better compatibility
Matches the target string case-insensitively
Comprehensive error handling
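
The regex parsing and posts-sitemap selection described above might look something like this sketch. The function names (`extractLocs`, `findPostsSitemap`, `chooseSitemap`) and the exact regular expressions are assumptions, as is treating the "post" check as case-insensitive.

```typescript
// Pull every <loc>...</loc> value out of sitemap XML with a simple regex.
// This does not handle CDATA sections or namespaced tags.
function extractLocs(xml: string): string[] {
  const locs: string[] = [];
  const re = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  let match: RegExpExecArray | null;
  while ((match = re.exec(xml)) !== null) {
    const loc = match[1];
    if (loc) locs.push(loc);
  }
  return locs;
}

// First URL that contains "post" and ends with .xml, or null if none.
function findPostsSitemap(locs: string[]): string | null {
  const hit = locs.find((u) => /post/i.test(u) && /\.xml$/i.test(u));
  return hit ?? null;
}

// Decide which sitemap to crawl: the posts sitemap when present,
// otherwise the main sitemap itself.
function chooseSitemap(mainSitemapUrl: string, mainXml: string) {
  const posts = findPostsSitemap(extractLocs(mainXml));
  return {
    sitemapToCrawl: posts ?? mainSitemapUrl,
    postsSitemapUrl: posts,
    postsSitemapFound: posts !== null,
  };
}
```

Because this sketch tests the whole URL for "post", a hostname that happens to contain "post" would also match. Restricting the test to the path would be stricter, at the cost of parsing each URL.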