Playwright Selector Generator

Endpoint: https://playwright.val.run

AI-powered service that fetches any URL's DOM and generates Playwright-native locators. The AI model does all selector generation — zero algorithmic logic in the code. The code only fetches HTML, extracts raw DOM context, and passes it to the model.

Authentication

Every POST request requires the X-Api-Key header.

curl -X POST https://playwright.val.run \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: your_endpoint_key" \
  -d '{"url": "https://www.saucedemo.com"}'

Environment Variables

Env Var	Required	Purpose
`X_API_KEY`	Yes	Protects the endpoint
`ANTHROPIC_API_KEY`	Yes	Z.AI / Anthropic API key for the model
`ANTHROPIC_URL`	No	API endpoint (default: `https://api.z.ai/api/anthropic/v1/messages`)
`ANTHROPIC_MODEL`	No	Model identifier (default: `anthropic/claude-sonnet-4-20250514`)
`FIRECRAWL_API_KEY`	Recommended	Enables headless browser fetching via Firecrawl

HTML Fetching

The service fetches page HTML using two strategies:

With FIRECRAWL_API_KEY (recommended): Uses Firecrawl's headless browser API. Renders JavaScript, waits 3 seconds for dynamic content, handles service workers and anti-bot measures. Required for sites like saucedemo.com that serve an empty HTML shell to server-side fetches but render full content in a browser.

Without FIRECRAWL_API_KEY: Falls back to raw fetch(). Works for fully server-rendered pages (Wikipedia, static sites). Fails silently for SPAs — you'll get a 422 with diagnostic info.

The response includes a fetchMethod field ("firecrawl" or "fetch") so you can verify which was used.

Get a Firecrawl key at firecrawl.dev.

API

`GET /`

Health check.

{ "status": "ok", "service": "playwright-selector-gen" }

`POST /`

Headers:

Header	Required	Description
`X-Api-Key`	Yes	Endpoint access key
`Content-Type`	Yes	`application/json`

Body:

Field	Type	Required	Description
`url`	`string`	Yes	Page URL to analyze
`filter`	`string[]`	No	Only return elements matching these tags

Response (200)

{
  "url": "https://www.saucedemo.com",
  "fetchMethod": "firecrawl",
  "totalElements": 8,
  "elements": [
    {
      "index": 0,
      "tag": "input",
      "locators": [
        {
          "method": "getByPlaceholder",
          "playwrightCode": "page.getByPlaceholder('Username')",
          "confidence": "high"
        },
        {
          "method": "getByRole",
          "playwrightCode": "page.getByRole('textbox', { name: 'Username' })",
          "confidence": "high"
        },
        {
          "method": "getByTestId",
          "playwrightCode": "page.getByTestId('username')",
          "confidence": "high"
        }
      ]
    }
  ]
}

Locator Priority

The AI follows the official Playwright recommendation:

Rank	Method	Used For
1	`getByRole()`	ARIA role + accessible name
2	`getByLabel()`	Form fields with associated `<label>`
3	`getByPlaceholder()`	Inputs with placeholder text
4	`getByText()`	Visible text content
5	`getByAltText()`	Images with `alt` attribute
6	`getByTitle()`	Elements with `title` attribute
7	`getByTestId()`	`data-testid` / `data-test` / `data-cy`
8	`locator()` CSS	Short CSS only — absolute last resort
9	`locator()` XPath	Only for truly complex DOM traversal

Deep nested selectors are never generated. Each element gets up to 3 locators ranked best → fallback.

Architecture

POST { url } + X-Api-Key
        │
        ▼
┌─ Fetch HTML ─────────────────┐
│                               │
│  FIRECRAWL_API_KEY set?       │
│  ├─ Yes → Firecrawl API      │
│  │   (headless browser, 3s   │
│  │    wait, JS rendered)     │
│  └─ No  → Raw fetch()       │
│                               │
│  Response includes            │
│  fetchMethod field            │
└──────────┬────────────────────┘
           │
           ▼
┌─ Parse DOM (cheerio) ─────────┐
│  Extract raw elements:        │
│  tag, attributes, outerHTML,  │
│  text content                 │
│                               │
│  0 elements? → 422 error      │
│  with htmlLength + hint       │
└──────────┬────────────────────┘
           │
           ▼
┌─ AI Model (Z.AI) ────────────┐
│  Receives raw DOM data.       │
│  Generates ALL locators.      │
│  Ranks by Playwright priority.│
│  Returns JSON.                │
└──────────┬────────────────────┘
           │
           ▼
     JSON response (200)

Error Responses

Status	Meaning
`401`	Missing or invalid `X-Api-Key` header
`400`	Missing `url` in request body
`405`	Method not allowed (use POST)
`422`	No elements found — includes `fetchMethod`, `htmlLength`, and `hint` for diagnosis
`500`	`ANTHROPIC_API_KEY` not set / AI API error
`502`	Target URL could not be fetched

Why Firecrawl?

Some sites (like saucedemo.com) serve different HTML depending on how they're fetched:

Browser request: Full rendered page with forms, inputs, buttons (4KB+)
Server-side fetch(): Empty shell with just <div id="root"> (1KB)

This happens due to service workers, SSR hydration checks, or bot detection. Firecrawl uses a real headless Chrome browser, so it gets the full rendered page every time.

khawjaahmad

playwright-selector-gen