khawjaahmad


playwright-selector-gen

MCP: Fetch URL DOM and generate unique Playwright selectors

Playwright Selector Generator

Endpoint: https://playwright.val.run

An AI-powered service that fetches the DOM of any URL and generates Playwright-native locators. The AI model performs all selector generation; the code contains no selector heuristics of its own. It only fetches HTML, extracts raw DOM context, and passes it to the model.


Authentication

Every POST request requires the X-Api-Key header.

    curl -X POST https://playwright.val.run \
      -H "Content-Type: application/json" \
      -H "X-Api-Key: your_endpoint_key" \
      -d '{"url": "https://www.saucedemo.com"}'

Environment Variables

| Env Var | Required | Purpose |
| --- | --- | --- |
| X_API_KEY | Yes | Protects the endpoint |
| ANTHROPIC_API_KEY | Yes | Z.AI / Anthropic API key for the model |
| ANTHROPIC_URL | No | API endpoint (default: https://api.z.ai/api/anthropic/v1/messages) |
| ANTHROPIC_MODEL | No | Model identifier (default: anthropic/claude-sonnet-4-20250514) |
| FIRECRAWL_API_KEY | Recommended | Enables headless browser fetching via Firecrawl |
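
As a rough illustration of how these variables and their defaults fit together, here is a minimal sketch of a config loader. The `ServiceConfig` shape and the `loadConfig` name are hypothetical and not taken from main.ts. Throwing at load time when a required key is missing is also an assumption; the error table in this README indicates the service itself reports a missing ANTHROPIC_API_KEY as a 500 at request time.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServiceConfig {
  endpointKey: string;
  anthropicKey: string;
  anthropicUrl: string;
  anthropicModel: string;
  firecrawlKey: string | undefined;
}

type Env = Record<string, string | undefined>;

// Reads the documented variables and applies the documented defaults.
function loadConfig(env: Env): ServiceConfig {
  const endpointKey = env["X_API_KEY"];
  const anthropicKey = env["ANTHROPIC_API_KEY"];
  // Assumption: fail fast here; the real service may defer this check.
  if (!endpointKey) throw new Error("X_API_KEY is required");
  if (!anthropicKey) throw new Error("ANTHROPIC_API_KEY is required");
  return {
    endpointKey,
    anthropicKey,
    anthropicUrl: env["ANTHROPIC_URL"] ?? "https://api.z.ai/api/anthropic/v1/messages",
    anthropicModel: env["ANTHROPIC_MODEL"] ?? "anthropic/claude-sonnet-4-20250514",
    // Treat an empty string the same as "not set".
    firecrawlKey: env["FIRECRAWL_API_KEY"] || undefined,
  };
}
```

On Val Town (a Deno runtime) this could be called as `loadConfig(Deno.env.toObject())`.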

HTML Fetching

The service fetches page HTML using two strategies:

With FIRECRAWL_API_KEY (recommended): Uses Firecrawl's headless browser API. Renders JavaScript, waits 3 seconds for dynamic content, handles service workers and anti-bot measures. Required for sites like saucedemo.com that serve an empty HTML shell to server-side fetches but render full content in a browser.

Without FIRECRAWL_API_KEY: Falls back to raw fetch(). Works for fully server-rendered pages (Wikipedia, static sites). It does not execute JavaScript, so SPAs usually yield no elements and the service responds with a 422 that includes diagnostic info.

The response includes a fetchMethod field ("firecrawl" or "fetch") so you can verify which was used.

Get a Firecrawl key at firecrawl.dev.
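
The choice between the two strategies can be sketched as a small pure function that decides where to send the request and which fetchMethod to report. This is a sketch under stated assumptions, not the code in main.ts: `planHtmlFetch` and `FetchPlan` are hypothetical names, and the Firecrawl endpoint and request options shown (`/v1/scrape`, `formats`, `waitFor`) are assumptions that should be checked against Firecrawl's own documentation.

```typescript
type FetchMethod = "firecrawl" | "fetch";

// Hypothetical description of the outgoing request.
interface FetchPlan {
  fetchMethod: FetchMethod;
  endpoint: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

function planHtmlFetch(targetUrl: string, firecrawlKey?: string): FetchPlan {
  if (firecrawlKey) {
    return {
      fetchMethod: "firecrawl",
      // Assumed Firecrawl scrape endpoint; verify before relying on it.
      endpoint: "https://api.firecrawl.dev/v1/scrape",
      init: {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${firecrawlKey}`,
        },
        // waitFor: 3000 mirrors the 3 second wait described above.
        body: JSON.stringify({ url: targetUrl, formats: ["html"], waitFor: 3000 }),
      },
    };
  }
  // No key: plain server-side GET of the target page itself.
  return { fetchMethod: "fetch", endpoint: targetUrl, init: { method: "GET", headers: {} } };
}
```

A caller would then run `fetch(plan.endpoint, plan.init)` and copy `plan.fetchMethod` into the response.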


API

GET /

Health check.

{ "status": "ok", "service": "playwright-selector-gen" }

POST /

Headers:

| Header | Required | Description |
| --- | --- | --- |
| X-Api-Key | Yes | Endpoint access key |
| Content-Type | Yes | application/json |

Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Page URL to analyze |
| filter | string[] | No | Only return elements matching these tags |
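
For callers working from TypeScript, the headers and body above can be assembled as follows. This is a minimal client-side sketch; `buildSelectorRequest` is a hypothetical helper name, and the `filter` values are assumed to be plain tag names such as "input" or "button".

```typescript
interface SelectorRequestBody {
  url: string;
  filter?: string[];
}

// Builds the endpoint and fetch options for POST /.
function buildSelectorRequest(pageUrl: string, apiKey: string, filter?: string[]) {
  const body: SelectorRequestBody = { url: pageUrl };
  // Only send filter when the caller actually supplied tags.
  if (filter && filter.length > 0) body.filter = filter;
  return {
    endpoint: "https://playwright.val.run",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-Api-Key": apiKey },
      body: JSON.stringify(body),
    },
  };
}
```

Usage: `const { endpoint, init } = buildSelectorRequest("https://www.saucedemo.com", key, ["input"]);` followed by `await fetch(endpoint, init)`.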

Response (200)

    {
      "url": "https://www.saucedemo.com",
      "fetchMethod": "firecrawl",
      "totalElements": 8,
      "elements": [
        {
          "index": 0,
          "tag": "input",
          "locators": [
            {
              "method": "getByPlaceholder",
              "playwrightCode": "page.getByPlaceholder('Username')",
              "confidence": "high"
            },
            {
              "method": "getByRole",
              "playwrightCode": "page.getByRole('textbox', { name: 'Username' })",
              "confidence": "high"
            },
            {
              "method": "getByTestId",
              "playwrightCode": "page.getByTestId('username')",
              "confidence": "high"
            }
          ]
        }
      ]
    }
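
A consumer may want typed access to this payload. The interfaces below are inferred from the example response only, so they are an assumption about the full schema (for instance, `confidence` is typed as a plain string because only "high" appears above). `primaryLocators` is a hypothetical helper that takes the first locator of each element, assuming the list arrives ordered best first.

```typescript
interface Locator {
  method: string;
  playwrightCode: string;
  confidence: string;
}

interface ElementResult {
  index: number;
  tag: string;
  locators: Locator[];
}

interface SelectorResponse {
  url: string;
  fetchMethod: "firecrawl" | "fetch";
  totalElements: number;
  elements: ElementResult[];
}

// Maps each element index to the code of its first listed locator.
function primaryLocators(res: SelectorResponse): Map<number, string> {
  const out = new Map<number, string>();
  for (const el of res.elements) {
    const first = el.locators[0];
    // Elements with an empty locator list are skipped.
    if (first) out.set(el.index, first.playwrightCode);
  }
  return out;
}
```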

Locator Priority

The AI ranks locators in the following order, which is based on the official Playwright recommendation to prefer user-facing locators over CSS and XPath:

| Rank | Method | Used For |
| --- | --- | --- |
| 1 | getByRole() | ARIA role + accessible name |
| 2 | getByLabel() | Form fields with associated <label> |
| 3 | getByPlaceholder() | Inputs with placeholder text |
| 4 | getByText() | Visible text content |
| 5 | getByAltText() | Images with alt attribute |
| 6 | getByTitle() | Elements with title attribute |
| 7 | getByTestId() | data-testid / data-test / data-cy |
| 8 | locator() CSS | Short CSS only, as a last resort |
| 9 | locator() XPath | Only for truly complex DOM traversal |

Deeply nested selectors are never generated. Each element gets up to 3 locators, ranked from best to fallback.
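
Because the ordering comes from a model rather than from code, a client may want to re-sort the returned locators against the table itself. The sketch below is a client-side helper only, not part of the service; `sortByPriority` is a hypothetical name, and it assumes both CSS and XPath locators are reported with the method name "locator", which this README does not confirm.

```typescript
// Method names in the priority order listed above.
const PRIORITY: string[] = [
  "getByRole",
  "getByLabel",
  "getByPlaceholder",
  "getByText",
  "getByAltText",
  "getByTitle",
  "getByTestId",
  "locator", // assumed name for both CSS and XPath locators
];

interface RankedLocator {
  method: string;
  playwrightCode: string;
}

// Unknown methods sort after every known one.
function rankOf(method: string): number {
  const i = PRIORITY.indexOf(method);
  return i === -1 ? PRIORITY.length : i;
}

// Returns a new array ordered by priority, capped at three entries.
function sortByPriority<T extends RankedLocator>(locators: T[]): T[] {
  return [...locators].sort((a, b) => rankOf(a.method) - rankOf(b.method)).slice(0, 3);
}
```

The input array is copied before sorting, so the original response object stays untouched.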


Architecture

POST { url } + X-Api-Key
        │
        ▼
┌─ Fetch HTML ─────────────────┐
│                               │
│  FIRECRAWL_API_KEY set?       │
│  ├─ Yes → Firecrawl API      │
│  │   (headless browser, 3s   │
│  │    wait, JS rendered)     │
│  └─ No  → Raw fetch()       │
│                               │
│  Response includes            │
│  fetchMethod field            │
└──────────┬────────────────────┘
           │
           ▼
┌─ Parse DOM (cheerio) ─────────┐
│  Extract raw elements:        │
│  tag, attributes, outerHTML,  │
│  text content                 │
│                               │
│  0 elements? → 422 error      │
│  with htmlLength + hint       │
└──────────┬────────────────────┘
           │
           ▼
┌─ AI Model (Z.AI) ────────────┐
│  Receives raw DOM data.       │
│  Generates ALL locators.      │
│  Ranks by Playwright priority.│
│  Returns JSON.                │
└──────────┬────────────────────┘
           │
           ▼
     JSON response (200)

Error Responses

| Status | Meaning |
| --- | --- |
| 401 | Missing or invalid X-Api-Key header |
| 400 | Missing url in request body |
| 405 | Method not allowed (use POST) |
| 422 | No elements found; includes fetchMethod, htmlLength, and hint for diagnosis |
| 500 | ANTHROPIC_API_KEY not set / AI API error |
| 502 | Target URL could not be fetched |
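
On the client side, these statuses can be turned into actionable messages. The following is a minimal sketch: `describeFailure` and `ErrorBody` are hypothetical names, the 422 fields are taken from the table above, and the suggestion to set FIRECRAWL_API_KEY when fetchMethod is "fetch" is an inference from the HTML Fetching section rather than something the service is known to return.

```typescript
// Fields the 422 response is documented to include; all optional here.
interface ErrorBody {
  fetchMethod?: string;
  htmlLength?: number;
  hint?: string;
}

function describeFailure(status: number, body: ErrorBody = {}): string {
  switch (status) {
    case 400:
      return "Request body is missing url.";
    case 401:
      return "X-Api-Key header is missing or invalid.";
    case 405:
      return "Method not allowed; send a POST.";
    case 422: {
      const via = body.fetchMethod ?? "unknown";
      const size = body.htmlLength ?? 0;
      // Raw fetch() cannot render JavaScript, so point at Firecrawl.
      const tip = via === "fetch"
        ? " Consider setting FIRECRAWL_API_KEY so JavaScript is rendered."
        : "";
      return `No elements found (fetchMethod=${via}, htmlLength=${size}).${tip}`;
    }
    case 500:
      return "ANTHROPIC_API_KEY is not set or the AI API returned an error.";
    case 502:
      return "The target URL could not be fetched.";
    default:
      return `Unexpected status ${status}.`;
  }
}
```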

Why Firecrawl?

Some sites (like saucedemo.com) serve different HTML depending on how they're fetched:

  • Browser request: Full rendered page with forms, inputs, buttons (4KB+)
  • Server-side fetch(): Empty shell with just <div id="root"> (1KB)

This happens because of service workers, SSR hydration checks, or bot detection. Firecrawl drives a real headless Chrome browser, so it typically receives the fully rendered page.
