Notion Block Search & Sync System

A Val Town application that searches Notion pages for keywords OR block types (like checkboxes), extracts structured data, stores it in blob storage, and syncs it back to a Notion database with intelligent filtering and validation.

Table of Contents

  • Overview
  • Project Structure
  • MVC Architecture
  • Search Workflow
  • Block Type Handling
  • Validation & Auto-Assignment Rules
  • Endpoints
  • Cron Jobs
  • Environment Variables
  • Search Modes
  • Context Gathering & Assignment Logic
  • Relation Mapping
  • Fuzzy Name Matching
  • Owner Resolution
  • Project Matching

Overview

This system enables automatic extraction and organization of action items from Notion pages:

  1. Search: Scans recent Notion pages for configurable keywords or block types
  2. Extract: Captures block content including mentions and dates
  3. Validate: Filters out blocks that are too short (< 5 words by default)
  4. Enrich: Auto-assigns missing due date; captures author for AI owner resolution
  5. Store: Saves to Val Town blob storage with timestamp tracking and sync metadata
  6. Optimize: Skips already-synced items to reduce API calls by 90%+
  7. Sync: Creates/updates Notion database pages

Project Structure

├── backend/
│   ├── controllers/         # Business logic
│   │   ├── pageController.ts           # Page operations
│   │   ├── todoController.ts           # Keyword search logic
│   │   ├── todoSaveController.ts       # Blob → Notion sync
│   │   └── todoOrchestrationController.ts  # Batch workflow
│   ├── crons/              # Time-based triggers
│   │   ├── todoSearch.cron.ts  # Periodic keyword search
│   │   └── todoSync.cron.ts    # Periodic database sync
│   ├── routes/             # HTTP handlers
│   │   ├── api/            # API endpoints
│   │   │   └── pages.ts    # Recent pages API
│   │   └── tasks/          # Task automation endpoints
│   │       ├── todoSearch.ts # Single page search webhook
│   │       ├── todoSave.ts # Blob sync webhook
│   │       └── todos.ts    # Batch search & sync
│   ├── services/           # External API integrations
│   │   ├── notion/         # Notion API wrapper
│   │   │   ├── index.ts    # Client initialization
│   │   │   ├── pages.ts    # Page operations
│   │   │   ├── databases.ts # Database operations
│   │   │   ├── blocks.ts   # Block operations
│   │   │   └── search.ts   # Search operations
│   │   ├── aiService.ts    # OpenAI for fuzzy matching
│   │   └── blobService.ts  # Val Town blob storage
│   └── utils/              # Utility functions
│       ├── notionUtils.ts  # Block transformation
│       ├── blobUtils.ts    # Blob key parsing
│       └── emojiUtils.ts   # Emoji extraction
├── frontend/               # React frontend
├── shared/                 # Shared types and utilities
│   ├── types.ts            # TypeScript interfaces
│   └── utils.ts            # Shared utility functions
├── main.http.tsx           # Application entry point (Hono)
├── CLAUDE.md               # Development guidelines
└── AGENTS.md               # Val Town platform guidelines

MVC Architecture

This application follows a strict 3-layer MVC architecture with clear separation of concerns:

Request → Route → Controller → Service → External API
                      ↓
Response ← Format ← Standard Response ← Result

Layer 1: Routes (backend/routes/)

Responsibility: HTTP handling only

  • Extract request parameters (query, body, headers)
  • Call controller functions
  • Format responses with appropriate HTTP status codes
  • Never contain business logic
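
// Example: backend/routes/tasks/todos.ts
app.post("/", async (c) => {
  // Extract request parameters only; no business logic in the route
  const hours = Number(c.req.query("hours")) || 24; // lookback window in hours
  const keyword = c.req.query("keyword") || undefined;
  const result = await todoOrchestrationController.processBatchTodos(
    hours,
    keyword
  );
  return c.json(result, 200);
});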

Layer 2: Controllers (backend/controllers/)

Responsibility: Business logic and orchestration

  • Validate input data
  • Orchestrate multiple service calls
  • Transform and filter data
  • Return standardized response format: {success, data, error, details?}
  • Never make direct HTTP calls to external APIs

// Example: backend/controllers/todoController.ts
export async function processTodoSearch(pageId: string, keyword: string = 'todo') {
  // Validation
  if (!pageId) return { success: false, error: "Invalid pageId", ... };

  // Call service layer
  const blocks = await notionService.getPageBlocksRecursive(pageId);

  // Business logic
  const matches = blocks.filter(block => searchBlockForKeyword(block, keyword));

  return { success: true, data: matches, error: null };
}

Layer 3: Services (backend/services/)

Responsibility: External API calls only

  • Make HTTP requests to external APIs
  • Handle API authentication
  • Parse and normalize API responses
  • Return structured results: {success, data, error}
  • Never contain business logic

// Example: backend/services/notion/pages.ts
export async function getPageBlocksRecursive(blockId: string) {
  const response = await notion.blocks.children.list({ block_id: blockId });
  return response.results;
}

Golden Rule: Never skip layers! Routes call controllers, controllers call services. This ensures testability, maintainability, and clear separation of concerns.
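
The standardized response format described for controllers and services could be typed roughly as follows. This is a minimal sketch: the names StandardResponse, ok, fail, and toHttpStatus are hypothetical, and mapping every failure to HTTP 400 is an assumption made only for illustration.

```typescript
// Hypothetical type mirroring the documented shape {success, data, error, details?}.
interface StandardResponse<T> {
  success: boolean;
  data: T | null;
  error: string | null;
  details?: string;
}

// Build a successful result.
function ok<T>(data: T): StandardResponse<T> {
  return { success: true, data, error: null };
}

// Build a failed result; details is only included when provided.
function fail<T>(error: string, details?: string): StandardResponse<T> {
  return details === undefined
    ? { success: false, data: null, error }
    : { success: false, data: null, error, details };
}

// A route could translate the controller result into a status code.
// Assumption: failures map to 400; a real route may distinguish more cases.
function toHttpStatus<T>(result: StandardResponse<T>): number {
  return result.success ? 200 : 400;
}
```

Keeping the shape identical across layers means a route never needs to inspect service-specific errors; it only reads success and error.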

Search Workflow

The system follows a three-stage pipeline: Notion → Blob Storage → Notion Database

This workflow supports two search modes (keyword or block type). The pipeline remains the same regardless of mode.

Stage 1: Search & Extract (Notion → Blobs)

Flow:

  1. Get recent pages from Notion (configurable time window)
  2. For each page, recursively fetch all blocks
  3. Search blocks for keyword (or block type) matches
  4. Extract structured data from matching blocks
  5. Validate word count: Skip blocks below MIN_BLOCK_WORDS (default: 5)
  6. Auto-assign due date: Add default due date if missing; capture block author
  7. Save enriched blocks to blob storage with timestamp

Validation & Enrichment (happens here, not during sync):

  • ❌ Word count < MIN_BLOCK_WORDS? → Skip (too short to be meaningful)
  • ⚠️ Missing date_mention? → Auto-assign based on DEFAULT_DUE_DATE setting (default: today)
  • ✅ Only skips if: too short (below MIN_BLOCK_WORDS)
  • Note: Owner is determined by AI during sync, not during search
  • Result: All blobs in storage are meaningful with due date; owner resolved during sync

Endpoints:

  • POST /tasks/todo/search - Single page search (webhook-triggered)
  • POST /tasks/todos?hours=24 - Batch search across recent pages

Keywords Configuration:

  • Set via SEARCH_KEYWORDS environment variable (comma-separated)
  • Example: SEARCH_KEYWORDS=todo,zinger,bit,😀
  • Defaults to todo if not set
  • All keywords searched in single pass through blocks (efficient)

Keyword Matching Logic:

  • Text keywords (e.g., "todo", "bit", "steel"):
    • Case-insensitive
    • Word boundary matching (finds "todo" but not "todoist")
    • Uses regex: /\btodo\b/i
  • Emojis (e.g., "😀", "🎉"):
    • Exact match
    • Case-sensitivity N/A
  • Multi-keyword: Block saved if it matches ANY keyword
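
The matching rules above can be sketched as follows. This is an illustration, not the project's searchBlockForKeyword: the function names are hypothetical, and the rule used to tell text keywords from emojis (presence of a letter, digit, or underscore) is an assumption.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumption: a keyword containing letters, digits, or "_" is a text keyword;
// anything else (such as an emoji) is matched as an exact substring.
function isTextKeyword(keyword: string): boolean {
  return /[A-Za-z0-9_]/.test(keyword);
}

function matchesKeyword(text: string, keyword: string): boolean {
  if (isTextKeyword(keyword)) {
    // Case-insensitive, word-boundary match: finds "todo" but not "todoist".
    return new RegExp("\\b" + escapeRegExp(keyword) + "\\b", "i").test(text);
  }
  // Emojis have no word boundaries, so use an exact substring match.
  return text.indexOf(keyword) !== -1;
}

// A block is kept when it matches ANY configured keyword.
function matchesAnyKeyword(text: string, keywords: string[]): boolean {
  return keywords.some((keyword) => matchesKeyword(text, keyword));
}
```

Word boundaries only exist next to word characters, which is why emojis need the separate exact-match path.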

Block Extraction:

When a keyword is found, the system extracts and transforms the block into a reduced format:

{
  todo_string: "Buy groceries for @John due October 30, 2025",
  block_id: "abc-123-def-456",
  block_url: "https://www.notion.so/abc123def456",
  last_edited_time: "2025-10-29T12:00:00.000Z",
  people_mentions: [{ id: "user-123", name: "John", email: "john@example.com" }],
  date_mentions: ["2025-10-30"],
  link_mentions: [{ text: "Project", url: "/page-id" }],
  sync_metadata: {
    synced: false,              // Needs sync to Notion database
    target_page_id: undefined   // Will be set after first sync (ID of page in todos database)
  }
}

Transformation Details:

  • Dates: Formatted to human-readable (e.g., "October 30, 2025 at 3:00 PM EDT")
  • Original dates preserved: ISO format kept in date_mentions array for Notion API
  • Block URL: Clickable link to original block location
  • Emojis: Extracted for use as page icons

Stage 2: Blob Storage

Storage Format:

  • Key pattern: {projectName}--{category}--{blockId}
  • Example: demo--todo--abc-123-def-456
  • Content: JSON of reduced block structure with sync metadata
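
Building and parsing keys in this pattern might look like the sketch below. The helper names are hypothetical (the project's own parsing lives in blobUtils.ts and may differ), and the sketch assumes that neither the project name nor the category contains the "--" separator.

```typescript
// Hypothetical helpers for the {projectName}--{category}--{blockId} pattern.
interface BlobKeyParts {
  projectName: string;
  category: string;
  blockId: string;
}

const SEPARATOR = "--";

function buildBlobKey(parts: BlobKeyParts): string {
  return [parts.projectName, parts.category, parts.blockId].join(SEPARATOR);
}

// Split on the first two separators only, so single hyphens inside the
// block ID (e.g. "abc-123-def-456") are preserved.
function parseBlobKey(key: string): BlobKeyParts | null {
  const first = key.indexOf(SEPARATOR);
  if (first === -1) return null;
  const second = key.indexOf(SEPARATOR, first + SEPARATOR.length);
  if (second === -1) return null;
  const projectName = key.slice(0, first);
  const category = key.slice(first + SEPARATOR.length, second);
  const blockId = key.slice(second + SEPARATOR.length);
  if (!projectName || !category || !blockId) return null;
  return { projectName, category, blockId };
}
```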

Blob Structure:

{
  todo_string: "...",
  block_id: "...",
  page_url: "...",                    // Source page URL
  parent_id: "..." | null,            // Parent block ID (for project matching)
  preceding_heading: "..." | null,    // Closest h1/h2/h3 before block (for fuzzy matching)
  // ... other properties
  sync_metadata: {
    synced: boolean,                  // true = synced to Notion, false = needs sync
    target_page_id?: string           // Cached ID of page in todos database (optimization)
  }
}

Update Logic:

  • Compare last_edited_time of existing blob vs new block
  • If unchanged: Skip save (preserves synced: true status)
  • If changed: Save with synced: false (triggers re-sync)
  • Preserve cached target_page_id across updates
  • Prevents data loss from out-of-order processing
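
The update rules above could be expressed as a small decision function. This is a sketch under assumptions: the type and function names are hypothetical, and treating an incoming block that is older than the stored one as "skip" is one possible reading of the out-of-order safeguard.

```typescript
interface SyncMetadata {
  synced: boolean;
  target_page_id?: string;
}

interface StoredBlock {
  block_id: string;
  last_edited_time: string; // ISO 8601 timestamp
  sync_metadata: SyncMetadata;
}

interface IncomingBlock {
  block_id: string;
  last_edited_time: string;
}

type SaveDecision = { action: "skip" } | { action: "save"; blob: StoredBlock };

function decideBlobUpdate(existing: StoredBlock | null, incoming: IncomingBlock): SaveDecision {
  if (existing !== null) {
    const existingTime = Date.parse(existing.last_edited_time);
    const incomingTime = Date.parse(incoming.last_edited_time);
    // Unchanged (or older, out-of-order) data: keep the stored blob as is,
    // which preserves its synced: true status.
    if (incomingTime <= existingTime) return { action: "skip" };
  }
  // Changed or new: mark for re-sync but carry over any cached page ID.
  const sync_metadata: SyncMetadata = { synced: false };
  const cached = existing?.sync_metadata.target_page_id;
  if (cached !== undefined) sync_metadata.target_page_id = cached;
  return { action: "save", blob: { ...incoming, sync_metadata } };
}
```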

Stage 3: Sync to Notion Database (Blobs → Notion)

Flow:

  1. List all blobs in "todo" category
  2. For each blob, read reduced block data
  3. Optimization: Skip if synced: true (0 API calls)
  4. Optimization: Use cached target_page_id if available (1 API call - update only)
  5. If no cached ID: Query database for existing page by Block ID
  6. Create new page OR update existing page
  7. Mark blob as synced: true and cache page ID

Note: No validation happens during sync - all blobs are guaranteed valid because validation occurs during the search phase (Stage 1).

Endpoints:

  • POST /tasks/todo/save - Sync all blobs to database (webhook-triggered)
  • POST /tasks/todos - Batch workflow (search + sync in one call)

Property Mappings (Blob → Notion Database):

todo_string        → Name (title)
block_id           → Block ID (rich_text)
block_url          → Block URL (url)
page_url           → Page URL (url) - source page where todo was found
last_edited_time   → Todo last edited time (date)
people_mentions[0] → Owner (people)
people_mentions[1..] → Other people (people)
date_mentions[0]   → Due date (date)
link_mentions      → Links (rich_text, bullet list)
matched projects   → Projects db (relation) - see Project Matching
emoji (if found)   → Page icon

Note: Status is not set by todoSweeper - configure a default in your Notion database properties.

Sync Optimization:

The system uses sync metadata to dramatically reduce Notion API calls:

On first sync:

  • Blob has synced: false, no target_page_id
  • Query database → create or update → cache page ID
  • Mark synced: true
  • API calls: 1 query + 1 create/update = 2 calls

On subsequent syncs (no changes):

  • Blob has synced: true
  • Skip immediately
  • API calls: 0 calls (100% reduction)

On subsequent syncs (block changed):

  • Blob saved with synced: false (block edited in Notion)
  • Has cached target_page_id from previous sync
  • Update directly without query
  • Mark synced: true
  • API calls: 1 update (50% reduction)

Performance impact:

  • Before optimization: 100 blobs = 100 queries + 50 updates = 150 API calls
  • After optimization: 90 synced + 10 changed = 0 + 10 updates = 10 API calls (93% reduction)
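
The per-blob decision and the arithmetic above can be reproduced with a short sketch. The names are hypothetical, and the call counts are simply the ones stated in this section (0 for synced blobs, 1 for a cached update, 2 for query plus write).

```typescript
interface BlobSyncState {
  synced: boolean;
  target_page_id?: string;
}

type SyncPlan =
  | { action: "skip"; apiCalls: 0 }
  | { action: "update_cached"; apiCalls: 1 }
  | { action: "query_then_write"; apiCalls: 2 };

// Decide what the sync step has to do for one blob.
function planSync(blob: BlobSyncState): SyncPlan {
  if (blob.synced) return { action: "skip", apiCalls: 0 };
  if (blob.target_page_id !== undefined) return { action: "update_cached", apiCalls: 1 };
  return { action: "query_then_write", apiCalls: 2 };
}

function totalApiCalls(blobs: BlobSyncState[]): number {
  return blobs.reduce((sum, blob) => sum + planSync(blob).apiCalls, 0);
}

// Percentage reduction, rounded to the nearest whole percent.
function percentReduction(before: number, after: number): number {
  return Math.round((1 - after / before) * 100);
}
```

With 90 synced blobs and 10 changed blobs that have a cached page ID, totalApiCalls yields 10, and percentReduction(150, 10) gives the 93% quoted above.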

Block Type Handling

The search uses recursive block fetching to traverse the entire page hierarchy, including nested content.

Recursive Fetching

How it works:

function getPageBlocksRecursive(blockId, containerFilter?) {
  1. Fetch immediate children of blockId
  2. For each child:
     - Add child to results
     - If child.has_children === true:
       - If containerFilter provided: only recurse if block type is in filter
       - Otherwise: recurse into all children
       - Add the descendants found to results
  3. Return flattened array of all blocks
}

What this means:

  • ✅ Finds blocks nested inside toggles
  • ✅ Finds blocks nested inside columns
  • ✅ Finds blocks nested inside lists
  • ✅ Finds blocks nested N levels deep

Block Type Mode Optimization

When using block type mode (SEARCH_BLOCK_TYPE=to_do), the system optimizes recursive fetching by only traversing into container blocks that can hold to_do children:

Container blocks (recursed into):

  • to_do - to_do blocks can nest inside other to_do blocks
  • toggle - common pattern for organizing todos
  • column_list / column - layout containers
  • synced_block - can contain any block type
  • callout - can contain nested content
  • quote - can contain nested blocks
  • bulleted_list_item / numbered_list_item - can have nested content
  • template - can contain any block type

Non-container blocks (skipped):

  • paragraph, heading_1/2/3, code, equation - cannot have to_do children
  • image, video, file, pdf, audio, embed, bookmark - media blocks

Performance impact: Significantly reduces API calls by skipping recursion into blocks that cannot contain to_do items. Keyword mode still traverses all blocks (no filter applied).
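
A runnable sketch of recursive fetching with an optional container filter is shown below. It is not the project's getPageBlocksRecursive: the Block shape is reduced to three fields, collectBlocks and ListChildren are hypothetical names, and the fetcher is injected so the traversal can be tried without calling Notion.

```typescript
// Minimal block shape for this sketch (the Notion API returns many more fields).
interface Block {
  id: string;
  type: string;
  has_children: boolean;
}

// Injected fetcher standing in for one "list block children" API call.
type ListChildren = (blockId: string) => Promise<Block[]>;

// Container types taken from the list above.
const TODO_CONTAINERS: string[] = [
  "to_do", "toggle", "column_list", "column", "synced_block",
  "callout", "quote", "bulleted_list_item", "numbered_list_item", "template",
];

async function collectBlocks(
  blockId: string,
  listChildren: ListChildren,
  containerFilter?: string[],
): Promise<Block[]> {
  const results: Block[] = [];
  const children = await listChildren(blockId);
  for (const child of children) {
    results.push(child);
    if (!child.has_children) continue;
    // Without a filter, recurse everywhere; with one, only into listed types.
    const allowed =
      containerFilter === undefined || containerFilter.indexOf(child.type) !== -1;
    if (allowed) {
      const nested = await collectBlocks(child.id, listChildren, containerFilter);
      for (const block of nested) results.push(block);
    }
  }
  return results;
}
```

Each skipped container saves at least one fetch, so the savings grow with the number of non-container blocks that have children.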

Included Block Types

These block types are searched for keywords:

Block Type | Has rich_text? | Notes
---------- | -------------- | -----
paragraph | ✅ | Standard text blocks
heading_1, heading_2, heading_3 | ✅ | All heading levels
bulleted_list_item | ✅ | Bullet lists
numbered_list_item | ✅ | Numbered lists
to_do | ✅ | Checkbox items
toggle | ✅ | Collapsible toggles
quote | ✅ | Quote blocks
callout | ✅ | Callout/alert blocks
code | ✅ | Code blocks (captions only)
column | N/A | Container - children are searched
column_list | N/A | Container - children are searched

Column Behavior:

  • Column blocks themselves have no searchable text
  • But their children (paragraphs, lists, etc.) ARE searched
  • Example: A todo in a column will be found

Excluded Block Types

These block types are explicitly skipped:

Block Type | Reason
---------- | ------
unsupported | Not supported by Notion API
button | Action buttons, not content
table | Container block, no text content
table_row | Cells aren't individual blocks; can't be saved to blob
child_page | Page title not in rich_text format
child_database | Database title not in rich_text format
divider | No text content
table_of_contents | No text content
breadcrumb | No text content
image, file, video, pdf | Media blocks (captions could be added later)
bookmark, embed | External content (could be added later)

Why tables are excluded:

  • Table content lives in table_row.cells[][] (array of arrays)
  • Cells contain rich_text but aren't individual blocks
  • Can't be saved to blob storage as standalone blocks
  • Can't create Notion pages from cell content

Validation & Auto-Assignment Rules

Matched blocks are validated for minimum length, then enriched with auto-assigned due dates before being saved to blob storage. Validation ensures quality; owner is determined by AI during sync.

Validation & Enrichment for Blob Storage

Matched blocks go through validation and enrichment before being saved:

1. Word Count Validation (REQUIRED):

  • ✅ Block must have at least MIN_BLOCK_WORDS words (default: 5)
  • ❌ Blocks with fewer words are skipped - too short to be meaningful todos
  • Counts all words including mentions and dates (simple whitespace split)
  • Example: "Buy groceries for @John tomorrow" = 5 words (passes)
  • Example: "todo" = 1 word (skipped)

2. Due Date (CONDITIONAL + AUTO-ASSIGNED):

  • Date mentions are only used as due dates if preceded by "due" or "by"
  • Example: "finish report by @October 30" → October 30 becomes Due date
  • Example: "meeting @October 30 about Q4" → date is content only, not Due date
  • First qualifying date becomes "Due date"
  • AUTO-ASSIGNED: If no qualifying date found, uses DEFAULT_DUE_DATE env var (defaults to "today")

3. Owner Assignment (AI-DETERMINED during sync):

  • Owner is determined by AI during sync, not during search
  • AI uses context with priority: (1) heading, (2) matched contacts, (3) @mentions
  • Block creator is captured as author for potential use in owner resolution
  • Owner can be null if AI cannot determine from context
  • @mentions populate "Other people" property
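
The "due"/"by" qualification rule for date mentions can be sketched as follows. This is an illustration under assumptions: Segment is a simplified stand-in for Notion rich text (plain text runs and date mentions), and the function names are hypothetical.

```typescript
// Simplified stand-in for Notion rich text: plain text runs and date mentions.
type Segment =
  | { kind: "text"; text: string }
  | { kind: "date"; iso: string };

// True when the text immediately before a date ends with the word "due" or "by".
function endsWithQualifier(text: string): boolean {
  return /\b(due|by)\s*$/i.test(text);
}

// Return the first date mention that is preceded by "due" or "by", if any.
function findQualifiedDueDate(segments: Segment[]): string | null {
  let precedingText = "";
  for (const segment of segments) {
    if (segment.kind === "text") {
      precedingText = segment.text;
    } else {
      if (endsWithQualifier(precedingText)) return segment.iso;
      precedingText = ""; // a date directly after another date is not qualified
    }
  }
  return null;
}
```

When this returns null, the block would fall back to the DEFAULT_DUE_DATE setting described above.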

Automatic Due Date Assignment:

  • Triggered when no date mention is preceded by "due" or "by"
  • Even if the block contains date mentions, they're only treated as content unless qualified
  • Configurable: Set via DEFAULT_DUE_DATE environment variable
  • Options: today (default), tomorrow, one_week, end_of_week, next_business_day
  • Rationale: Blocks without explicit due dates still need deadlines; "today" is a sensible default
  • Date is stored in same ISO format as explicit dates (e.g., "2025-10-31")
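
One way the DEFAULT_DUE_DATE options could be turned into ISO dates is sketched below. It is not the project's calculateDueDate: the function name is hypothetical, dates are computed in UTC for determinism (the real system may use a local time zone), and treating a Friday as its own end_of_week is an assumption.

```typescript
type DueDateSetting =
  | "today" | "tomorrow" | "one_week" | "end_of_week" | "next_business_day";

const DAY_MS = 24 * 60 * 60 * 1000;

function addDays(date: Date, days: number): Date {
  return new Date(date.getTime() + days * DAY_MS);
}

// Format as YYYY-MM-DD (UTC), matching the ISO format used for explicit dates.
function toIsoDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

function sketchDueDate(setting: DueDateSetting, now: Date = new Date()): string {
  const weekday = now.getUTCDay(); // 0 = Sunday ... 6 = Saturday
  switch (setting) {
    case "today":
      return toIsoDate(now);
    case "tomorrow":
      return toIsoDate(addDays(now, 1));
    case "one_week":
      return toIsoDate(addDays(now, 7));
    case "end_of_week": {
      // Assumption: the upcoming Friday, or today when today is Friday.
      const daysUntilFriday = (5 - weekday + 7) % 7;
      return toIsoDate(addDays(now, daysUntilFriday));
    }
    case "next_business_day": {
      // Friday -> Monday (+3), Saturday -> Monday (+2), otherwise +1.
      const step = weekday === 5 ? 3 : weekday === 6 ? 2 : 1;
      return toIsoDate(addDays(now, step));
    }
  }
}
```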

Owner Resolution (during sync):

  • Owner is determined by AI during sync using generateSummary() + resolveOwner()
  • AI analyzes context to identify the responsible person:
    1. Heading: If preceding heading contains a person's name → that person is owner
    2. Contact: If matched contacts exist → first contact is owner
    3. Mention: If @mentions exist → first @mention is owner
  • Block creator is stored as author and may be used if owner source is "heading"
  • If AI cannot determine owner, the Owner field is left empty
  • This allows intelligent assignment based on context rather than assuming creator = owner

Result: All matched blocks are saved - no blocks are skipped due to missing people. Owner is resolved during sync.

When Validation Happens

During Search (Stage 1) - todoController.ts:

  • After keyword match is found
  • After block is transformed to reduced format
  • Before saving to blob storage

Not During Sync (Stage 3) - All blobs are guaranteed valid, no checking needed

Validation Logic

// From todoController.ts (search phase)

// Step 1: Check minimum word count
const minWords = getMinBlockWords();
const wordCount = countWords(reducedBlock.todo_string);
if (wordCount < minWords) {
  console.log(
    `○ Skipped: block too short (${wordCount} words, minimum: ${minWords})`
  );
  continue; // Don't save to blob
}

// Step 2: Auto-assign due date if no date mentioned
if (!reducedBlock.date_mentions || reducedBlock.date_mentions.length === 0) {
  const setting = getDefaultDueDateSetting();
  const calculatedDate = calculateDueDate(setting);
  reducedBlock.date_mentions = [calculatedDate];
}

// Step 3: Capture block creator as author (for AI owner resolution during sync)
const creator = getBlockCreator(block);
(reducedBlock as any).author = creator; // May be null if no creator

// All validated blocks reach this point and get saved
// Owner will be determined by AI during sync phase
await blobService.saveBlockToBlob("todo", block.id, blobData);

Examples

Valid - @mention + qualified date (8 words):

"Buy groceries for @John due October 30, 2025"
✅ Word count: 8 (passes minimum of 5)
✅ Has @mention (@John → Other people)
✅ Has qualified date ("due" precedes October 30 → Due date)
→ Saved to blob storage
→ Synced: AI determines owner; due Oct 30

Valid - Qualified date, no @mention (6 words):

"Buy groceries by October 30, 2025"
✅ Word count: 6 (passes minimum of 5)
✅ Has qualified date ("by" precedes October 30 → Due date)
→ Saved to blob storage (author captured for AI resolution)
→ Synced: AI determines owner; due Oct 30

Valid - Unqualified date (auto-assigned) (7 words):

"Meeting with @John October 30 about Q4"
✅ Word count: 7 (passes minimum of 5)
✅ Has @mention (@John → Other people)
⚠️  Date not preceded by "due"/"by" → treated as content only
⚠️  No qualifying date → Auto-assigned based on DEFAULT_DUE_DATE
→ Saved to blob storage
→ Synced: AI determines owner; due date auto-assigned

Valid - No date, no @mention (5 words):

"Buy groceries at the store"
✅ Word count: 5 (passes minimum of 5)
⚠️  No qualifying date → Auto-assigned based on DEFAULT_DUE_DATE
→ Saved to blob storage (author captured for AI resolution)
→ Synced: AI determines owner from context or leaves empty

Invalid - Too short (2 words):

"Buy groceries"
❌ Word count: 2 (below minimum of 5)
→ NOT saved to blob storage (skipped - too short)

Invalid - Too short (1 word):

"todo"
❌ Word count: 1 (below minimum of 5)
→ NOT saved to blob storage (skipped - too short)

Invalid - Too short with emojis (2 words):

"πŸ‹ πŸŽ‰"
❌ Word count: 2 (below minimum of 5)
→ NOT saved to blob storage (skipped - too short)

Valid - No mentions but meaningful (6 words):

"Buy groceries at the store tomorrow"
✅ Word count: 6 (passes minimum of 5)
⚠️  No date mention → Auto-assigned based on DEFAULT_DUE_DATE
→ Saved to blob storage (author captured if available)
→ Synced: AI determines owner from context or leaves empty

Note: With word count validation, most meaningful blocks are saved. Blocks are only skipped if too short (below MIN_BLOCK_WORDS, default: 5).
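
The word counts in the examples above follow from a plain whitespace split. The sketch below is an assumption about how such a count could be written; it is consistent with the documented behavior but is not copied from the project.

```typescript
// Count words by splitting on runs of whitespace; mentions, dates,
// hyphenated words, and emojis each count as one word.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// minWords mirrors MIN_BLOCK_WORDS (default: 5).
function passesWordCount(text: string, minWords: number = 5): boolean {
  return countWords(text) >= minWords;
}
```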

Sync Summary

After syncing, the controller reports:

  • Total blobs: All blobs in storage (all guaranteed valid with due date)
  • Pages created: New pages added to database
  • Pages updated: Existing pages updated
  • Pages skipped: Blobs already synced (synced: true)
  • Pages failed: Errors during create/update

Note: All blobs have due dates assigned; owner is determined by AI during sync.

Endpoints

API Endpoints

GET /api/pages/recent?hours=24

  • Get pages edited in last N hours
  • Filters out archived pages and pages in TODOS_DB_ID database
  • Returns simplified page objects with parent information

Response:

{
  "pages": [
    {
      "id": "page-id",
      "object": "page",
      "title": "My Page",
      "url": "https://notion.so/...",
      "last_edited_time": "2025-10-29T12:00:00.000Z",
      "parent": { "type": "page_id", "id": "parent-id" }
    }
  ],
  "count": 1,
  "timeRange": "24 hours"
}

Task Endpoints

POST /tasks/todo/search

  • Search single page for keywords (webhook-triggered)
  • Keywords from SEARCH_KEYWORDS env var (comma-separated)
  • Extracts and saves matching blocks to blobs
  • Body: { "page_id": "abc-123" }

POST /tasks/todo/save

  • Sync all blobs to Notion database
  • Creates or updates pages (no validation at this stage; blobs are validated during search)
  • No request body needed

POST /tasks/todos?hours=24

  • Batch workflow: Search recent pages + sync to database
  • Keywords from SEARCH_KEYWORDS env var
  • Combines search and save in one call
  • Use for manual triggers or cron jobs

Response:

{
  "success": true,
  "pagesSearched": 5,
  "totalTodosFound": 12,
  "searchResults": [
    {
      "pageId": "abc-123",
      "pageTitle": "My Page",
      "success": true,
      "blocksFound": 3,
      "blockIds": ["block-1", "block-2", "block-3"]
    }
  ],
  "saveResult": {
    "totalBlobs": 12,
    "pagesCreated": 5,
    "pagesUpdated": 3,
    "pagesSkipped": 4,
    "pagesFailed": 0
  }
}
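
A manual trigger of the batch endpoint could be prepared as in the sketch below. The helper name and the base URL are hypothetical; the X-API-KEY header follows the NOTION_WEBHOOK_SECRET description in Environment Variables. The helper only builds the request, so it can be inspected before being passed to fetch.

```typescript
interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string> };
}

// Build the URL and options for POST /tasks/todos?hours=N.
function prepareBatchRequest(baseUrl: string, hours: number, secret?: string): PreparedRequest {
  if (!(hours > 0) || Math.floor(hours) !== hours) {
    throw new Error("hours must be a positive integer");
  }
  const headers: Record<string, string> = {};
  // The header is only needed when a secret is configured on the server.
  if (secret) headers["X-API-KEY"] = secret;
  const trimmedBase = baseUrl.replace(/\/+$/, "");
  return {
    url: trimmedBase + "/tasks/todos?hours=" + encodeURIComponent(String(hours)),
    init: { method: "POST", headers },
  };
}

// Usage (not executed here):
//   const { url, init } = prepareBatchRequest("https://example.val.run", 24, secret);
//   const response = await fetch(url, init);
```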

Cron Jobs

The system includes two separate cron jobs for automated workflow execution. Crons are time-based triggers that run independently of HTTP requests.

Architecture

Crons live in backend/crons/ and follow the same MVC pattern as HTTP routes:

Cron Trigger → Controller → Service → External API

Key differences from HTTP routes:

  • Triggered by time intervals (not HTTP requests)
  • No request/response cycle
  • Results logged to console only
  • Use .cron.tsx extension for Val Town

Cron 1: Todo Search (todoSearch.cron.ts)

Purpose: Search recent pages for keywords/block types and save matches to blob storage

Workflow:

  1. Get recent pages from Notion (last 15 minutes)
  2. Search each page for configured keywords or block types
  3. Save matching blocks to Val Town blob storage
  4. Does NOT sync to Notion database

Configuration:

  • Lookback window: 15 minutes (optimized for frequent runs)
  • Keywords/Block type: From SEARCH_KEYWORDS or SEARCH_BLOCK_TYPE env var
  • Recommended schedule: Every 1 minute
    • 15 minute lookback provides buffer for missed runs
    • Frequent runs catch changes quickly

Output:

=== Cron: Todo Search Started ===
Timestamp: 2025-10-29T12:00:00.000Z

Cron: Search complete - Found 12 todos in 5 pages
Pages with matches:
  - Project Planning: 3 match(es)
  - Meeting Notes: 5 match(es)
  - Weekly Review: 4 match(es)

=== Cron: Todo Search Complete ===

Cron 2: Todo Sync (todoSync.cron.ts)

Purpose: Sync validated todo blobs to Notion database

Workflow:

  1. Read all todo blobs from Val Town blob storage
  2. Skip blobs already marked synced: true (no validation here; blobs are validated during search)
  3. Query Notion database for existing pages by Block ID
  4. Create new pages or update existing pages (timestamp-based)

Configuration:

  • No parameters: Processes all blobs in storage
  • Recommended schedule: Every 8-12 hours
    • Less frequent than search cron
    • Allows time for blob accumulation
    • Reduces Notion API calls

Output:

=== Cron: Todo Sync Started ===
Timestamp: 2025-10-29T14:00:00.000Z

Cron: Sync complete
Summary:
  Total blobs processed: 12
  Pages created: 5
  Pages updated: 3
  Pages skipped: 4
  Pages failed: 0

=== Cron: Todo Sync Complete ===

Why Two Separate Crons?

Operational flexibility:

  • Search cron runs frequently to capture changes quickly
  • Sync cron runs less frequently to batch database updates
  • Reduces Notion API rate limit concerns
  • Allows manual triggering of sync independently

Fault isolation:

  • Search failures don't block syncing existing blobs
  • Sync failures don't block new searches
  • Each cron can be debugged independently

Cost optimization:

  • Blob storage is cheap and fast
  • Notion API calls are rate-limited
  • Separate crons allow different schedules for different costs

Setting Up Crons in Val Town

  1. Navigate to Val Town UI
  2. Create new cron vals:
    • todoSearch.cron.tsx - Copy content from backend/crons/todoSearch.cron.ts
    • todoSync.cron.tsx - Copy content from backend/crons/todoSync.cron.ts
  3. Set schedules:
    • todoSearch.cron.tsx: Every 1 minute (* * * * *)
    • todoSync.cron.tsx: Every 8-12 hours (e.g., 0 */8 * * * for every 8 hours)
  4. Monitor logs: Check Val Town console for cron execution results

Note: Val Town cron jobs must be separate vals (not files in this project). The files in backend/crons/ serve as templates to copy into Val Town cron vals.

Environment Variables

Required environment variables (set in Val Town):

  • NOTION_API_KEY - Notion integration token (required)

    • Get from: https://www.notion.so/my-integrations
    • Required for all Notion API calls
  • TODOS_DB_ID - Notion database ID for todo sync (required)

    • The database where keyword matches are synced
    • Format: abc123def456... (32-character ID without hyphens)
  • PROJECTS_DB_ID - DEPRECATED (optional, still works)

    • No longer needed - relations are auto-discovered from the todos database schema
    • Configure project matching in notion.config.ts instead (see Project Matching)
    • If set, a deprecation warning will be logged
  • SEARCH_KEYWORDS - Keywords to search for (optional, keyword mode)

    • Comma-separated list of keywords/phrases
    • Example: todo,zinger,bit or todo,😀,🎉
    • Defaults to todo if not set
    • All blocks matching ANY keyword will be saved to blob storage
    • Efficient: Searches all keywords in a single pass through blocks
  • SEARCH_BLOCK_TYPE - Block type to search for (optional, block type mode)

    • Alternative to keyword search - searches by Notion block type
    • Example: to_do (searches all Notion checkbox blocks)
    • Defaults to to_do if set with empty value
    • Takes precedence over SEARCH_KEYWORDS if both are set
    • Common values: to_do, paragraph, bulleted_list_item, numbered_list_item
    • Matched blocks still go through the same validation and enrichment as keyword matches (minimum word count, default due date)
    • Useful for: "Check a box, add @person and date = instant todo"
  • MIN_BLOCK_WORDS - Minimum word count for blocks to be saved (optional)

    • Blocks with fewer words are skipped (too short to be meaningful todos)
    • Defaults to 5 if not set
    • Word counting:
      • Counts all words including mentions and dates (simple whitespace split)
      • Hyphenated words count as 1 (e.g., "buy-now" = 1 word)
      • Emojis count as words (e.g., "πŸ‹ πŸŽ‰" = 2 words)
    • Default of 5 accounts for: ~1 word for mention + ~2-3 words for date + ~2-3 words for action
    • Example: "Buy groceries for @John tomorrow" = 5 words (minimum)
  • DEFAULT_DUE_DATE - Default due date for blocks without date mentions (optional)

    • Used when a block matches search criteria but has no date
    • Supported values: today, tomorrow, one_week, end_of_week, next_business_day
    • Defaults to today if not set
    • Examples:
      • today - Due today (default)
      • tomorrow - Due tomorrow
      • one_week - Due 7 days from today
      • end_of_week - Due next Friday (end of work week)
      • next_business_day - Due next weekday (skips weekends)
  • BLOCK_STABILITY_MINUTES - Minimum age (in minutes) before blocks are saved to blob storage (optional)

    • Only applies to cron-triggered searches (not webhook-triggered)
    • Prevents syncing blocks that are actively being edited
    • Defaults to 0 if not set (no delay - blocks saved immediately)
    • Set to a positive number to add a stability delay (e.g., 2 for 2 minutes)
    • Examples:
      • Not set or BLOCK_STABILITY_MINUTES=0 - All blocks saved immediately (default)
      • BLOCK_STABILITY_MINUTES=2 - Block edited 1 minute ago will be skipped, 3 minutes ago will be saved
      • BLOCK_STABILITY_MINUTES=5 - Block edited 4 minutes ago will be skipped, 6 minutes ago will be saved
    • Webhook/button triggers always bypass this delay regardless of setting (immediate sync on user action)
    • Use case: Set a delay if you frequently edit todos and want cron to wait for "final" versions
  • RECENT_PAGES_LOOKBACK_HOURS - Default lookback window for searching recent pages (optional)

    • System-wide setting that applies to all triggers: cron jobs, frontend dashboard, manual API calls
    • Defaults to 24 hours if not set
    • Must be a positive integer
    • Can be overridden per-request with ?hours=X query parameter
    • Examples:
      • Not set or RECENT_PAGES_LOOKBACK_HOURS=24 - Search last 24 hours (default)
      • RECENT_PAGES_LOOKBACK_HOURS=48 - Search last 48 hours (2 days)
      • RECENT_PAGES_LOOKBACK_HOURS=168 - Search last 168 hours (1 week)
    • Use case: Match to your cron schedule or desired dashboard timeframe
      • Cron every 4 hours → Set to 6-8 hours (buffer for overlap/delays)
      • Cron every 24 hours → Set to 24-48 hours
      • No cron β†’ Set to desired dashboard timeframe
  • NOTION_WEBHOOK_SECRET - API key for protecting webhooks and API endpoints (recommended)

    • Required for production use - protects all /tasks/* and /api/* endpoints (except /api/health)
    • Prevents unauthorized access to your Notion data and webhook triggers
    • Set to any secure random string (e.g., generated password or UUID)
    • Notion webhook configuration: Add custom header X-API-KEY with this value
    • API requests: Include header X-API-KEY: your-secret-value
    • If not set, authentication is disabled (development mode only)
    • Example: NOTION_WEBHOOK_SECRET=abc123xyz789...

    Security note: Without this, anyone can:

    • Trigger your webhooks (causing unnecessary processing)
    • Access recent pages via /api/pages/recent (potential data leak)
    • View page IDs from /api/health and use them to query other endpoints

    Public endpoint exception: /api/health remains public (needed for frontend dashboard)

  • API_KEY - Legacy API key (deprecated, use NOTION_WEBHOOK_SECRET instead)

    • Kept for backwards compatibility
    • Use NOTION_WEBHOOK_SECRET for new deployments
  • CRONS_DISABLED - Disable all cron jobs without changing anything in the Val Town UI (optional)

    • Set to true to disable crons (they will exit early with a log message)
    • Not set or false = crons run normally (default)
    • Useful for debugging or when using webhooks exclusively
    • Note: Crons still execute in Val Town (and use compute); they just exit immediately
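
As a rough illustration of how the settings above could be interpreted, here is a minimal TypeScript sketch. The function names (resolveLookbackHours, isBlockStable, isAuthorized, cronsDisabled) are hypothetical and not taken from the project. Two behaviors are assumptions rather than documented facts: invalid numeric values are ignored in favor of the next source, and the legacy API_KEY is accepted when NOTION_WEBHOOK_SECRET is unset.

```typescript
type Env = Record<string, string | undefined>;
type Trigger = "cron" | "webhook";

// Parse a strictly positive integer; return null when missing or invalid.
function parsePositiveInt(raw: string | null | undefined): number | null {
  if (raw == null) return null;
  const trimmed = raw.trim();
  return /^[1-9]\d*$/.test(trimmed) ? Number(trimmed) : null;
}

// ?hours=X wins over RECENT_PAGES_LOOKBACK_HOURS, which wins over the 24h default.
// Assumption: invalid values are ignored rather than rejected with an error.
function resolveLookbackHours(env: Env, queryHours?: string | null): number {
  return parsePositiveInt(queryHours) ??
    parsePositiveInt(env["RECENT_PAGES_LOOKBACK_HOURS"]) ??
    24;
}

// Cron runs skip blocks edited inside the stability window;
// webhook/button triggers always bypass the delay.
function isBlockStable(
  lastEdited: Date,
  now: Date,
  stabilityMinutes: number,
  trigger: Trigger,
): boolean {
  if (trigger === "webhook") return true;
  const ageMinutes = (now.getTime() - lastEdited.getTime()) / 60000;
  return ageMinutes >= stabilityMinutes;
}

// /api/health stays public; every other path needs a matching X-API-KEY
// header unless no secret is configured (development mode).
function isAuthorized(env: Env, path: string, apiKeyHeader: string | null): boolean {
  if (path === "/api/health") return true;
  // Falling back to the legacy API_KEY is an assumption.
  const secret = env["NOTION_WEBHOOK_SECRET"] || env["API_KEY"];
  if (!secret) return true;
  return apiKeyHeader === secret;
}

// Crons still start, but should exit early when this returns true.
function cronsDisabled(env: Env): boolean {
  return env["CRONS_DISABLED"] === "true";
}
```

With BLOCK_STABILITY_MINUTES=5, isBlockStable returns false for a cron run on a block edited 4 minutes ago and true for one edited 6 minutes ago, matching the example above. A production check would likely compare secrets in constant time rather than with ===.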

Search Modes

This system supports two mutually exclusive search modes. Choose the mode that best fits your workflow.

Mode 1: Keyword Search (default)

When to use: You want to search for specific text in blocks (e.g., "todo", "zinger", emojis).

Configuration:

SEARCH_KEYWORDS=todo,zinger,🍋

How it works:

  • Searches block text content for keywords
  • Matches text keywords with word boundaries (case-insensitive)
  • Matches emojis with exact match
  • Example: Block containing "Buy groceries todo @John October 30"

Use case: Flexible text-based search across any block type that contains your keywords.
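
The matching rules above could be sketched as follows. This is an illustration only: the function names are hypothetical, and the rule used to tell a "text" keyword from an emoji (a keyword that starts and ends with a word character is treated as text) is an assumption, not the project's documented behavior.

```typescript
// Escape regex metacharacters so a keyword is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Split a SEARCH_KEYWORDS value such as "todo,zinger,🍋" into clean keywords.
function parseKeywords(raw: string): string[] {
  return raw.split(",").map((k) => k.trim()).filter((k) => k.length > 0);
}

function matchesKeyword(text: string, keyword: string): boolean {
  // Assumption: keywords that start and end with a word character are text
  // keywords (word-boundary, case-insensitive); anything else, such as an
  // emoji, must appear exactly as written.
  if (/^\w(?:.*\w)?$/.test(keyword)) {
    return new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i").test(text);
  }
  return text.includes(keyword);
}

// Return every configured keyword found in the block text.
function findMatchingKeywords(text: string, rawKeywords: string): string[] {
  return parseKeywords(rawKeywords).filter((k) => matchesKeyword(text, k));
}
```

Under these rules, "Buy groceries todo @John October 30" matches the keyword todo, while "todos later" does not, because the word boundary after "todo" is missing.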

Mode 2: Block Type Search

When to use: You want to use Notion's native block types (especially checkboxes) as task markers.

Configuration:

SEARCH_BLOCK_TYPE=to_do

How it works:

  • Searches by Notion block type (not text content)
  • Matches to_do blocks (Notion checkboxes)
  • Works with both checked and unchecked checkboxes
  • No keyword required in text
  • Example: Any checkbox block with @John and October 30

Use case:

  1. Create a checkbox in Notion (makes it a to_do block)
  2. Add @person mention
  3. Add date
  4. Done! Automatically syncs to database (no need to type "todo")

Other block types: You can also search for paragraph, bulleted_list_item, numbered_list_item, etc.

Mode Priority

If both env vars are set: SEARCH_BLOCK_TYPE takes precedence

  • System will use block type mode
  • SEARCH_KEYWORDS will be ignored
  • A warning will be logged

If neither is set: Defaults to keyword mode with keyword "todo"
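
A minimal sketch of this precedence rule, assuming the environment is available as a plain key/value object. The SearchMode type and the resolveSearchMode name are hypothetical; the warning text is illustrative.

```typescript
type SearchMode =
  | { mode: "block_type"; blockType: string }
  | { mode: "keyword"; keywords: string[] };

function resolveSearchMode(
  env: Record<string, string | undefined>,
  warn: (message: string) => void = console.warn,
): SearchMode {
  const blockType = (env["SEARCH_BLOCK_TYPE"] ?? "").trim();
  const rawKeywords = (env["SEARCH_KEYWORDS"] ?? "").trim();

  if (blockType !== "") {
    // Block type mode wins; keywords are ignored with a warning.
    if (rawKeywords !== "") {
      warn("SEARCH_BLOCK_TYPE and SEARCH_KEYWORDS are both set; ignoring SEARCH_KEYWORDS");
    }
    return { mode: "block_type", blockType };
  }

  const keywords = rawKeywords
    .split(",")
    .map((k) => k.trim())
    .filter((k) => k.length > 0);

  // Neither variable set: keyword mode with the default keyword "todo".
  return { mode: "keyword", keywords: keywords.length > 0 ? keywords : ["todo"] };
}
```

Returning a discriminated union keeps the two modes mutually exclusive at the type level, so downstream code cannot accidentally read keywords in block type mode.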

Enrichment (applies to both modes)

Regardless of search mode, all matched blocks are enriched with:

  • ⚠️ date_mention - If missing, auto-assigned based on DEFAULT_DUE_DATE (default: today)
  • author - Block creator is captured for potential use in AI owner resolution

All matched blocks are saved to blob storage. Owner is determined during sync using a deterministic cascade (see Context Gathering below).

Context Gathering & Assignment Logic

This section explains how context is gathered from todos and used to populate database fields. The goal is to make "magical" assignments predictable and debuggable.

Code reference: backend/utils/contextUtils.ts contains the core logic and detailed documentation.

Two-Stage Processing

| Stage | When | What Happens | Files |
| --- | --- | --- | --- |
| Search | When pages are scanned | Extract block data + surrounding context | todoController.ts |
| Sync | When blobs sync to DB | Resolve relations, owner, generate summary | todoSaveController.ts |

Context Sources

1. Block-Level (Inherent Metadata)

Extracted directly from the Notion block:

| Context | Description | Example |
| --- | --- | --- |
| todo_string | Block text content | "Email client about proposal" |
| people_mentions | @mentions with user IDs | @Jane Smith |
| date_mentions | Dates after "due"/"by" | "due Friday" → 2024-01-12 |
| link_mentions | Page links | [[Project Alpha]] |
| author | Block's created_by | Who wrote the todo |

2. Surrounding Context (Captured at Search Time)

Context from the block's environment (no extra API calls):

| Context | Description | How Captured |
| --- | --- | --- |
| preceding_heading | Last h1/h2/h3 before this block | Tracked while scanning page |
| parent_id | Parent block ID | From block's parent field |
| source_page_database_id | Which database the page is in | From page's parent field |
| page_url | URL of containing page | From page object |

3. Parent Traversal (At Sync Time)

When block-level context is insufficient, we crawl up the block tree:

  • What: Check parent blocks for @mentions, page links, names
  • Depth: Up to 5 levels of parents
  • Used for: Project matching, fuzzy contact matching, owner candidates
  • API calls: 1 per parent level traversed

4. AI Involvement

AI is used sparingly and only when deterministic methods fail:

| Use Case | When AI is Called | Can Be Rejected? |
| --- | --- | --- |
| Project matching | No @mention or contextual match | No (semantic similarity) |
| Disambiguation | Multiple candidates match | No (picks best) |
| Summary generation | Always (polishes text) | N/A |
| Owner suggestion | Always (part of summary) | Yes - validated against actual context |

Owner Resolution Flow

┌─────────────────────────────────────────────────────────────────┐
│ Step 1: Does heading match someone in owner database?           │
│         (e.g., "### Alex" matches "Alex Johnson" in Contacts)   │
├─────────────────────────────────────────────────────────────────┤
│ YES → Use heading match (deterministic, most reliable)          │
│ NO  → Continue to Step 2                                        │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 2: Are there ANY owner candidates?                         │
│         - Heading matched? (from Step 1)                        │
│         - Contacts matched via fuzzy matching?                  │
│         - @mentions in block?                                   │
├─────────────────────────────────────────────────────────────────┤
│ NO  → Check parent blocks for @mentions or headings             │
│ YES → Continue to Step 3                                        │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 3: Still no candidates after checking parents?             │
├─────────────────────────────────────────────────────────────────┤
│ NO candidates anywhere → Owner = null (prevents hallucination)  │
│ Candidates exist → Continue to Step 4                           │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 4: AI generates summary and suggests owner                 │
├─────────────────────────────────────────────────────────────────┤
│ AI claims "heading" source → Verify name appears in heading     │
│ AI claims "contact" source → Verify name is in matched contacts │
│ AI claims "mention" source → Verify name is in @mentions        │
│                                                                 │
│ Validation FAILS → Reject AI owner, owner = null                │
│ Validation PASSES → Resolve name to user/page ID                │
└─────────────────────────────────────────────────────────────────┘

Why This Design?

  1. Deterministic when possible: Heading matches are resolved by direct lookup against the owner database, with no AI guessing
  2. AI as fallback, not authority: AI suggests, we validate against actual context
  3. No hallucination: If no real candidates exist, owner stays null
  4. Debuggable: Clear priority order makes it easy to trace why a field was assigned
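
The "AI suggests, we validate" rule from Steps 3 and 4 can be sketched as a pure function. Everything here is illustrative: the type and function names are hypothetical, and the name comparison (case-insensitive, either string containing the other) is an assumption about how "name appears in" might be checked.

```typescript
type OwnerSource = "heading" | "contact" | "mention";

interface AiOwnerSuggestion {
  name: string;
  source: OwnerSource;
}

interface OwnerContext {
  matchedHeading: string | null; // heading that matched the owner database, if any
  matchedContacts: string[];     // names found via fuzzy matching
  mentions: string[];            // display names of @mentions
}

// Assumed comparison: true when one non-empty name contains the other.
function namesOverlap(a: string, b: string): boolean {
  const x = a.trim().toLowerCase();
  const y = b.trim().toLowerCase();
  return x !== "" && y !== "" && (x.includes(y) || y.includes(x));
}

function hasOwnerCandidates(ctx: OwnerContext): boolean {
  return ctx.matchedHeading !== null ||
    ctx.matchedContacts.length > 0 ||
    ctx.mentions.length > 0;
}

// Returns the validated owner name, or null when the suggestion is rejected.
function validateAiOwner(
  suggestion: AiOwnerSuggestion | null,
  ctx: OwnerContext,
): string | null {
  // Step 3: with no real candidates, the owner stays null.
  if (suggestion === null || !hasOwnerCandidates(ctx)) return null;

  const name = suggestion.name;
  // Step 4: check the claim against the source the AI says it used.
  const evidence = suggestion.source === "heading"
    ? (ctx.matchedHeading !== null ? [ctx.matchedHeading] : [])
    : suggestion.source === "contact"
    ? ctx.matchedContacts
    : ctx.mentions;

  return evidence.some((e) => namesOverlap(name, e)) ? name : null;
}
```

A validated name is still only a search term; resolving it to a user or page ID is a separate step.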

Example Scenarios

Scenario 1: Heading determines owner

### Alex                          ← Heading matches "Alex Johnson" in owner DB
[] Connect with Taylor about docs ← "Taylor" appears but is task TARGET, not owner

Result: Owner = Alex Johnson (from heading, deterministic)

Scenario 2: @mention determines owner

[] @Jane should review the PR     ← @mention with user ID

Result: Owner = Jane (from @mention)

Scenario 3: No owner candidates

[] Update the documentation       ← No heading, no contacts, no @mentions

Result: Owner = null (prevents AI from inventing someone)

Scenario 4: Parent block provides context

Parent block: @Team Lead please assign these:
  └─ [] Task one                  ← No direct owner info
     └─ [] Task two

Result: Owner = Team Lead (from parent @mention)

Relation Mapping

The system automatically discovers all relation properties in your Todos database and maps @mentions to the appropriate relations.

How it works:

  1. On each sync, the system reads your Todos database schema from the Notion API
  2. All relation properties are discovered automatically (no configuration needed)
  3. For each relation, the system applies mention and contextual mapping (Strategies 1-2)
  4. The relation configured in TODOS_PROPERTIES.projects also gets AI matching (Strategies 4-5)

Example: If your Todos database has relations to "Projects db", "Clients", and "Tags":

  • All three are auto-discovered
  • @mentions in todos are mapped to the correct relation based on the mentioned page's parent database
  • Only "Projects db" (if configured) gets AI fuzzy matching
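
To make the parent-database rule concrete, here is a small sketch of routing page @mentions to relations. The types and the mapMentionsToRelations name are hypothetical, and the sketch assumes the schema lookup has already produced each relation's target database ID.

```typescript
interface RelationProperty {
  name: string;             // e.g. "Projects db"
  targetDatabaseId: string; // database the relation points to
}

interface PageMention {
  pageId: string;
  parentDatabaseId: string | null; // null when the page is not in a database
}

// Notion IDs may appear with or without dashes; compare a canonical form.
function normalizeId(id: string): string {
  return id.replace(/-/g, "").toLowerCase();
}

// Group mentioned page IDs by the relation whose target database contains them.
function mapMentionsToRelations(
  mentions: PageMention[],
  relations: RelationProperty[],
): Map<string, string[]> {
  const relationByDatabase = new Map<string, string>();
  for (const relation of relations) {
    relationByDatabase.set(normalizeId(relation.targetDatabaseId), relation.name);
  }

  const result = new Map<string, string[]>();
  for (const mention of mentions) {
    if (mention.parentDatabaseId === null) continue;
    const relationName = relationByDatabase.get(normalizeId(mention.parentDatabaseId));
    if (relationName === undefined) continue; // no relation targets this database
    const pageIds = result.get(relationName) ?? [];
    if (!pageIds.includes(mention.pageId)) pageIds.push(mention.pageId);
    result.set(relationName, pageIds);
  }
  return result;
}
```

Mentions of pages outside any related database are dropped here. If two relations target the same database, this sketch keeps only the last one.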

Fuzzy Name Matching

For configured relations, the system can automatically link todos to pages based on fuzzy name matching. This works when the todo text or parent blocks contain a name (plain text, not @mentions) that matches a page in the relation's target database.

How It Works

  1. Configure relations for fuzzy matching in notion.config.ts:

    export const TODOS_PROPERTIES = {
      // ... other properties
      fuzzyMatch: ["Contacts", "Companies"], // Relations to enable fuzzy matching
    };

  2. During sync, for each configured relation:

    • Load all page names from the relation's target database
    • Check the todo text AND preceding heading for name matches (case-insensitive substring)
    • If no match in todo or heading, check parent blocks (up to 5 levels)
    • Single match → link automatically
    • Multiple matches → AI disambiguation

Example

### Bryce                        ← Preceding heading contains "Bryce"
[] do something good and fine    ← Todo text has no name
→ Matches "Bryce Taylor" in Contacts via preceding heading context

How preceding headings work:

  • When searching pages, the system tracks the most recent heading (h1/h2/h3) seen
  • Each todo block stores its preceding_heading in blob storage
  • During fuzzy matching, heading text is combined with todo text for name search
  • Zero additional API calls - heading is captured when blocks are already being read
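
The heading-plus-todo search could look roughly like the sketch below. The exact matching rule is not spelled out above, so this uses one plausible reading and labels it as an assumption: a page matches when its full name appears in the combined text, or when any word of its name (three or more characters) appears there as a whole word. All names are hypothetical, and tokenization is ASCII-only for brevity.

```typescript
interface NamedPage {
  id: string;
  name: string;
}

type FuzzyResult =
  | { kind: "none" }
  | { kind: "link"; page: NamedPage }
  | { kind: "needs_ai"; candidates: NamedPage[] };

// Lowercase words of at least 3 characters (ASCII-only for brevity).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length >= 3);
}

function findNameMatches(
  todoText: string,
  precedingHeading: string | null,
  pages: NamedPage[],
): NamedPage[] {
  // Heading text is combined with todo text, as described above.
  const haystack = `${precedingHeading ?? ""} ${todoText}`.toLowerCase();
  const words = new Set(tokenize(haystack));
  return pages.filter((page) => {
    const fullName = page.name.trim().toLowerCase();
    if (fullName === "") return false;
    if (haystack.includes(fullName)) return true;
    // Assumed rule: a shared name word (e.g. a first name) counts as a match.
    return tokenize(fullName).some((w) => words.has(w));
  });
}

function decide(matches: NamedPage[]): FuzzyResult {
  const [first] = matches;
  if (first === undefined) return { kind: "none" };
  if (matches.length === 1) return { kind: "link", page: first };
  return { kind: "needs_ai", candidates: matches };
}
```

With the example above, the heading "Bryce" yields a single match on "Bryce Taylor", so the todo is linked without an AI call. Matching on single words is deliberately loose; common words in page names would need a stop list in practice.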

Priority Order

Fuzzy name matching runs after explicit @mentions and before AI project matching:

  1. Strategy 1: Explicit @mentions - Direct page links in todo text (WINS)
  2. Strategy 2: Contextual mapping - Source page is in target database
  3. Strategy 3: Fuzzy name matching - Name found in todo text, preceding heading, or parent blocks
  4. Strategies 4-5: AI matching - Projects relation only (see Project Matching)

If a relation is already matched by @mention, fuzzy matching is skipped for that relation.

AI Disambiguation

When multiple names match (e.g., "John Smith" and "John Doe" both contain "john"), the system uses AI (gpt-4o-mini) to pick the best match based on the todo context.

Graceful Degradation

If a configured relation doesn't exist in your database schema:

  • During sync: The relation is silently skipped (sync continues normally)
  • In health check: Shows a warning so you can update your config

Health Check Status

The /api/health endpoint shows fuzzy match configuration status:

{
  "fuzzy_match": {
    "configured": ["Contacts", "Companies"],
    "status": [
      { "propertyName": "Contacts", "found": true, "pageCount": 45 },
      { "propertyName": "OldRelation", "found": false }
    ]
  }
}

  • found: true - Relation exists and is active
  • found: false - Relation not found in database schema (remove from config)
  • pageCount - Number of pages in the target database available for matching

Owner Resolution

The system automatically determines task ownership from context and resolves it to the correct Notion property format.

Property Type Detection

The Owner property type is detected at runtime from your Todos database schema:

  • People type: Stores Notion workspace user references
  • Relation type: Links to pages in another database (e.g., Contacts)

How Owner is Determined

When generating the AI summary, the system extracts an owner name from context using this priority:

  1. Heading - Preceding heading contains a person's name (e.g., "### Alex")
  2. Contact - Matched contact from fuzzy name matching
  3. Mention - First @mention in the todo text

Resolution by Property Type

If Owner is a People property:

The owner name is matched to a Notion workspace user:

  1. Check @mentions in the todo (already have user IDs)
  2. Check if the block author's name matches (for heading-based ownership)
  3. Search workspace users by name (supports first-name matching)

Example: Owner name "Alex Johnson" matches workspace user "Alex @ Acme Corp" via first-name matching ("Alex" = "Alex").
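
A minimal sketch of step 3 (name search with first-name matching), assuming the workspace users have already been fetched. The WorkspaceUser shape and findUserByName are hypothetical, and returning null when several users share a first name is an assumption, not documented behavior.

```typescript
interface WorkspaceUser {
  id: string;
  name: string;
}

// First whitespace-separated word, lowercased ("Alex @ Acme Corp" -> "alex").
function firstWord(value: string): string {
  return value.trim().toLowerCase().split(/\s+/)[0] ?? "";
}

function findUserByName(ownerName: string, users: WorkspaceUser[]): WorkspaceUser | null {
  const target = ownerName.trim().toLowerCase();
  if (target === "") return null;

  // Prefer an exact full-name match.
  const exact = users.find((u) => u.name.trim().toLowerCase() === target);
  if (exact !== undefined) return exact;

  // Fall back to first-name matching, but only when it is unambiguous.
  const first = firstWord(target);
  const byFirstName = users.filter((u) => firstWord(u.name) === first);
  return byFirstName.length === 1 ? (byFirstName[0] ?? null) : null;
}
```

Guests never appear in the user list passed to this function, which is why a People-type Owner cannot resolve them by name.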

If Owner is a Relation property:

The owner name is matched to a page in the relation's target database using fuzzy name matching with AI disambiguation if multiple pages match.

Key Insight

The AI's owner name is a search term, not a final value. The same name resolves differently based on property type:

  • "Jane Smith" → Workspace user "Jane" (if Owner is people type)
  • "Jane Smith" → Contacts page "Jane Smith" (if Owner is relation type)

Notion Workspace Members vs Guests

Notion distinguishes between two types of users:

  • Members: Full workspace users with complete access (paid seats)
  • Guests: External collaborators with limited access to specific pages

API Limitation: The Notion API's users.list() endpoint only returns members, not guests. This has important implications:

| Property Type | Can match members? | Can match guests? |
| --- | --- | --- |
| People | ✅ Yes | ❌ No |
| Relation | ✅ Yes (via fuzzy) | ✅ Yes (via fuzzy) |

If you use a People type for Owner and the owner is a guest user (e.g., an external client), the system cannot find them by name and the Owner field will remain empty.

Choosing Between People and Relation Types

Use People type when:

  • All task owners are full workspace members
  • You want native Notion user avatars and @mention integration
  • Your team is internal-only

Use Relation type when:

  • Task owners include external contacts, clients, or guests
  • You have a Contacts/People database you want to link to
  • You need to track owners who aren't Notion users
  • You want owner matching to work regardless of workspace membership

Recommendation: If you're unsure, use Relation type. It's more flexible and works with both workspace members and external contacts. Create a Contacts database with names, and configure fuzzy matching to link todos automatically.

Example Scenarios

Scenario A: Internal team todos

Owner property: People type
Team members: All are workspace members
Result: ✅ Works perfectly - names matched to workspace users

Scenario B: Client-facing project todos

Owner property: People type
Owners include: External client (guest user)
Result: ❌ Guest not found - guests not returned by users.list()

Fix: Change Owner to Relation type pointing to Contacts database
Result: ✅ Client name matched to Contacts page via fuzzy matching

Scenario C: Mixed internal + external

Owner property: Relation type → Contacts database
Contacts database: Contains both team members and clients
Result: ✅ All names matched via fuzzy matching, regardless of workspace status

Project Matching

When syncing todos to the database, the system automatically links them to related projects using a cascade of matching strategies. Each strategy is tried in order until a match is found.

Note: Strategies 1-2 apply to ALL auto-discovered relations. Strategy 3 applies to relations in TODOS_PROPERTIES.fuzzyMatch. Strategies 4-5 only apply to the relation named in TODOS_PROPERTIES.projects.

Strategy 1: Link Mentions

If the todo text contains a link to a page that exists in your Projects database, the todo is linked to that project.

Example: A todo containing [[Project Alpha]] (a page mention) will be linked to "Project Alpha" if it exists in the Projects database.

Strategy 2: Source Page

If the todo block appears on a page that is itself a project (exists in the Projects database), the todo is linked to that project.

Example: A todo on the "Project Beta" page will be linked to "Project Beta" automatically.

Strategy 3: Fuzzy Name Matching

For relations listed in TODOS_PROPERTIES.fuzzyMatch, plain-text names in the todo text, preceding heading, or parent blocks are matched against page names in the relation's target database. See the Fuzzy Name Matching section for details.

Strategy 4: AI Fuzzy Matching with Date Disambiguation

If strategies 1-3 don't find a match, the system uses OpenAI (via Val Town's @std/openai) to match the todo text against project names and client names.

Initial AI Match:

  • Sends the todo text and list of projects (with client names) to OpenAI (gpt-4o-mini)
  • AI returns one of:
    • A specific project ID (if confident match to a project name)
    • CLIENT:ClientName (if matches a client but not a specific project)
    • NONE (no match)
  • Conservative matching - only links when confident
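
The three possible answers can be parsed into a tagged result before any linking happens. This is a sketch under assumptions: parseAiMatch and the AiMatch type are hypothetical, and checking a returned ID against the known project list is an added safeguard that the text above does not confirm.

```typescript
type AiMatch =
  | { kind: "project"; projectId: string }
  | { kind: "client"; clientName: string }
  | { kind: "none" };

const CLIENT_PREFIX = "CLIENT:";

function parseAiMatch(raw: string, knownProjectIds: ReadonlySet<string>): AiMatch {
  const answer = raw.trim();

  if (answer.toUpperCase().startsWith(CLIENT_PREFIX)) {
    const clientName = answer.slice(CLIENT_PREFIX.length).trim();
    return clientName !== "" ? { kind: "client", clientName } : { kind: "none" };
  }

  // Assumption: only IDs that were actually sent to the model are accepted,
  // so an invented ID degrades to "no match".
  if (knownProjectIds.has(answer)) return { kind: "project", projectId: answer };

  // Covers the literal NONE as well as anything unrecognized.
  return { kind: "none" };
}
```

Treating every unrecognized answer as "none" keeps the matching conservative: a malformed response never produces a link.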

Date-Based Disambiguation:

When the AI returns a client match (or picks a specific project that has sibling projects for the same client), the system disambiguates using the todo's due date:

  1. Single date match: If the due date falls within exactly one project's date range → use that project
  2. Multiple overlapping dates: If the due date falls within multiple projects' date ranges → second AI call to pick the best semantic fit
  3. No date match: If no project contains the due date → pick project with closest start/end date boundary
  4. No dates on projects: Fall back to most recently edited project

Example - AI picks client, date disambiguates:

Todo: "Review Acme contract for @John due Dec 15"
AI returns: CLIENT:Acme

Projects for Acme:
- "Acme Website Redesign" (Nov 1 - Nov 30) ❌ Dec 15 not in range
- "Acme Q4 Campaign" (Dec 1 - Dec 31) ✅ Dec 15 in range

Result: Linked to "Acme Q4 Campaign"

Example - Overlapping dates, second AI call:

Todo: "this should not go to mission/vision due Nov 28"
AI returns: Dealfront Mission/Vision/Purpose (specific project)

Projects for Dealfront:
- "Dealfront Mission/Vision/Purpose" (Nov 28 - Nov 28) ✅ Nov 28 in range
- "Dealfront Roadmap Strategies" (Nov 18 - Dec 6) ✅ Nov 28 in range

Both overlap! Second AI call with just these 2 candidates:
AI picks: "Dealfront Roadmap Strategies" (better semantic fit based on todo text)

Result: Linked to "Dealfront Roadmap Strategies"
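
The four date rules can be expressed as a pure function that either picks a project or reports that a second AI call is needed. This is a sketch, not the project's implementation: the Project shape and function names are hypothetical, dates are assumed to be ISO strings, and a project with only one of start/end is treated as a single-day range.

```typescript
interface Project {
  id: string;
  start: string | null; // ISO date, e.g. "2024-12-01"
  end: string | null;
  lastEdited: string;   // ISO timestamp
}

type DateDecision =
  | { kind: "single"; project: Project }
  | { kind: "needs_ai"; candidates: Project[] }
  | { kind: "closest"; project: Project }
  | { kind: "most_recent"; project: Project }
  | { kind: "none" };

interface DatedProject { project: Project; start: number; end: number }

// Compare whole days in UTC so time zones do not shift range boundaries.
const toDay = (iso: string): number => Date.parse(`${iso.slice(0, 10)}T00:00:00Z`);

function withRange(p: Project): DatedProject | null {
  const s = p.start ?? p.end;
  const e = p.end ?? p.start;
  if (s === null || e === null) return null;
  return { project: p, start: toDay(s), end: toDay(e) };
}

function disambiguateByDate(dueDate: string, projects: Project[]): DateDecision {
  if (projects.length === 0) return { kind: "none" };
  const due = toDay(dueDate);

  const dated: DatedProject[] = [];
  for (const p of projects) {
    const d = withRange(p);
    if (d !== null) dated.push(d);
  }

  // Rule 4: no dates on any project, so use the most recently edited one.
  if (dated.length === 0) {
    const newest = projects.reduce((a, b) =>
      Date.parse(b.lastEdited) > Date.parse(a.lastEdited) ? b : a
    );
    return { kind: "most_recent", project: newest };
  }

  const containing = dated.filter((d) => d.start <= due && due <= d.end);
  const [only] = containing;
  // Rule 1: exactly one range contains the due date.
  if (containing.length === 1 && only !== undefined) {
    return { kind: "single", project: only.project };
  }
  // Rule 2: several ranges overlap, so the caller makes a second AI call.
  if (containing.length > 1) {
    return { kind: "needs_ai", candidates: containing.map((d) => d.project) };
  }
  // Rule 3: nothing contains the date, so pick the closest start/end boundary.
  const distance = (d: DatedProject): number =>
    Math.min(Math.abs(due - d.start), Math.abs(due - d.end));
  const closest = dated.reduce((a, b) => (distance(b) < distance(a) ? b : a));
  return { kind: "closest", project: closest.project };
}
```

Applied to the Acme example (with a year filled in, since the example omits one), a due date of Dec 15 yields a "single" decision for the Q4 Campaign, while the Dealfront example yields "needs_ai" with both projects as candidates.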

Strategy 5: Parent Block Traversal

If strategies 1-4 don't find a match and the todo is nested under a parent block, the system traverses up the block tree and applies strategies 1-4 to each ancestor.

How it works:

  • Fetches the parent block from Notion
  • Applies strategies 1-4 to the parent's content (using the todo's due date for disambiguation)
  • If no match, moves to grandparent, up to 5 levels
  • Stops when a match is found or reaches page level

Example: A todo nested under a toggle "Project Gamma Tasks" might match to "Project Gamma" via the parent toggle's text.
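
The traversal loop can be sketched as below. To keep the example self-contained, the block lookup and the matching step are injected as plain synchronous functions; in the real system each lookup would be a Notion API call. BlockNode, matchViaParents, and the callback signatures are hypothetical.

```typescript
interface BlockNode {
  id: string;
  text: string;
  parentId: string | null; // null once the page level is reached
}

const MAX_PARENT_DEPTH = 5;

function matchViaParents(
  todo: BlockNode,
  getBlock: (id: string) => BlockNode | null,    // stand-in for an API fetch
  matchText: (text: string) => string | null,    // stand-in for strategies 1-4
): { projectId: string; depth: number } | null {
  let parentId = todo.parentId;

  for (let depth = 1; depth <= MAX_PARENT_DEPTH && parentId !== null; depth++) {
    const parent = getBlock(parentId);
    if (parent === null) return null; // parent missing or not a block

    const projectId = matchText(parent.text);
    if (projectId !== null) return { projectId, depth }; // stop at first match

    parentId = parent.parentId; // move up to the grandparent
  }
  return null; // page level reached or depth limit hit
}
```

Returning the depth alongside the match makes it easy to log which ancestor supplied the project, which helps when debugging unexpected links.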

Configuration

Project matching is configured in notion.config.ts, not environment variables.

Step 1: Add a relation property to your Todos database

  • Create a relation property (e.g., "Projects db") pointing to your Projects database
  • The system auto-discovers all relation properties from the database schema

Step 2: Configure notion.config.ts

// TODOS_PROPERTIES - declare which relation gets AI treatment
export const TODOS_PROPERTIES = {
  // ... other properties
  projects: "Projects db", // Name of your relation property (null to disable)
  fuzzyMatch: ["Contacts"], // Relations for fuzzy name matching (see Fuzzy Name Matching section)
};

// PROJECTS_PROPERTIES - configure the target database for disambiguation
export const PROJECTS_PROPERTIES = {
  groupBy: "Clients", // Relation to group projects (for AI matching)
  dateStart: "Dates", // Date property for date-based disambiguation
  dateEnd: "Dates", // Same property if using date ranges
};

Your Projects database should have:

  • Clients relation (or similar) - Groups projects for disambiguation
  • Dates property (date with start/end) - Enables date-based disambiguation

How relation mapping works:

  1. All relation properties in your Todos database are auto-discovered
  2. Basic relations get mention/contextual mapping automatically (Strategies 1-2)
  3. Relations in fuzzyMatch array also get fuzzy name matching (Strategy 3)
  4. The relation named in TODOS_PROPERTIES.projects also gets AI matching (Strategies 4-5)

OpenAI Integration:

  • Uses Val Town's built-in OpenAI integration (@std/openai)
  • Model: gpt-4o-mini (fast, cost-effective)
  • AI calls per todo (when needed):
    1. Fuzzy match disambiguation (when multiple names match - Strategy 3)
    2. Project AI match (when strategies 1-3 fail - Strategy 4)
    3. Date range disambiguation (when multiple projects overlap - Strategy 4)

Getting Started

Prerequisites

  1. Create a Notion integration at https://www.notion.so/my-integrations
  2. Create a Notion database with these properties:
    • Name (title)
    • Block ID (rich_text)
    • Block URL (url)
    • Page URL (url) - source page where todo was found
    • Todo last edited time (date)
    • Owner (people, or relation - see Owner Resolution)
    • Other people (people)
    • Due date (date)
    • Links (rich_text)
    • Projects db (relation) - optional, auto-discovered for relation mapping
  3. Share the database with your integration

Setup

  1. Fork this val in Val Town
  2. Set environment variables:
    • NOTION_API_KEY = your integration token
    • TODOS_DB_ID = your database ID
    • Choose a search mode:
      • Keyword mode: SEARCH_KEYWORDS = todo (or your preferred keywords, comma-separated)
      • Block type mode: SEARCH_BLOCK_TYPE = to_do (or your preferred block type)
  3. (Optional) Configure notion.config.ts for project AI matching:
    • Set TODOS_PROPERTIES.projects to your relation property name
    • Configure PROJECTS_PROPERTIES for disambiguation (see Project Matching)
  4. Test with: POST /tasks/todos?hours=1

Usage Examples

Keyword Mode Examples

Find and sync all keyword matches from last 24 hours:

# Set: SEARCH_KEYWORDS=todo
curl -X POST https://your-val.express/tasks/todos

Custom time window:

# Set: SEARCH_KEYWORDS=todo
curl -X POST "https://your-val.express/tasks/todos?hours=48"

Search for multiple keywords:

# Set: SEARCH_KEYWORDS=todo,zinger,🍋
curl -X POST https://your-val.express/tasks/todos

Block Type Mode Examples

Find and sync all checkbox todos from last 24 hours:

# Set: SEARCH_BLOCK_TYPE=to_do
curl -X POST https://your-val.express/tasks/todos

Find all bullet points with people + dates:

# Set: SEARCH_BLOCK_TYPE=bulleted_list_item
curl -X POST https://your-val.express/tasks/todos

Other Examples

Get recent pages (API):

curl "https://your-val.express/api/pages/recent?hours=12"

Development Guidelines

  • For project-specific architecture: See CLAUDE.md
  • For Val Town platform guidelines: See AGENTS.md

Tech Stack

  • Runtime: Deno on Val Town
  • Framework: Hono (lightweight web framework)
  • Frontend: React 18.2.0 with Pico CSS (classless CSS framework)
  • APIs:
    • Notion API (@notionhq/client v2)
    • Val Town blob storage
  • Language: TypeScript

Architecture Diagrams

Complete System Flow

┌─────────────────────────────────────────────────────────────┐
│                     Notion Workspace                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                   │
│  │  Page A  │  │  Page B  │  │  Page C  │                   │
│  │  "todo"  │  │  "todo"  │  │          │                   │
│  └──────────┘  └──────────┘  └──────────┘                   │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
         ┌─────────────────────────────────────┐
         │  POST /tasks/todos?keyword=todo     │
         │  (Batch Search & Sync Endpoint)     │
         └─────────────────┬───────────────────┘
                           │
        ┌──────────────────┴──────────────────┐
        │                                     │
        ▼                                     ▼
┌───────────────────┐              ┌──────────────────────────┐
│  Step 1: Search   │              │  Step 3: Sync (Optimized)│
│                   │              │                          │
│ • Get recent pages│              │ • Read blobs             │
│ • Fetch all blocks│              │ • Skip if synced: true   │
│ • Search keywords │              │ • Use cached page ID     │
│ • Extract data    │              │ • Create/update pages    │
│ • Validate        │              │ • Mark synced: true      │
└─────────┬─────────┘              └──────────────────────────┘
          │                                   ▲
          ▼                                   │
┌─────────────────────┐                       │
│  Step 2: Store      │                       │
│                     │                       │
│ • Save to blobs     │───────────────────────┘
│ • Set synced: false │
│ • Compare timestamps│
│ • Preserve page ID  │
└─────────────────────┘
          │
          ▼
┌──────────────────────────────────────────────────────────────────┐
│                   Val Town Blob Storage                          │
│                                                                  │
│  demo--todo--block-1: { data, sync_metadata: {synced, page_id} } │
│  demo--todo--block-2: { data, sync_metadata: {synced, page_id} } │
│  demo--todo--block-3: { data, sync_metadata: {synced, page_id} } │
└──────────────────────────────────────────────────────────────────┘
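
The store and sync bookkeeping in the diagram above (set synced: false, compare timestamps, preserve page ID, skip if synced) can be sketched as three small pure functions. The sync_metadata keys come from the diagram; the fields inside data and all function names are assumptions made for illustration.

```typescript
interface SyncMetadata {
  synced: boolean;
  page_id: string | null; // cached Notion database page ID
}

interface TodoData {
  block_id: string;
  text: string;
  last_edited_time: string; // ISO timestamp (assumed field name)
}

interface TodoBlob {
  data: TodoData;
  sync_metadata: SyncMetadata;
}

// Step 2 (Store): decide what to write when a block is found again.
function mergeBlob(existing: TodoBlob | null, incoming: TodoData): TodoBlob {
  if (existing === null) {
    return { data: incoming, sync_metadata: { synced: false, page_id: null } };
  }
  const changed =
    Date.parse(incoming.last_edited_time) > Date.parse(existing.data.last_edited_time);
  if (!changed) return existing; // unchanged: keep synced state as-is
  // Changed: needs a re-sync, but keep the cached page ID so sync can update in place.
  return {
    data: incoming,
    sync_metadata: { synced: false, page_id: existing.sync_metadata.page_id },
  };
}

// Step 3 (Sync): already-synced blobs are skipped entirely.
function needsSync(blob: TodoBlob): boolean {
  return !blob.sync_metadata.synced;
}

// After a successful create/update, record the page ID and mark as synced.
function markSynced(blob: TodoBlob, pageId: string): TodoBlob {
  return { data: blob.data, sync_metadata: { synced: true, page_id: pageId } };
}
```

Because unchanged blocks keep synced: true, repeated searches over the same pages produce no extra sync work for them.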

MVC Layer Interaction

┌──────────────────────────────────────────────────────────────┐
│                       HTTP Request                           │
│  POST /tasks/todos?hours=24&keyword=todo                     │
└──────────────────────┬───────────────────────────────────────┘
                       │
                       ▼
         ┌─────────────────────────────┐
         │   ROUTE (todos.ts)          │
         │   • Extract query params    │
         │   • Call controller         │
         │   • Format HTTP response    │
         └─────────────┬───────────────┘
                       │
                       ▼
         ┌─────────────────────────────────────┐
         │   CONTROLLER (orchestration)        │
         │   • Validate inputs                 │
         │   • Orchestrate workflow:           │
         │     1. Get recent pages             │
         │     2. Search each page             │
         │     3. Sync to database             │
         │   • Return standardized result      │
         └──────────────┬──────────────────────┘
                        │
            ┌───────────┼───────────┐
            │           │           │
            ▼           ▼           ▼
    ┌───────────┐ ┌──────────┐ ┌────────────┐
    │ SERVICE:  │ │ SERVICE: │ │ SERVICE:   │
    │ pages.ts  │ │ blob.ts  │ │ database.ts│
    │           │ │          │ │            │
    │ • API call│ │ • Blob   │ │ • Query DB │
    │ • Parse   │ │   CRUD   │ │ • Create   │
    │ • Return  │ │ • Return │ │ • Update   │
    └─────┬─────┘ └────┬─────┘ └─────┬──────┘
          │            │             │
          ▼            ▼             ▼
    Notion API   Blob Storage   Notion API

License

MIT
