name: Issue Tracker Pipeline Refactor
overview: >
  Refactor the cron jobs into a composable pipeline architecture where
  sources, filters, and sinks are decoupled, allowing arbitrary combinations
  like "stalest issues with specific labels".
todos:
  - id: pipeline-types
    content: Create `backend/pipeline/types.ts` with core interfaces (GitHubIssue, IssueSource, IssueFilter, IssueSink, LayerConfig)
    status: pending
  - id: pipeline-runner
    content: Create `backend/pipeline/runner.ts` with the core runPipeline() function and dedup/send logic
    status: pending
  - id: source-all
    content: Create `backend/pipeline/sources/allSinceLastRun.ts` - extract from main.cron + third-layer
    status: pending
  - id: source-stalest
    content: Create `backend/pipeline/sources/stalestSnapshot.ts` - extract from only-oldest-issues
    status: pending
  - id: filter-label
    content: Create `backend/pipeline/filters/byLabel.ts` - extract from third-layer lines 27-67
    status: pending
  - id: filter-staleness
    content: Create `backend/pipeline/filters/byStaleness.ts` - extract from only-oldest-issues lines 118-151
    status: pending
  - id: filter-age
    content: Create `backend/pipeline/filters/byCreatedAge.ts` - extract from main.cron lines 81-93
    status: pending
  - id: sink-single
    content: Create `backend/pipeline/sinks/singleGoal.ts`
    status: pending
  - id: sink-routed
    content: Create `backend/pipeline/sinks/labelRouted.ts`
    status: pending
  - id: migrate-main
    content: Migrate main.cron.tsx to use the pipeline
    status: pending
  - id: migrate-stalest
    content: Migrate only-oldest-issues.cron.tsx to use the pipeline
    status: pending
  - id: migrate-tagged
    content: Migrate third-layer.cron.tsx to use the pipeline
    status: pending

Pipeline Architecture for GitHub Issue Tracking

Problem

The three cron jobs duplicate the same core loop (fetch, filter, dedup, send, record) with variations in:

  • Source: How to determine "since when" to fetch
  • Filter: Which issues qualify (stalest, labeled, all)
  • Sink: Which Beeminder goal(s) to send to

Today, adding a new "stalest + labeled" layer would require manually merging logic from two separate files.

Proposed Architecture

[Mermaid diagram: Source → Filters (in order) → Sink, with dedup/send and state update handled by the runner]

Key Components

1. backend/pipeline/types.ts - Core Interfaces

```typescript
interface GitHubIssue {
  number: number;
  title: string;
  updated_at: string;
  created_at: string;
  labels?: Array<{ name: string } | string>;
}

interface IssueSource {
  name: string;
  getIssuesSinceLastRun(ctx: PipelineContext): AsyncIterable<GitHubIssue[]>;
  updateState(ctx: PipelineContext): Promise<void>;
}

interface IssueFilter {
  name: string;
  filter(issues: GitHubIssue[], ctx: PipelineContext): Promise<GitHubIssue[]>;
}

interface IssueSink {
  name: string;
  getGoalForIssue(issue: GitHubIssue): string | null;
}

interface LayerConfig {
  name: string;           // e.g., "tagged-stalest"
  source: IssueSource;
  filters: IssueFilter[]; // Applied in order
  sink: IssueSink;
}
```

2. backend/pipeline/sources/ - Issue Sources
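A minimal sketch of the source contract, assuming an injected fetch function and an in-memory state map; in the real code these would be the GitHub API client and `database/lastRunState.ts`:

```typescript
interface GitHubIssue {
  number: number;
  title: string;
  updated_at: string;
  created_at: string;
}

// Sketch of allSinceLastRun: `fetchSince` and the `state` map are illustrative
// stand-ins for the GitHub client and the persisted last-run state.
function allSinceLastRun(
  key: string,
  state: Map<string, string>,
  fetchSince: (sinceISO: string) => Promise<GitHubIssue[]>,
) {
  return {
    name: `allSinceLastRun(${key})`,
    // Yield issues updated since the last recorded run (epoch on first run).
    async *getIssuesSinceLastRun(): AsyncIterable<GitHubIssue[]> {
      const since = state.get(key) ?? "1970-01-01T00:00:00Z";
      yield await fetchSince(since);
    },
    // Record "now" so the next run only sees newer updates.
    async updateState(): Promise<void> {
      state.set(key, new Date().toISOString());
    },
  };
}
```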

3. backend/pipeline/filters/ - Composable Filters
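As a sketch of the filter contract, here is a possible `byCreatedAge`; the `minDays` default and the injected `now` clock (for testability) are assumptions, not the extracted main.cron logic:

```typescript
interface GitHubIssue {
  number: number;
  title: string;
  updated_at: string;
  created_at: string;
}

// Filter sketch: keep only issues created at least `minDays` ago.
function byCreatedAge(minDays = 30, now: () => number = Date.now) {
  return {
    name: `byCreatedAge(${minDays}d)`,
    async filter(issues: GitHubIssue[]): Promise<GitHubIssue[]> {
      const cutoff = now() - minDays * 24 * 60 * 60 * 1000;
      return issues.filter((i) => Date.parse(i.created_at) <= cutoff);
    },
  };
}
```

Because filters share one signature, composing them is just running them in sequence, as the runner does.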

4. backend/pipeline/sinks/ - Beeminder Destinations

  • singleGoal.ts - Always routes to one configured goal
  • labelRouted.ts - Routes based on label matching rules
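A sketch of `labelRouted`, assuming rules are simple label-to-goal pairs (the real rule shape may differ); returning `null` tells the runner to skip the issue:

```typescript
interface GitHubIssue {
  number: number;
  title: string;
  labels?: Array<{ name: string } | string>;
}

// Sink sketch: route each issue to the goal of the first matching label rule.
// GitHub may return labels as strings or objects, so normalize both forms.
function labelRouted(rules: Array<{ label: string; goal: string }>) {
  return {
    name: "labelRouted",
    getGoalForIssue(issue: GitHubIssue): string | null {
      const names = (issue.labels ?? []).map((l) =>
        typeof l === "string" ? l : l.name,
      );
      const rule = rules.find((r) => names.includes(r.label));
      return rule ? rule.goal : null;
    },
  };
}
```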

5. backend/pipeline/runner.ts - The Core Engine

This replaces the duplicated loop in all three cron files:

```typescript
export async function runPipeline(layer: LayerConfig): Promise<void> {
  const ctx = createContext(layer);

  // 1. Get issues from source
  const rawIssues = await collectIssues(layer.source, ctx);

  // 2. Apply filters in sequence
  let issues = rawIssues;
  for (const filter of layer.filters) {
    issues = await filter.filter(issues, ctx);
  }

  // 3. Dedup and send to sink
  for (const issue of issues) {
    const goal = layer.sink.getGoalForIssue(issue);
    if (!goal) continue;
    if (await alreadySentToday(issue.number, goal)) continue;
    await sendToBeeminder(issue, goal);
    await recordSent(issue.number, goal);
  }

  // 4. Update source state for next run
  await layer.source.updateState(ctx);
}
```
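The dedup step can be sketched with an in-memory store; the real `alreadySentToday`/`recordSent` would persist (issue, goal) pairs per day via `database/dailyTracking.ts`:

```typescript
// Hypothetical in-memory dedup store keyed on "issue:goal". A real version
// would persist per-day records so restarts don't cause duplicate datapoints.
const sentToday = new Set<string>();

function alreadySentToday(issueNumber: number, goal: string): boolean {
  return sentToday.has(`${issueNumber}:${goal}`);
}

function recordSent(issueNumber: number, goal: string): void {
  sentToday.add(`${issueNumber}:${goal}`);
}
```

Keying on the (issue, goal) pair rather than the issue alone matters once a label-routed sink can send the same issue to different goals on different days.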

6. Simplified Cron Files

Each cron file becomes a thin configuration layer:

```typescript
// main.cron.tsx
import { runPipeline } from "./backend/pipeline/runner.ts";
import { allSinceLastRun } from "./backend/pipeline/sources/allSinceLastRun.ts";
import { byCreatedAge } from "./backend/pipeline/filters/byCreatedAge.ts";
import { singleGoal } from "./backend/pipeline/sinks/singleGoal.ts";
// `config` comes from the app's existing configuration module

export default () =>
  runPipeline({
    name: "updatedIssues",
    source: allSinceLastRun("main"),
    filters: [byCreatedAge()],
    sink: singleGoal(config.beeminder.updatedIssues),
  });
```

```typescript
// fourth-layer.cron.tsx (the new "stalest + tagged" layer)
import { runPipeline } from "./backend/pipeline/runner.ts";
import { stalestSnapshot } from "./backend/pipeline/sources/stalestSnapshot.ts";
import { byStaleness } from "./backend/pipeline/filters/byStaleness.ts";
import { byLabel } from "./backend/pipeline/filters/byLabel.ts";
import { labelRouted } from "./backend/pipeline/sinks/labelRouted.ts";
// `taggedRules` is the label → goal routing config, defined alongside this cron

export default () =>
  runPipeline({
    name: "tagged-stalest",
    source: stalestSnapshot(),
    filters: [byStaleness(), byLabel(taggedRules)],
    sink: labelRouted(taggedRules),
  });
```

Migration Path

  1. Create the pipeline infrastructure without changing the existing crons
  2. Migrate one cron at a time (starting with main.cron.tsx, the simplest)
  3. Verify each migration against the existing behavior
  4. Add new layers by composing existing components
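One way to carry out the verification step is a shadow run: capture the (issue, goal) pairs each implementation would send and diff them. A sketch, assuming both runs can be driven without side effects:

```typescript
// Diff the (issue, goal) pairs produced by a legacy cron and its pipeline
// replacement; an empty result on both sides means the migration is faithful.
function diffSends(
  legacy: Array<[number, string]>,
  pipeline: Array<[number, string]>,
): { missing: string[]; extra: string[] } {
  const key = ([n, g]: [number, string]) => `${n}:${g}`;
  const a = new Set(legacy.map(key));
  const b = new Set(pipeline.map(key));
  return {
    missing: [...a].filter((k) => !b.has(k)), // sent by legacy only
    extra: [...b].filter((k) => !a.has(k)),   // sent by pipeline only
  };
}
```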

File Structure After Refactor

backend/
  pipeline/
    types.ts
    runner.ts
    context.ts
    sources/
      allSinceLastRun.ts
      stalestSnapshot.ts
    filters/
      byLabel.ts
      byStaleness.ts
      byCreatedAge.ts
    sinks/
      singleGoal.ts
      labelRouted.ts
  database/
    dailyTracking.ts   (unchanged)
    lastRunState.ts    (unchanged)
    stalestSnapshot.ts (unchanged)