Reddit AITA Database Extraction Script

This script extracts the top 100 most popular posts (by score) and all their comments from the AmItheAsshole subreddit database to a new, smaller SQLite database file.

Prerequisites

  • Deno installed on your system
  • Original database file at db/AmItheAsshole.sqlite

Usage

Run the script with the following command:

deno run --allow-read --allow-write extract_top_posts.ts

Alternatively, you can make the script executable and run it directly:

chmod +x extract_top_posts.ts ./extract_top_posts.ts

What the Script Does

  1. Opens the source database (db/AmItheAsshole.sqlite)
  2. Creates a new target database (db/Top100AmItheAsshole.sqlite)
  3. Extracts the top 100 posts from the source database based on score
  4. Extracts all comments associated with those posts
  5. Preserves all original data including IDs, timestamps, and relationships

Database Schema

The script maintains the original schema structure:

Submission Table

CREATE TABLE IF NOT EXISTS "submission" ( "id" INTEGER NOT NULL PRIMARY KEY, "submission_id" VARCHAR(255) NOT NULL, "title" TEXT NOT NULL, "selftext" TEXT NOT NULL, "created_utc" DATE NOT NULL, "permalink" TEXT NOT NULL, "score" INTEGER NOT NULL ); CREATE UNIQUE INDEX "submission_submission_id" ON "submission" ("submission_id");

Comment Table

CREATE TABLE IF NOT EXISTS "comment" ( "id" INTEGER NOT NULL PRIMARY KEY, "submission_id" INTEGER NOT NULL, "message" TEXT NOT NULL, "comment_id" VARCHAR(255) NOT NULL, "parent_id" VARCHAR(255) NOT NULL, "created_utc" DATE NOT NULL, "score" INTEGER NOT NULL, FOREIGN KEY ("submission_id") REFERENCES "submission" ("id") ); CREATE INDEX "comment_submission_id" ON "comment" ("submission_id"); CREATE UNIQUE INDEX "comment_comment_id" ON "comment" ("comment_id");