Copy and paste the contents of database-schema.sql and run it
Go to Settings > API to get your project URL and anon key
3. Environment Variables
Set these environment variables in your Val Town settings:
Supabase:
SUPABASE_URL: Your Supabase project URL (e.g., https://your-project.supabase.co)
SUPABASE_ANON_KEY: Your Supabase anon/public key
Reddit API:
REDDIT_CLIENT_ID: Your Reddit app's client ID
REDDIT_CLIENT_SECRET: Your Reddit app's client secret
REDDIT_USER_AGENT: Optional custom user agent (defaults to "Val Town Reddit Scraper 1.0")
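As a rough illustration of how these variables might be read and validated at startup, here is a minimal sketch. The names `ScraperConfig` and `loadConfig` are hypothetical and are not taken from reddit-scraper.ts; only the variable names and the default user agent come from the list above.

```typescript
// Hypothetical helper, not the scraper's actual code.
interface ScraperConfig {
  supabaseUrl: string;
  supabaseAnonKey: string;
  redditClientId: string;
  redditClientSecret: string;
  redditUserAgent: string;
}

const DEFAULT_USER_AGENT = "Val Town Reddit Scraper 1.0";

const REQUIRED_VARS = [
  "SUPABASE_URL",
  "SUPABASE_ANON_KEY",
  "REDDIT_CLIENT_ID",
  "REDDIT_CLIENT_SECRET",
];

// Takes a plain key/value map so it can be called with any environment source.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    supabaseUrl: env.SUPABASE_URL as string,
    supabaseAnonKey: env.SUPABASE_ANON_KEY as string,
    redditClientId: env.REDDIT_CLIENT_ID as string,
    redditClientSecret: env.REDDIT_CLIENT_SECRET as string,
    // REDDIT_USER_AGENT is optional, so fall back to the documented default.
    redditUserAgent: env.REDDIT_USER_AGENT || DEFAULT_USER_AGENT,
  };
}
```

In a Deno-based runtime such as Val Town, this could presumably be called as `loadConfig(Deno.env.toObject())`.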
4. Configure Cron Schedule
Set reddit-scraper.ts as a cron trigger in Val Town
Configure the schedule in the Val Town web UI (recommended: every 30 minutes)
Example cron expressions:
Every 30 minutes: */30 * * * *
Every hour: 0 * * * *
Every 15 minutes: */15 * * * *
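To make the minute field of these expressions concrete, the sketch below expands it into the minutes of each hour at which a job would fire. It is illustrative only: `expandMinuteField` is a hypothetical helper that handles just the forms used above (`*`, `*/N`, and a single number), not a full cron parser.

```typescript
// Illustrative only: expands the first (minute) field of a cron expression.
function expandMinuteField(field: string): number[] {
  const allMinutes = Array.from({ length: 60 }, (_, i) => i);
  if (field === "*") return allMinutes;

  // "*/N" means every N-th minute, starting at minute 0.
  const step = field.match(/^\*\/(\d+)$/);
  if (step) {
    const n = Number(step[1]);
    if (n < 1) throw new Error(`Invalid step: ${field}`);
    return allMinutes.filter((m) => m % n === 0);
  }

  // A bare number means exactly that minute of the hour.
  if (!/^\d+$/.test(field) || Number(field) > 59) {
    throw new Error(`Unsupported minute field: ${field}`);
  }
  return [Number(field)];
}

console.log(expandMinuteField("*/30")); // [0, 30]
console.log(expandMinuteField("0"));    // [0]
console.log(expandMinuteField("*/15")); // [0, 15, 30, 45]
```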
Database Schema
The posts table contains:
id: Primary key (auto-increment)
reddit_id: Unique Reddit post ID
reddit_original: Full Reddit post data as JSON
title: Post title
created_at: When the post was created on Reddit
updated_at: When the record was last updated in our database
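The mapping from a Reddit post to a row in this table might look roughly like the sketch below. The names `RedditPost`, `PostRow`, and `toPostRow` are hypothetical. It assumes that `reddit_id` holds the post's `id` field, that `created_at` is derived from Reddit's `created_utc` (seconds since the Unix epoch), and that `id` and `updated_at` are filled in by the database.

```typescript
// Subset of the fields Reddit returns for a post; the full object has many more.
interface RedditPost {
  id: string;
  title: string;
  created_utc: number; // seconds since the Unix epoch
  [key: string]: unknown;
}

// Columns supplied on insert (assumption: id and updated_at come from the database).
interface PostRow {
  reddit_id: string;
  reddit_original: RedditPost;
  title: string;
  created_at: string; // ISO 8601 timestamp
}

function toPostRow(post: RedditPost): PostRow {
  return {
    reddit_id: post.id,
    reddit_original: post, // keep the full payload for later JSON queries
    title: post.title,
    created_at: new Date(post.created_utc * 1000).toISOString(),
  };
}
```

Storing the full payload in `reddit_original` is what makes fields such as `score` and `author` queryable later without a schema change.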
Usage
Manual Run
You can manually trigger the scraper by running the reddit-scraper.ts val.
Automated Run
Once configured as a cron job, it will automatically:
Authenticate with Reddit using OAuth client credentials
Fetch the latest 25 posts from /r/hardwareswap
Check for duplicates in the database
Save new posts to Supabase
Log statistics about the scraping session
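The duplicate check in the steps above can be sketched as a pure function. This is an assumption about the general approach, not the scraper's actual code: `splitNewAndDuplicate` is a hypothetical name, and it expects the caller to have already looked up which Reddit IDs exist in the database.

```typescript
// Splits a fetched batch into posts that still need saving and a duplicate count.
function splitNewAndDuplicate<T extends { id: string }>(
  fetched: T[],
  knownIds: Iterable<string>,
): { newPosts: T[]; duplicates: number } {
  const seen = new Set(knownIds);
  const newPosts: T[] = [];
  let duplicates = 0;

  for (const post of fetched) {
    if (seen.has(post.id)) {
      duplicates += 1;
      continue;
    }
    // Track the ID so a post repeated within the same batch is only saved once.
    seen.add(post.id);
    newPosts.push(post);
  }
  return { newPosts, duplicates };
}
```

For example, `splitNewAndDuplicate([{ id: "a" }, { id: "b" }], ["b"])` keeps only the post with ID `a` and reports one duplicate.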
Technical Details
OAuth Flow
The scraper uses Reddit's Client Credentials OAuth flow:
Authenticates using your app's client ID and secret
Receives an access token from Reddit
Uses the token to make authenticated API requests
Automatically requests a new token when the current one expires
This approach is more reliable than using Reddit's public JSON endpoints and respects Reddit's rate limits.
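A minimal sketch of this flow is shown below, under stated assumptions. The token endpoint and the `grant_type=client_credentials` body follow Reddit's documented app-only OAuth; the helper names (`buildTokenRequest`, `toCachedToken`, `needsNewToken`) and the 60-second safety margin are hypothetical and not taken from reddit-scraper.ts.

```typescript
const TOKEN_URL = "https://www.reddit.com/api/v1/access_token";

// Builds the token request; the client ID and secret go in an HTTP Basic auth header.
function buildTokenRequest(clientId: string, clientSecret: string, userAgent: string) {
  return {
    url: TOKEN_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Basic ${btoa(`${clientId}:${clientSecret}`)}`,
        "Content-Type": "application/x-www-form-urlencoded",
        "User-Agent": userAgent,
      },
      body: "grant_type=client_credentials",
    },
  };
}

interface CachedToken {
  accessToken: string;
  expiresAt: number; // epoch milliseconds
}

// Converts the token response into a cache entry with an absolute expiry time.
function toCachedToken(
  response: { access_token: string; expires_in: number },
  nowMs: number,
): CachedToken {
  return {
    accessToken: response.access_token,
    expiresAt: nowMs + response.expires_in * 1000,
  };
}

// True when there is no token yet or it expires within the safety margin.
function needsNewToken(token: CachedToken | null, nowMs: number, marginMs = 60_000): boolean {
  return token === null || nowMs >= token.expiresAt - marginMs;
}
```

The request would then be sent with `fetch(req.url, req.init)`, and the resulting token passed as `Authorization: Bearer <token>` on calls to `https://oauth.reddit.com`.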
Rate Limiting
Reddit allows 60 requests per minute for OAuth applications
The scraper fetches 25 posts per run, well within limits
Recommended cron schedule: every 30 minutes or longer
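The arithmetic behind "well within limits" can be checked with a short sketch. It assumes each run makes two requests (one for the token, one for the listing), which is a guess rather than something measured from the scraper, and uses the 60 requests per minute figure quoted above. `rateBudget` is a hypothetical helper.

```typescript
const REDDIT_LIMIT_PER_MINUTE = 60; // figure quoted in this section

// Estimates daily request volume and the share of the per-minute limit a single run uses.
// Assumes the interval is at least one minute, so runs never overlap within a minute.
function rateBudget(intervalMinutes: number, requestsPerRun: number) {
  if (intervalMinutes < 1) throw new Error("intervalMinutes must be at least 1");
  const runsPerDay = (24 * 60) / intervalMinutes;
  return {
    requestsPerDay: runsPerDay * requestsPerRun,
    peakShareOfLimit: requestsPerRun / REDDIT_LIMIT_PER_MINUTE,
  };
}

// Every 30 minutes with 2 requests per run: 48 runs and 96 requests per day.
console.log(rateBudget(30, 2).requestsPerDay); // 96
```

Under these assumptions a single run uses about 3% of the per-minute allowance, so even the 15-minute schedule leaves ample headroom.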
Monitoring
Check the Val Town logs to monitor:
Number of new posts scraped
Number of duplicates skipped
Any errors during scraping
Performance metrics
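A per-run summary line covering these items could be produced along the following lines. The `RunSummary` shape and the log format are hypothetical; the scraper's actual log output may look different.

```typescript
// Hypothetical summary of one scraping run.
interface RunSummary {
  newPosts: number;
  duplicates: number;
  errors: number;
  durationMs: number;
}

// Formats the summary as a single line so it is easy to scan or search in the logs.
function formatRunSummary(s: RunSummary): string {
  return `Scrape finished: ${s.newPosts} new, ${s.duplicates} duplicates, ` +
    `${s.errors} errors in ${s.durationMs} ms`;
}

console.log(formatRunSummary({ newPosts: 3, duplicates: 22, errors: 0, durationMs: 850 }));
// Scrape finished: 3 new, 22 duplicates, 0 errors in 850 ms
```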
Troubleshooting
Common Issues
Missing environment variables: Ensure all required Reddit and Supabase credentials are set
Database connection errors: Verify your Supabase credentials and that the table exists
Reddit OAuth errors: Check your Reddit app credentials and ensure the app type is "script"
Rate limiting: Reddit may temporarily block requests if rate limits are exceeded
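Duplicate key errors: The scraper checks for duplicates before inserting, but if two runs overlap, both can try to save the same post and the unique reddit_id will reject the second insert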
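One way to make duplicate key errors harmless is to recognize them and treat them as skipped posts. The sketch below assumes the error object returned by the Supabase client carries the PostgreSQL error code in a `code` field; `23505` is PostgreSQL's code for a unique constraint violation. `isDuplicateKeyError` is a hypothetical helper.

```typescript
const UNIQUE_VIOLATION = "23505"; // PostgreSQL SQLSTATE for unique_violation

// Returns true when the error looks like a unique-constraint violation.
function isDuplicateKeyError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === UNIQUE_VIOLATION
  );
}
```

A save routine could then count such errors as duplicates instead of failures and report everything else as a genuine error.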
Error Messages
Missing Supabase credentials: Set SUPABASE_URL and SUPABASE_ANON_KEY
Missing Reddit credentials: Set REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET
OAuth error: Check your Reddit app credentials and app type
Reddit API error: May indicate rate limiting or API issues
Error saving post: Check Supabase connection and table schema
Data Access
You can query your scraped data directly in Supabase:
-- Get recent posts
SELECT title, created_at, reddit_original->>'score' AS score
FROM posts
ORDER BY created_at DESC
LIMIT 10;

-- Search posts by title
SELECT title, created_at
FROM posts
WHERE title ILIKE '%gpu%'
ORDER BY created_at DESC;

-- Get posts by author
SELECT title, created_at
FROM posts
WHERE reddit_original->>'author' = 'username'
ORDER BY created_at DESC;
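The title search can also be issued over HTTP through Supabase's auto-generated REST API, which uses PostgREST query syntax. The sketch below only builds the request; `buildTitleSearchRequest` is a hypothetical helper, and whether the anon key may read the posts table depends on your row-level security settings.

```typescript
// Builds a REST request equivalent to the title search, newest first.
function buildTitleSearchRequest(
  supabaseUrl: string,
  anonKey: string,
  titleContains: string,
  limit = 10,
) {
  const params = new URLSearchParams({
    select: "title,created_at",
    title: `ilike.*${titleContains}*`, // PostgREST uses * as the wildcard in URLs
    order: "created_at.desc",
    limit: String(limit),
  });
  const base = supabaseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/rest/v1/posts?${params.toString()}`,
    headers: {
      apikey: anonKey,
      Authorization: `Bearer ${anonKey}`,
    },
  };
}
```

The result can be passed to `fetch(req.url, { headers: req.headers })`, which should return the matching rows as a JSON array.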