Reddit /r/hardwareswap Scraper

A Val Town application that scrapes posts from Reddit's /r/hardwareswap subreddit and stores them in a Supabase database.

Features

  • 🔄 Automated scraping of /r/hardwareswap posts using Reddit's official OAuth API
  • 🗄️ Stores posts in a Supabase PostgreSQL database
  • 🚫 Duplicate detection to avoid storing the same post twice
  • 📊 Detailed logging and statistics
  • ⏰ Runs on a cron schedule (configurable in the Val Town UI)
  • 🔐 Secure OAuth authentication with automatic token refresh

Setup Instructions

1. Reddit API Setup

  1. Go to your Reddit App Preferences (https://www.reddit.com/prefs/apps)
  2. Click "Create App" or "Create Another App"
  3. Fill out the form:
    • Name: Your app name (e.g., "Val Town Scraper")
    • App type: Select "script"
    • Description: Optional description
    • About URL: Leave blank or add your website
    • Redirect URI: Use http://localhost:8080 (required but not used)
  4. Click "Create app"
  5. Note down your Client ID (under the app name) and Client Secret

2. Supabase Setup

  1. Create a new project in Supabase
  2. Go to the SQL Editor in your Supabase dashboard
  3. Copy and paste the contents of database-schema.sql and run it
  4. Go to Settings > API to get your project URL and anon key

3. Environment Variables

Set these environment variables in your Val Town settings (a sketch of how the vals read them follows the lists):

Supabase:

  • SUPABASE_URL: Your Supabase project URL (e.g., https://your-project.supabase.co)
  • SUPABASE_ANON_KEY: Your Supabase anon/public key

Reddit API:

  • REDDIT_CLIENT_ID: Your Reddit app's client ID
  • REDDIT_CLIENT_SECRET: Your Reddit app's client secret
  • REDDIT_USER_AGENT: Optional custom user agent (defaults to "Val Town Reddit Scraper 1.0")
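
For reference, this is roughly how a val reads these settings at runtime. A minimal sketch assuming Val Town's Deno runtime; the real handling lives in reddit-scraper.ts, and the error messages mirror the ones listed under Troubleshooting below.

```ts
// Sketch: reading the configuration described above inside a val.
// Val Town exposes environment variables through Deno's standard API.
const SUPABASE_URL = Deno.env.get("SUPABASE_URL");
const SUPABASE_ANON_KEY = Deno.env.get("SUPABASE_ANON_KEY");
const REDDIT_CLIENT_ID = Deno.env.get("REDDIT_CLIENT_ID");
const REDDIT_CLIENT_SECRET = Deno.env.get("REDDIT_CLIENT_SECRET");
const REDDIT_USER_AGENT = Deno.env.get("REDDIT_USER_AGENT") ?? "Val Town Reddit Scraper 1.0";

// Fail fast if anything required is missing.
if (!SUPABASE_URL || !SUPABASE_ANON_KEY) {
  throw new Error("Missing Supabase credentials: Set SUPABASE_URL and SUPABASE_ANON_KEY");
}
if (!REDDIT_CLIENT_ID || !REDDIT_CLIENT_SECRET) {
  throw new Error("Missing Reddit credentials: Set REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET");
}
```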

4. Configure Cron Schedule

  1. Set reddit-scraper.ts as a cron trigger in Val Town (the minimal shape of a cron val is sketched after this list)
  2. Configure the schedule in the Val Town web UI (recommended: every 30 minutes)
  3. Example cron expressions:
    • Every 30 minutes: */30 * * * *
    • Every hour: 0 * * * *
    • Every 15 minutes: */15 * * * *
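
A cron val is just a module whose default export Val Town calls on each tick of the configured schedule. A minimal sketch of that shape (the exact handler signature may differ; the real entry point is reddit-scraper.ts):

```ts
// Minimal shape of a cron-triggered val: Val Town invokes the default export
// every time the schedule configured in the UI fires.
export default async function () {
  console.log("Scrape run started:", new Date().toISOString());
  // ...authenticate with Reddit, fetch /r/hardwareswap, save new posts (see sketches below)
}
```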

Database Schema

The posts table contains the following columns (an illustrative row type follows the list):

  • id: Primary key (auto-increment)
  • reddit_id: Unique Reddit post ID
  • reddit_original: Full Reddit post data as JSON
  • title: Post title
  • created_at: When the post was created on Reddit
  • updated_at: When the record was last updated in our database
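
As a rough TypeScript view of one row; the types are assumed from the descriptions above, and the authoritative definition is database-schema.sql.

```ts
// Illustrative shape of one row in the posts table (not copied from database-schema.sql).
interface PostRow {
  id: number;                                // primary key (auto-increment)
  reddit_id: string;                         // unique Reddit post ID
  reddit_original: Record<string, unknown>;  // full Reddit post data as JSON
  title: string;                             // post title
  created_at: string;                        // when the post was created on Reddit
  updated_at: string;                        // when the record was last updated in our database
}
```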

Usage

Manual Run

You can manually trigger the scraper by running the reddit-scraper.ts val.

Automated Run

Once configured as a cron job, each run will automatically (a condensed sketch follows the list):

  1. Authenticate with Reddit using OAuth client credentials
  2. Fetch the latest 25 posts from /r/hardwareswap
  3. Check for duplicates in the database
  4. Save new posts to Supabase
  5. Log statistics about the scraping session
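
A condensed sketch of that flow. It assumes the supabase-js client loaded from esm.sh, the environment variables described above, and an access token obtained as shown under Technical Details; the function and variable names are illustrative, and the real implementation is reddit-scraper.ts.

```ts
// Sketch of one scrape run (the five steps above). `token` comes from the
// OAuth request shown under Technical Details; all names here are illustrative.
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_ANON_KEY")!,
);

async function scrapeOnce(token: string) {
  // 2. Fetch the newest 25 posts with an authenticated request.
  const res = await fetch("https://oauth.reddit.com/r/hardwareswap/new?limit=25", {
    headers: {
      Authorization: `Bearer ${token}`,
      "User-Agent": Deno.env.get("REDDIT_USER_AGENT") ?? "Val Town Reddit Scraper 1.0",
    },
  });
  if (!res.ok) throw new Error(`Reddit API error: ${res.status}`);
  const listing = await res.json();

  let saved = 0, skipped = 0;
  for (const child of listing.data.children) {
    const post = child.data;

    // 3. Duplicate check on reddit_id.
    const { data: existing } = await supabase
      .from("posts")
      .select("id")
      .eq("reddit_id", post.id)
      .maybeSingle();
    if (existing) { skipped++; continue; }

    // 4. Save the new post.
    const { error } = await supabase.from("posts").insert({
      reddit_id: post.id,
      reddit_original: post,
      title: post.title,
      created_at: new Date(post.created_utc * 1000).toISOString(),
    });
    if (error) console.error("Error saving post:", error.message);
    else saved++;
  }

  // 5. Log statistics for this session.
  console.log(`Saved ${saved} new posts, skipped ${skipped} duplicates`);
}
```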

Technical Details

OAuth Flow

The scraper uses Reddit's Client Credentials OAuth flow:

  1. Authenticates using your app's client ID and secret
  2. Receives an access token from Reddit
  3. Uses the token to make authenticated API requests
  4. Automatically refreshes the token if it expires

This approach is more reliable than using Reddit's public JSON endpoints and respects Reddit's rate limits.
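
A sketch of steps 1 and 2, using Reddit's standard token endpoint; the helper name is illustrative, and the token caching and refresh handling in reddit-scraper.ts is omitted here.

```ts
// Client-credentials token request: HTTP Basic auth with the app's ID and secret,
// grant_type=client_credentials in the body. Returns a short-lived bearer token.
async function getRedditToken(): Promise<string> {
  const id = Deno.env.get("REDDIT_CLIENT_ID")!;
  const secret = Deno.env.get("REDDIT_CLIENT_SECRET")!;
  const res = await fetch("https://www.reddit.com/api/v1/access_token", {
    method: "POST",
    headers: {
      Authorization: "Basic " + btoa(`${id}:${secret}`),
      "Content-Type": "application/x-www-form-urlencoded",
      "User-Agent": Deno.env.get("REDDIT_USER_AGENT") ?? "Val Town Reddit Scraper 1.0",
    },
    body: "grant_type=client_credentials",
  });
  if (!res.ok) throw new Error(`OAuth error: ${res.status}`);
  const { access_token } = await res.json();
  return access_token; // expires after a while; request a fresh one when it does
}
```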

Rate Limiting

  • Reddit allows 60 requests per minute for OAuth applications
  • The scraper fetches 25 posts per run, well within limits
  • Recommended cron schedule: every 30 minutes or longer
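
To confirm this at runtime, Reddit's OAuth responses generally include rate-limit headers you can log; a small sketch (header names assumed from Reddit's usual behavior, and they may be absent on some responses):

```ts
// Log Reddit's rate-limit headers so the Val Town logs show how much quota
// each run actually consumes.
function logRateLimit(res: Response) {
  const remaining = res.headers.get("x-ratelimit-remaining");
  const reset = res.headers.get("x-ratelimit-reset");
  if (remaining !== null) {
    console.log(`Reddit rate limit: ${remaining} requests left, window resets in ${reset}s`);
  }
}
```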

Monitoring

Check the Val Town logs to monitor:

  • Number of new posts scraped
  • Number of duplicates skipped
  • Any errors during scraping
  • Performance metrics

Troubleshooting

Common Issues

  1. Missing environment variables: Ensure all required Reddit and Supabase credentials are set
  2. Database connection errors: Verify your Supabase credentials and that the table exists
  3. Reddit OAuth errors: Check your Reddit app credentials and ensure the app type is "script"
  4. Rate limiting: Reddit may temporarily block requests if rate limits are exceeded
  5. Duplicate key errors: The scraper checks for duplicates before inserting, but race conditions between overlapping runs can still occur (one mitigation is sketched below)
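
For the last item, one mitigation is to make the write itself idempotent and let Postgres resolve conflicts on reddit_id. A sketch assuming the supabase-js client and a unique constraint on reddit_id from database-schema.sql; the helper name is illustrative.

```ts
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_ANON_KEY")!,
);

// Idempotent save: duplicate rows are ignored at the database level, so two
// overlapping runs cannot both insert the same reddit_id.
async function savePost(post: { id: string; title: string; created_utc: number }) {
  const { error } = await supabase.from("posts").upsert(
    {
      reddit_id: post.id,
      reddit_original: post,
      title: post.title,
      created_at: new Date(post.created_utc * 1000).toISOString(),
    },
    { onConflict: "reddit_id", ignoreDuplicates: true },
  );
  if (error) console.error("Error saving post:", error.message);
}
```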

Error Messages

  • Missing Supabase credentials: Set SUPABASE_URL and SUPABASE_ANON_KEY
  • Missing Reddit credentials: Set REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET
  • OAuth error: Check your Reddit app credentials and app type
  • Reddit API error: May indicate rate limiting or API issues
  • Error saving post: Check Supabase connection and table schema

Data Access

You can query your scraped data directly in Supabase:

```sql
-- Get recent posts
SELECT title, created_at, reddit_original->>'score' as score
FROM posts
ORDER BY created_at DESC
LIMIT 10;

-- Search posts by title
SELECT title, created_at
FROM posts
WHERE title ILIKE '%gpu%'
ORDER BY created_at DESC;

-- Get posts by author
SELECT title, created_at
FROM posts
WHERE reddit_original->>'author' = 'username'
ORDER BY created_at DESC;
```
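
The same data is also reachable from a val. A sketch of the first query using the Supabase JS client; the esm.sh module URL and the aliased JSON select syntax are assumptions.

```ts
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_ANON_KEY")!,
);

// Equivalent of the first SQL query above: ten most recent posts with their score.
const { data, error } = await supabase
  .from("posts")
  .select("title, created_at, score:reddit_original->>score")
  .order("created_at", { ascending: false })
  .limit(10);

if (error) console.error(error.message);
else console.table(data);
```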

Contributing

Feel free to modify the scraper to:

  • Add more subreddits
  • Include additional post metadata
  • Add data processing or analysis features
  • Integrate with other services