Reddit /r/hardwareswap Scraper

A Val Town application that scrapes posts from Reddit's /r/hardwareswap subreddit and stores them in a Supabase database.

Features

  • 🔄 Automated scraping of /r/hardwareswap posts
  • 🗄️ Stores posts in Supabase PostgreSQL database
  • 🚫 Duplicate detection to avoid storing the same post twice
  • 📊 Detailed logging and statistics
  • ⏰ Runs on a cron schedule (configurable in Val Town UI)

Setup Instructions

1. Supabase Setup

  1. Create a new project in Supabase
  2. Go to the SQL Editor in your Supabase dashboard
  3. Copy and paste the contents of database-schema.sql and run it
  4. Go to Settings > API to get your project URL and anon key

2. Environment Variables

Set these environment variables in your Val Town settings:

  • SUPABASE_URL: Your Supabase project URL (e.g., https://your-project.supabase.co)
  • SUPABASE_ANON_KEY: Your Supabase anon/public key
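
As a rough illustration of how these two variables might be read and validated, here is a minimal sketch. The helper name `readSupabaseConfig` and the `SupabaseConfig` shape are hypothetical and are not taken from reddit-scraper.ts; only the variable names and the "Missing Supabase credentials" message come from this README.

```typescript
// Shape of the configuration this README asks for.
interface SupabaseConfig {
  url: string;
  anonKey: string;
}

// Hypothetical helper: reads both variables from an env-like record and
// fails fast with a message that names whichever variable is missing.
function readSupabaseConfig(
  env: Record<string, string | undefined>,
): SupabaseConfig {
  const url = env["SUPABASE_URL"]?.trim();
  const anonKey = env["SUPABASE_ANON_KEY"]?.trim();

  const missing: string[] = [];
  if (!url) missing.push("SUPABASE_URL");
  if (!anonKey) missing.push("SUPABASE_ANON_KEY");
  if (!url || !anonKey) {
    throw new Error(`Missing Supabase credentials: ${missing.join(", ")}`);
  }

  // Drop trailing slashes so later path concatenation stays predictable.
  return { url: url.replace(/\/+$/, ""), anonKey };
}
```

Since Val Town runs on Deno, a val could plausibly call this as `readSupabaseConfig(Deno.env.toObject())`; taking the environment as a parameter simply keeps the helper easy to test.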

3. Configure Cron Schedule

  1. Set reddit-scraper.ts as a cron trigger in Val Town
  2. Configure the schedule in the Val Town web UI (recommended: every 30 minutes)
  3. Example cron expressions:
    • Every 30 minutes: */30 * * * *
    • Every hour: 0 * * * *
    • Every 15 minutes: */15 * * * *
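
To make the expressions above concrete, the following sketch expands the minute field (the first of the five cron fields) into the minutes of each hour at which a job would fire. It is illustrative only: `firingMinutes` is a hypothetical helper, it handles just the `*`, `*/N`, and `N` forms used in the examples, and it ignores the other four fields.

```typescript
// Expands a cron minute field into the minutes (0-59) at which it fires.
// Supports only "*", "*/N", and a single literal minute "N".
function expandMinuteField(field: string): number[] {
  const allMinutes = Array.from({ length: 60 }, (_, i) => i);
  if (field === "*") return allMinutes;

  const step = /^\*\/(\d+)$/.exec(field);
  if (step) {
    const n = Number(step[1]);
    if (n < 1 || n > 59) throw new Error(`Invalid step in minute field: ${field}`);
    return allMinutes.filter((minute) => minute % n === 0);
  }

  if (/^\d+$/.test(field)) {
    const minute = Number(field);
    if (minute > 59) throw new Error(`Minute out of range: ${field}`);
    return [minute];
  }

  throw new Error(`Unsupported minute field: ${field}`);
}

// Takes a full five-field expression and looks only at its minute field.
function firingMinutes(expression: string): number[] {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("Expected five cron fields");
  return expandMinuteField(fields[0] ?? "");
}
```

With the expressions listed above, `*/30 * * * *` expands to minutes 0 and 30, `0 * * * *` to minute 0, and `*/15 * * * *` to minutes 0, 15, 30, and 45 of every hour.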

Database Schema

The posts table contains:

  • id: Primary key (auto-increment)
  • reddit_id: Unique Reddit post ID
  • reddit_original: Full Reddit post data as JSON
  • title: Post title
  • created_at: When the post was created on Reddit
  • updated_at: When the record was last updated in our database
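
The column list above could be mirrored in TypeScript roughly as follows. This is a sketch under stated assumptions: the `RedditPostData` and `PostRow` types and the `toPostRow` helper are hypothetical, `id` and `updated_at` are assumed to be filled in by the database, and `reddit_id` is assumed to hold the bare post ID from Reddit's listing JSON. The one detail worth noting is that Reddit reports `created_utc` in seconds, so it is multiplied by 1000 before conversion.

```typescript
// Subset of the fields Reddit returns for each post in its listing JSON.
interface RedditPostData {
  id: string;
  title: string;
  created_utc: number; // seconds since the Unix epoch
  [key: string]: unknown; // everything else is kept as-is
}

// Columns supplied on insert. id and updated_at are assumed to be
// populated by the database (auto-increment and a default/trigger).
interface PostRow {
  reddit_id: string;
  reddit_original: RedditPostData;
  title: string;
  created_at: string; // ISO 8601 timestamp
}

// Hypothetical mapping from one Reddit post to one row of the posts table.
function toPostRow(post: RedditPostData): PostRow {
  return {
    reddit_id: post.id,
    reddit_original: post,
    title: post.title,
    // Date expects milliseconds, Reddit provides seconds.
    created_at: new Date(post.created_utc * 1000).toISOString(),
  };
}
```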

Usage

Manual Run

You can manually trigger the scraper by running the reddit-scraper.ts val.

Automated Run

Once configured as a cron job, it will automatically:

  1. Fetch the latest 25 posts from /r/hardwareswap
  2. Check for duplicates in the database
  3. Save new posts to Supabase
  4. Log statistics about the scraping session
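
The first three steps can be sketched as two small helpers. Both names are hypothetical, and the `/new.json` listing URL is an assumption, since this README only refers to "the JSON endpoint" without naming it. The database lookup and insert are left out; `existingIds` stands in for whatever the duplicate check returns.

```typescript
interface ScrapedPost {
  id: string;
  title: string;
}

// Step 1: build the listing URL. The /new.json path is an assumption.
function listingUrl(subreddit: string, limit = 25): string {
  const url = new URL(
    `https://www.reddit.com/r/${encodeURIComponent(subreddit)}/new.json`,
  );
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Steps 2-3: keep only posts whose IDs are not already stored.
// existingIds would come from a query against the posts table.
function selectNewPosts(
  fetched: ScrapedPost[],
  existingIds: string[],
): { fresh: ScrapedPost[]; duplicates: number } {
  const seen = new Set<string>(existingIds);
  const fresh: ScrapedPost[] = [];
  for (const post of fetched) {
    if (seen.has(post.id)) continue;
    seen.add(post.id); // also guards against repeats within one batch
    fresh.push(post);
  }
  return { fresh, duplicates: fetched.length - fresh.length };
}
```

Filtering in memory before inserting keeps the number of database writes proportional to the number of genuinely new posts, which is usually small when the scraper runs every 30 minutes.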

API Alternative

If you prefer to use Reddit's official API instead of the JSON endpoint:

  1. Create a Reddit app at https://www.reddit.com/prefs/apps
  2. Add these environment variables:
    • REDDIT_CLIENT_ID
    • REDDIT_CLIENT_SECRET
    • REDDIT_USER_AGENT
  3. Modify the scraper to use the official Reddit API
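
For step 3, one possible shape is sketched below, assuming the app is a confidential client that can use Reddit's application-only `client_credentials` grant. The helper names and the `HttpRequest` type are hypothetical; the sketch only builds request descriptions and leaves the actual `fetch` calls, token caching, and error handling to the scraper.

```typescript
interface RedditCredentials {
  clientId: string; // REDDIT_CLIENT_ID
  clientSecret: string; // REDDIT_CLIENT_SECRET
  userAgent: string; // REDDIT_USER_AGENT
}

interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Token request: HTTP Basic auth with the client ID and secret,
// plus a form-encoded grant type.
function buildTokenRequest(creds: RedditCredentials): HttpRequest {
  const basic = btoa(`${creds.clientId}:${creds.clientSecret}`);
  return {
    url: "https://www.reddit.com/api/v1/access_token",
    method: "POST",
    headers: {
      "Authorization": `Basic ${basic}`,
      "Content-Type": "application/x-www-form-urlencoded",
      "User-Agent": creds.userAgent,
    },
    body: new URLSearchParams({ grant_type: "client_credentials" }).toString(),
  };
}

// Authenticated listing request against the OAuth host.
function buildListingRequest(
  accessToken: string,
  userAgent: string,
  subreddit: string,
  limit = 25,
): HttpRequest {
  const url = new URL(
    `https://oauth.reddit.com/r/${encodeURIComponent(subreddit)}/new`,
  );
  url.searchParams.set("limit", String(limit));
  return {
    url: url.toString(),
    method: "GET",
    headers: {
      "Authorization": `Bearer ${accessToken}`,
      "User-Agent": userAgent,
    },
  };
}
```

Each descriptor maps directly onto `fetch(request.url, { method, headers, body })`. The access token would be read from the `access_token` field of the token response.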

Monitoring

Check the Val Town logs to monitor:

  • Number of new posts scraped
  • Number of duplicates skipped
  • Any errors during scraping
  • Performance metrics
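
If you want these figures to be easy to search for in the logs, one option is to emit a single key=value line per run. The format and the `formatRunStats` helper below are hypothetical; the actual log output of reddit-scraper.ts may look different.

```typescript
interface RunStats {
  fetched: number; // posts returned by Reddit
  saved: number; // new posts written to Supabase
  duplicates: number; // posts skipped because they already existed
  errors: number; // posts that failed to save
  durationMs: number; // wall-clock time of the run
}

// Renders one run as a single greppable line.
function formatRunStats(stats: RunStats): string {
  return [
    `fetched=${stats.fetched}`,
    `saved=${stats.saved}`,
    `duplicates=${stats.duplicates}`,
    `errors=${stats.errors}`,
    `duration_ms=${Math.round(stats.durationMs)}`,
  ].join(" ");
}
```

A call such as `console.log(formatRunStats(stats))` at the end of each run would then produce one summary line per cron invocation.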

Troubleshooting

Common Issues

  1. Missing environment variables: Ensure SUPABASE_URL and SUPABASE_ANON_KEY are set
  2. Database connection errors: Verify your Supabase credentials and that the table exists
  3. Reddit rate limiting: The scraper requests only 25 posts per run and sends a respectful user agent
  4. Duplicate key errors: The scraper checks for duplicates before saving, but two overlapping runs can still race and try to insert the same post
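
For the duplicate key case in item 4, a save step can treat a unique-constraint violation as "already stored" rather than as a failure. The sketch below assumes the insert error exposes PostgreSQL's SQLSTATE in a `code` field, as Supabase's client errors generally do; the `DbError` type and both helpers are hypothetical.

```typescript
// Minimal error shape: only the fields this sketch relies on.
interface DbError {
  code?: string;
  message: string;
}

// PostgreSQL SQLSTATE for unique_violation.
const UNIQUE_VIOLATION = "23505";

function isDuplicateKeyError(error: DbError | null): boolean {
  return error !== null && error.code === UNIQUE_VIOLATION;
}

type SaveOutcome = "saved" | "duplicate" | "failed";

// Classifies the result of one insert so a lost race is counted as a
// duplicate instead of an error.
function classifySaveResult(error: DbError | null): SaveOutcome {
  if (error === null) return "saved";
  return isDuplicateKeyError(error) ? "duplicate" : "failed";
}
```

An alternative is to let the database resolve the conflict, for example with an upsert on `reddit_id` that ignores duplicates, which removes the need for a separate existence check.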

Error Messages

  • Missing Supabase credentials: Set the required environment variables
  • Reddit API error: Check that Reddit is reachable and is not rate limiting your requests
  • Error saving post: Check Supabase connection and table schema
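
When a Reddit API error appears, the HTTP status usually indicates which of these causes applies. The mapping below is a hedged sketch with hypothetical names. It relies only on standard HTTP semantics (429 for rate limiting, 5xx for server-side problems) and does not reflect how reddit-scraper.ts reports errors.

```typescript
type RedditFailure = "rate_limited" | "blocked" | "unavailable" | "other";

// Returns null for a successful response, otherwise a coarse failure reason.
function classifyRedditStatus(status: number): RedditFailure | null {
  if (status >= 200 && status < 300) return null;
  if (status === 429) return "rate_limited"; // Too Many Requests
  if (status === 401 || status === 403) return "blocked"; // auth or access refused
  if (status >= 500) return "unavailable"; // Reddit-side outage
  return "other";
}
```

A "rate_limited" result suggests running the cron less often, while "unavailable" is typically transient and resolves on a later run.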

Data Access

You can query your scraped data directly in Supabase:

-- Get recent posts
SELECT title, created_at, reddit_original->>'score' as score
FROM posts
ORDER BY created_at DESC
LIMIT 10;

-- Search posts by title
SELECT title, created_at
FROM posts
WHERE title ILIKE '%gpu%'
ORDER BY created_at DESC;

-- Get posts by author
SELECT title, created_at
FROM posts
WHERE reddit_original->>'author' = 'username'
ORDER BY created_at DESC;
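
The same data can also be read over HTTP through Supabase's auto-generated REST interface. The sketch below builds URLs roughly equivalent to the first two queries above, using PostgREST filter syntax. The helper names are hypothetical, and whether the anon key can read the table depends on your row-level security settings.

```typescript
// Roughly: SELECT title, created_at FROM posts ORDER BY created_at DESC LIMIT n
function recentPostsUrl(supabaseUrl: string, limit = 10): string {
  const url = new URL("/rest/v1/posts", supabaseUrl);
  url.searchParams.set("select", "title,created_at");
  url.searchParams.set("order", "created_at.desc");
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Roughly: ... WHERE title ILIKE '%term%' ORDER BY created_at DESC
// PostgREST uses * as the wildcard in place of SQL's %.
function titleSearchUrl(supabaseUrl: string, term: string): string {
  const url = new URL("/rest/v1/posts", supabaseUrl);
  url.searchParams.set("select", "title,created_at");
  url.searchParams.set("title", `ilike.*${term}*`);
  url.searchParams.set("order", "created_at.desc");
  return url.toString();
}

// Headers Supabase expects on REST requests made with the anon key.
function supabaseHeaders(anonKey: string): Record<string, string> {
  return { apikey: anonKey, Authorization: `Bearer ${anonKey}` };
}
```

For example, `fetch(titleSearchUrl(url, "gpu"), { headers: supabaseHeaders(anonKey) })` would return matching rows as a JSON array, assuming the table is readable with that key.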

Contributing

Feel free to modify the scraper to:

  • Add more subreddits
  • Include additional post metadata
  • Add data processing or analysis features
  • Integrate with other services