todo.md

todos

  • try search strategies
    • add kapa.ai as baseline
    • create a test suite -> doc with all the comparisons
    • make the strategies flexible/selectable again instead of commented out, for easier benchmarking (see the sketch after this list)
  • markdown endpoint
  • pre-calculate metadata around each markdown page
  • pre-calculate Q&A pairs for complex question queries that map to various pages and categories
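
One way the "flexible/selectable again" todo could look, as a rough sketch: a registry of strategies keyed by name so the benchmark can loop over them instead of toggling commented-out blocks. The strategy names and the `SearchResult` shape below are made up for illustration, not the actual ones in `search/`.

```ts
// Hypothetical sketch: search strategies selectable by name, so benchmarks
// can iterate over all of them instead of commenting code in and out.
export interface SearchResult {
  url: string;
  title: string;
  score: number;
}

export type SearchStrategy = (query: string) => Promise<SearchResult[]>;

// Placeholder implementations; real ones would live in search/.
export const strategies: Record<string, SearchStrategy> = {
  keyword: async (_query) => [], // plain keyword / BM25-style search
  embeddings: async (_query) => [], // cosine similarity over precomputed embeddings
  hybrid: async (_query) => [], // blend keyword + embedding scores
};

export async function runStrategy(name: string, query: string) {
  const strategy = strategies[name];
  if (!strategy) throw new Error(`unknown strategy: ${name}`);
  const start = performance.now();
  const results = await strategy(query);
  return { name, query, ms: performance.now() - start, results };
}
```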

future

  • (Ben) can we have a blind eval ranker, i.e. show the two results side by side, you pick A or B, repeat ~20 times, then it shows you which strategy ranked better

thoughts

  • for TINY data sets you do very different things than for MASSIVE data sets; Vectorize, Mixedbread, and Turbopuffer are all built for massive data sets
  • getting 2-second responses is EASY; sub-1-second probably requires a VPS
  • network hops + isolate warm-ups kill latency; the cosine-distance and embedding calculations matter much less (see the cosine sketch below)
  • small/cheap embeddings seem not to be a problem (quality??)
  • generating embeddings locally w/ a small model requires downloading the model (large + computationally expensive), while generating embeddings via an API takes around ~600ms
  • loading an 80MB Xenova all-MiniLM model is really good/fast, but it sucks for serverless and for mobile users to have to download that
    • possibly the best way is to host it somewhere with the ONNX weights and everything saved locally, so you can just run it (see the transformers.js sketch below)
  • Cloudflare AI embeddings are very fast, but the worker needs to be warmed up; if it's not warm, expect ~800ms-2000ms
    • if you ping the CF AI embeddings with a fake request for warmup (e.g. while a user is typing), then it's fast (see the warm-up sketch below)
  • for testing at least, loading a massive JSON file into memory takes a very long time (10+ seconds)
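
To the point above about the cosine distance calc not mattering: brute-force cosine similarity over a few hundred in-memory doc embeddings is sub-millisecond work, so the network hop and isolate warm-up dominate. A minimal sketch (the embedding shapes here are assumptions):

```ts
// Brute-force cosine similarity over precomputed, in-memory embeddings.
// For a tiny doc set (hundreds of pages) this is negligible compared to
// the network hop needed to fetch the query embedding.
type EmbeddedDoc = { url: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], docs: EmbeddedDoc[], k = 5) {
  return docs
    .map((d) => ({ url: d.url, score: cosineSimilarity(query, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```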
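
For the "host it somewhere with the ONNX saved locally" idea: a sketch using transformers.js, where the first `pipeline()` call downloads and caches the ~80MB Xenova/all-MiniLM-L6-v2 model and later calls run locally with no download. This is a generic transformers.js usage sketch, not necessarily how this val is wired up.

```ts
// Sketch: local embeddings with transformers.js (ONNX under the hood).
// The first call downloads + caches the ~80MB model; afterwards it runs
// locally, which is why keeping it resident on a long-lived server beats
// re-downloading it in a serverless isolate or on a mobile client.
import { pipeline } from "npm:@xenova/transformers";

const extractor = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2",
);

export async function embedLocally(text: string): Promise<number[]> {
  const output = await extractor(text, { pooling: "mean", normalize: true });
  return Array.from(output.data as Float32Array);
}
```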
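
And a sketch of the warm-up trick for Cloudflare AI embeddings: fire a throwaway request as soon as the user starts typing so the real query hits a warm path. This assumes the Workers AI REST "run" endpoint and the bge-small model; the env var names are placeholders.

```ts
// Sketch: Cloudflare Workers AI embeddings over the REST run endpoint,
// plus a fire-and-forget warm-up ping to dodge the ~800-2000ms cold path.
// Account/token env var names and the model choice are assumptions.
const CF_ACCOUNT_ID = Deno.env.get("CF_ACCOUNT_ID");
const CF_API_TOKEN = Deno.env.get("CF_API_TOKEN");
const MODEL = "@cf/baai/bge-small-en-v1.5";

async function cfEmbed(texts: string[]): Promise<number[][]> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/ai/run/${MODEL}`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${CF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text: texts }),
    },
  );
  const json = await res.json();
  return json.result.data; // one embedding vector per input text
}

// Call this as soon as the user starts typing and ignore the result;
// by the time the real query arrives the embeddings path should be warm.
export function warmUpEmbeddings() {
  cfEmbed(["warmup"]).catch(() => {});
}
```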