Score writing 0-100 on how closely it matches Steve's voice. 90+ = almost
certainly Steve.
How to score: Rate each positive category, apply purity deductions, sum.
Every score must cite at least one direct quote from the text being evaluated.
Scoring Table
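| # | Category                                           | Max |
|---|----------------------------------------------------|-----|
| 1 | Voice & Emotional Register                         | 20  |
| 2 | Structure & Architecture                           | 15  |
| 3 | Sentence-Level Rhythm                              | 15  |
| 4 | Lexicon & Diction                                  | 18  |
| 5 | Metaphor, Analogy & Thought Experiments            | 11  |
| 6 | Punctuation & Formatting                           | 5   |
| 7 | Purity Buffer (starts at 16, deduct for red flags) | 16  |
|   | TOTAL                                              | 100 |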
Bands: 90-100 almost certainly Steve | 75-89 strong Steve energy | 60-74
Steve-adjacent | 40-59 generic tech writing | 0-39 not Steve
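The table and bands above reduce to a small lookup. A minimal sketch, assuming integer totals; the names `CATEGORY_MAX`, `PURITY_START`, and `band` are illustrative and not part of the rubric:

```python
# Maximum points per positive category, copied from the scoring table.
CATEGORY_MAX = {
    "Voice & Emotional Register": 20,
    "Structure & Architecture": 15,
    "Sentence-Level Rhythm": 15,
    "Lexicon & Diction": 18,
    "Metaphor, Analogy & Thought Experiments": 11,
    "Punctuation & Formatting": 5,
}
PURITY_START = 16  # the purity buffer starts full and only loses points


def band(score: int) -> str:
    """Map a 0-100 total to the band label used in the rubric."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "almost certainly Steve"
    if score >= 75:
        return "strong Steve energy"
    if score >= 60:
        return "Steve-adjacent"
    if score >= 40:
        return "generic tech writing"
    return "not Steve"
```

The positive categories sum to 84, and adding the full purity buffer gives the 100-point ceiling.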
Positive Categories (84 pts)
1. Voice & Emotional Register (20 pts)
Highest weight — tone is the most diagnostic marker and hardest to fake.
Earnest enthusiasm (0-4): Genuine excitement about ideas and tools.
Exclamation marks that feel earned, sometimes doubled.
Warmth without cynicism (0-5): Positive framing even when critical. Agrees
first, then diverges. Zero ironic distance.
Conversational directness (0-5): Heavy "I"/"you." Meta-commentary on his
own argument ("I say all this to explain..."). Rhetorical questions that
engage the reader ("How many times have you...?").
Self-deprecating humor (0-2): Warm and brief, not cutting or fishing for
reassurance. Not every piece needs it.
2. Structure & Architecture (15 pts)
Opening move (0-4): Starts with concrete scene or anecdote, not abstract
thesis.
Paragraph economy (0-3): Short paragraphs (1-4 sentences). Single-sentence
paragraphs for emphasis.
List integration (0-3): Fluid prose-to-list transitions. TL;DR at top for
product posts.
Closing move (0-3): Rallying cry, humble reflection, or personal CTA
("shoot me a note"). Never thesis-restated.
Headers (0-2): Conversational, not academic.
3. Sentence-Level Rhythm (15 pts)
Rhythmic variation (0-4): Short punchy sentences mixed with long
breathless ones in the same piece.
Short declarative as base (0-3): Default sentence is short, direct,
subject-verb-object.
Run-on energy (0-4): Comma chains, "and" conjunctions, tumbling-forward
excitement that builds toward a point.
Restating/refinement (0-4): "Put another way," "In other words," "This is
all to say," "Or even better" — re-approaches ideas from new angles.
4. Lexicon & Diction (18 pts)
Signature vocabulary (0-4): "folks," "ship"/"shipping," "delightful,"
"fun," "jam on," "shoot me a note," "tbh," "at the end of the day," "the dream
is...," "super" (intensifier), "jazzed," "kooky," "hackable," "pay it
forward." For pre-2020 pieces, weight presence of core vocabulary ("fun," "at
the end of the day," colloquial intensifiers) over later-coined terms.
Colloquial register in professional context (0-3): "Woo!," :) text
emoticons (never Unicode emoji), casual language in substantive writing.
Specificity & name-dropping (0-4): Names real people, books, tools,
specific conversations. Ideas always grounded in particulars.
Coinage (0-2): Coins new terms or phrases (e.g., "end-programmer
programming," "catching stars," "the Lawyer Flippening"). Naming things is a
core Steve move.
Structural reframing (0-2): Reframes familiar categories in a new light
(e.g., seeing law firms as model routers, recasting "learning to code" as
"learning to think"). Includes recursive/self-referential conceptual play.
Distinct from coinage -- this is about seeing old things through new lenses.
Intellectual references (0-3): Cites specific thinkers to enable
intellectual ambition. Steve makes bold claims but attributes them -- he
reaches through others rather than asserting on his own authority. Strong
Steve signals include references to: Simon Willison, Bret Victor, Seymour
Papert, Alan Kay, Paul Graham, Henrik Karlsson, Bertrand Russell, Edsger
Dijkstra. Presence of 1-2 from this constellation (or similar
caliber thinkers cited earnestly) scores 2; deep engagement with a thinker's
ideas scores 3. Not every piece needs this -- score 1 if absent (neutral).
5. Metaphor, Analogy & Thought Experiments (11 pts)
If no metaphors are present, score 4/8 on the first two sub-criteria (neutral);
not every piece needs them.
Presence & quality (0-4): Functional/explanatory, not decorative. Often
cross-domain. (2/4 if absent.)
Commitment to metaphor (0-4): Extends and develops rather than drops after
one mention. (2/4 if absent.)
Extended thought experiments (0-3): Steve loves setting up a hypothetical
scenario and watching what cascades out of it (e.g., "imagine an LLM with a
$100k stock account..." or "what if every kid had a LOGO turtle..."). This is
intellectual play — distinct from metaphor. He poses a "what if," then
genuinely explores the implications across multiple sentences or paragraphs.
Score 2 for a brief hypothetical; 3 for a sustained, multi-step thought
experiment. If absent, score 1 (neutral).
6. Punctuation & Formatting (5 pts)
Score based on what's present, not what's absent. If dashes/exclamation marks
don't appear, score is neutral (not penalized) — only wrong usage loses
points.
Dashes (0-2): Steve uses spaced n-dashes ( – ) only. Correct usage scores
2. No dashes = 1 (neutral). Em-dashes or touching dashes = 0 (see also
purity buffer).
Parenthetical asides (0-1): Conversational side-comments to the reader.
Exclamation marks & bold (0-1): If present, are they genuine/earned? Bold
for key terms?
Oxford comma (0-1)
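The dash sub-criterion above is mechanical enough to check in code. A rough sketch, assuming plain-text input; `find_bad_dashes` and `dash_score` are hypothetical helper names. It does not exclude dashes inside direct quotations or poem attributions, so those still need a manual pass, and an n-dash at the very start or end of the text is treated as touching.

```python
import re

EM_DASH = "\u2014"  # never acceptable in the author's own prose
EN_DASH = "\u2013"  # acceptable only with whitespace on both sides

# Any em-dash, or an n-dash missing whitespace on either side.
BAD_DASH = re.compile(f"{EM_DASH}|(?<!\\s){EN_DASH}|{EN_DASH}(?!\\s)")


def find_bad_dashes(text: str) -> list[int]:
    """Return the character offset of every offending dash."""
    return [m.start() for m in BAD_DASH.finditer(text)]


def dash_score(text: str) -> int:
    """Score the Dashes sub-criterion: 0 = wrong usage, 1 = no dashes, 2 = correct."""
    if find_bad_dashes(text):
        return 0
    return 2 if EN_DASH in text else 1
```

Ordinary hyphens in compound words are ignored; only em-dashes and n-dashes are inspected.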
Purity Buffer (starts at 16, deduct for red flags)
Instant fail — Em-dashes or touching dashes (-14)
Em-dashes (—) or n-dashes without spaces on both sides in the author's own
prose. Steve ONLY uses spaced n-dashes: word – aside – word. Any other dash
style is not Steve. Dashes in direct quotations or poem attributions are
excluded.
Sarcasm, snark, ironic distance, passive aggression, cynicism about
competitors
Distancing hedges: "it could be argued that," "one might suggest," impersonal
third person throughout. (Note: epistemic humility — "often," "in my
experience," "I think" — is NOT a penalty. Only penalize language that
distances the author from their own claims.)
Tier 2 — Moderate (-3 each, max -12)
"IMO"/"IMHO"/"FWIW"/"AFAIK" (Steve uses "tbh" but not these)
Unicode emoji overuse (occasional ironic/humorous use is fine; Steve uses :)
as default)
Buzzword stacking without grounding in specifics
Thesis-restated conclusion ("In conclusion, as we have seen...")
Pretension: unjustified elevated language that obscures rather than
illuminates. (Note: intellectual ambition — bold conceptual claims, earnest
philosophical reach — is NOT pretension. Steve regularly makes big claims
sincerely. Only penalize when elevated language serves to sound impressive
rather than to communicate.)
Forced Steve-isms: signature vocab clustering in one paragraph, performed
vulnerability, generic over-enthusiasm
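The buffer arithmetic can be sketched as follows. Two assumptions are not stated in the rubric: the buffer is floored at 0, and red flags listed without a point value are passed in by the evaluator through the hypothetical `other_deductions` parameter.

```python
PURITY_START = 16
DASH_PENALTY = 14       # em-dashes or touching dashes
TIER2_PENALTY = 3       # per Tier 2 red flag
TIER2_CAP = 12          # maximum total Tier 2 deduction


def purity_buffer(bad_dashes: bool, tier2_flags: int, other_deductions: int = 0) -> int:
    """Return the points remaining in the purity buffer.

    other_deductions is an assumption: it covers red flags whose point
    value the rubric does not spell out.
    """
    if tier2_flags < 0 or other_deductions < 0:
        raise ValueError("flag counts and deductions must be non-negative")
    deduction = DASH_PENALTY if bad_dashes else 0
    deduction += min(TIER2_PENALTY * tier2_flags, TIER2_CAP)
    deduction += other_deductions
    # Assumption: the buffer cannot go negative.
    return max(PURITY_START - deduction, 0)
```

For example, a piece with clean dashes and two Tier 2 flags keeps 10 of the 16 buffer points.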
1. Score each sub-criterion with at least one supporting quote from the
   text. No quote = score 0.
2. Apply purity deductions, quoting each offending passage.
3. Sum and assign a band.
4. Write a 2-3 sentence summary of what most contributed to or detracted from
   the score.
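The quote requirement and the final sum can be sketched like this. `SubScore` and `total` are illustrative names, not part of the rubric, and the neutral baselines for absent features (for example 2/4 when no metaphor appears) are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class SubScore:
    name: str
    points: int
    max_points: int
    quote: str = ""  # direct quote from the evaluated text

    def effective(self) -> int:
        """Points that count: 0 without a quote, otherwise clamped to the allowed range."""
        if not self.quote.strip():
            return 0
        return max(0, min(self.points, self.max_points))


def total(sub_scores: list[SubScore], purity: int) -> int:
    """Sum the quoted sub-scores and add what is left of the purity buffer."""
    return sum(s.effective() for s in sub_scores) + purity
```

Clamping each sub-score to its maximum keeps a stray over-award from pushing a category past its cap.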
Format awareness: Different formats have different scoring ceilings. HN
comments won't have headers or lists — score structure on paragraph economy and
opening/closing only. Product posts may open with TL;DR. Advice/listicle posts
are naturally more aphoristic (lower run-on energy is expected). Manifestos may
have elevated register without being pretentious. For short pieces, evaluate
marker density (per paragraph) not raw count. Metaphor scores use the 4/8
neutral baseline for short formats. Thought experiment scores use the 1/3
neutral baseline.
Uncanny valley: If a piece hits many markers but feels off — vocab
clustering, performed vulnerability, generic excitement — apply -3 to -5 under
purity buffer (Tier 2).