The personal site of Steve Krouse, stevekrouse.com
2/23/2026
intelligence.md

Intelligence Buying Intelligence

Nnamdi Iregbulem recently wrote about how users overwhelmingly stick with the most expensive and slowest closed models (i.e. Opus 4.6) despite cheaper and faster OSS alternatives. Reading it, I had this nagging feeling: why aren't we using cheaper models? Sonnet 4.6 is right there – faster, almost 2x cheaper, and about as good as Opus 4.5, which we were all obsessed with five minutes ago – not to mention OSS models, which are 10x cheaper. It's crazy that I'm always reaching for the slowest and most expensive model!

For me, there's extra cognitive dissonance, because not only do I care (like everyone else) about saving money, but I care especially about being in a flow state, and waiting longer for a maybe-smarter response is the opposite of that. Back when Sonnet 3.5 was the best model, I couldn't wait to get my hands on something as smart as it but 10x or 100x faster and cheaper. Now we have that, yet I never reach for it. Instead I daydream of Opus 4.6 at Taalas's 17k tokens-per-second (Opus 4.6 runs at ~70 tps, so 250x slower) – yet I'm realizing that by the time we get there, say in a year or two, we may not even want it vs Opus 6.6.

Here's my best explanation for why we keep reaching for the most expensive and slow models, even when cheaper and faster ones are available. There are four forces at play that make intelligence a weird thing to buy:

  1. The returns to intelligence are nonlinear.
  2. We don't have good measures of intelligence.
  3. Intelligence is competitive.
  4. You can always turn more capital into more intelligence.

1. The returns to intelligence are nonlinear

There's a new question tripping up most LLMs right now. It's the new "how many r's in strawberry":

The car wash is 50 meters away. Do you walk or drive?

Sonnet 4.6 and ChatGPT 5.2 tell you to walk.

Opus 4.6 is the only model I know of that correctly tells you to drive ("because you'll need the car"). Once you've seen a model make a mistake like that, it's really hard to look it in the eyes and trust it with anything important.

This is because returns to intelligence are spiky and discontinuous. A tiny increase in intelligence can unlock a perspective that simply doesn't exist at the level below it. The losses from insufficient intelligence are basically unbounded: someone who is fundamentally confused in one seemingly small way can end up in the catastrophically wrong place. And the converse is true too. Intelligence builds on itself. One correct insight early on compounds into a totally different trajectory.

Consider Warren Buffett as the poster child for this. He has a few key insights that are basically invisible to most people, but those insights compound into a completely different trajectory. The difference between Buffett and the average investor may be a 0.001% difference in intelligence, but that can lead to massively different outcomes.

Intelligence isn't about exerting more effort. It's about seeing what's right in front of you more clearly: finding the shortcut, the hidden door, the structural simplification that was invisible a moment ago.

It's also about course-correcting as fast as possible when you realize you're on the wrong path. Feedback loops dominate. Learning compounds.

Intelligence might actually be unique in its nonlinearity. In a David Deutsch sense, it's the only process that creates new explanatory knowledge, the creation of which is inherently unpredictable.

2. We don't have good measures of intelligence

And here's what makes this extra tricky: we don't have good evals.

This is true for human evals too. The SAT only goes up to 1600. In a room full of 1600s, you'd have future Nobel laureates sitting next to future deadbeats. And many Nobel laureates, future self-made billionaires, and generation-defining artists don't even get 1600s! Our tests literally don't measure the intelligence that matters most.

LLMs keep exceeding our benchmarks, so we keep making harder ones. We're finally taking teaching and measuring intelligence seriously. Imagine if we did the same with humans: "Oh damn, 40% of kids are getting 1600s these days, gotta make SATs harder!"

Without great evals, we default to vibes, reputation, brand loyalty, and price signals. Which, honestly, is rational when you can't measure what matters.

The best eval I heard about recently is Demis Hassabis's "Einstein test": train an LLM on all human knowledge up to 1911, then see if it can independently discover general relativity. We need more evals like that.

3. Intelligence is competitive

There's a third reason to always reach for the most expensive intelligence: in many domains, intelligence is a race.

I used to think that once we get models as smart as my CTO, Tom MacWright, we can basically stop there and then work on compressing that intelligence down to 8b parameters and making it faster and cheaper. Give me a country of Toms in a datacenter.

But in some domains, even if a model (or model + harness) is 0.001% smarter, that could be the difference between winning and losing. In a competitive domain, you don't just need to be good enough. You need to be better than everyone else.

My dad likes to tell the story of two hikers who see a bear. One hiker starts running, and the other says "you can't outrun a bear!" The first hiker says:

"I don't have to outrun the bear, I just have to outrun you."

If someone could give you a consistent edge in the stock market, even a small one, that edge could be worth a trillion dollars. But only if nobody else has it. The moment everyone has the same insight, it's priced in and worthless. For the stock market, you don't just need intelligence – you need more intelligence than everyone else.

I'm not sure to what extent this is true in startups – classic YC advice is to ignore competition. But in the attention economy, you definitely need to be better than everyone else. If your content is 0.01% better than the next best thing, that could be the difference between going viral and being ignored.

4. You can always turn more capital into more intelligence

Right now the most expensive models aren't actually that expensive. A few bucks per conversation. But that's changing fast.

We keep finding ways to spend more money for more intelligence: longer chains of thought, parallel reasoning, models that loop and self-correct for hours instead of seconds. Each of these multiplies the compute bill – on purpose.

There's a great example of this called the Ralph Wiggum loop, coined by Geoffrey Huntley. The idea is simple: keep running an AI agent in a loop, feeding failures back in, until it succeeds. You're literally turning compute dollars into results through brute-force persistence. It's not elegant, but it works – and it means that if the stakes are high enough, you can always throw more money at a problem.
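The loop is simple enough to sketch. In the snippet below, `runAgent` is a hypothetical stand-in for a real agent call (here it only "succeeds" once enough failure feedback has accumulated in the prompt) – the loop structure, not the agent, is the point:

```typescript
type AgentResult = { ok: boolean; output: string };

// Hypothetical stand-in for a real LLM agent call. It fails until the
// prompt contains at least two rounds of failure feedback.
function runAgent(prompt: string): AgentResult {
  const attempts = (prompt.match(/Previous failure/g) ?? []).length;
  return attempts >= 2
    ? { ok: true, output: "done" }
    : { ok: false, output: `error on attempt ${attempts + 1}` };
}

// The Ralph Wiggum loop: run the agent, feed each failure back into the
// prompt, and retry until it succeeds or the attempt budget runs out.
function ralphWiggumLoop(task: string, maxAttempts = 10): string | null {
  let prompt = task;
  for (let i = 0; i < maxAttempts; i++) {
    const result = runAgent(prompt);
    if (result.ok) return result.output;
    // Brute-force persistence: append the failure and try again.
    prompt += `\nPrevious failure: ${result.output}`;
  }
  return null; // budget exhausted
}
```

Every iteration costs another inference call, which is exactly the "compute dollars into results" trade: the only real knob is `maxAttempts`, i.e. how much money you're willing to burn.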

We're not just going to make intelligence cheaper. We're going to make it more expensive on purpose – because we'll keep finding ways to spend more compute on harder problems. The ceiling on what you can pay for intelligence is going to go way up. And if the returns are nonlinear, people will pay it. Gladly.

Nobody ever went bankrupt spending too much on good counsel

A very-well-paid consigliere once told me:

Nobody ever went bankrupt spending too much on good counsel, but plenty did trying to save on it.

Why are lawyers so expensive? Why doesn't competition drive down the price?

Because all four forces are at play.

  1. The returns are nonlinear: you're paying for the one sentence that changes the contract, the one clause that closes a loophole.

  2. You can't easily measure the quality of legal counsel until you end up in court, and by then it's too late. So you rely on vibes, reputation, and price signals.

  3. It's directly competitive – you need counsel at least as good as the other side's.

  4. You can always throw more intelligence at the problem – more associates, more partners, outside specialists. There's basically no ceiling.

Put another way: you're not paying for the hours. You're paying for the possibility of unlocking a secret.

Once you have this perspective, you realize that law firms already run a model router internally. When you email your law firm, they route your request to the best-priced and fastest (i.e. someone with available capacity) intelligence for your problem. Simple incorporation? Junior associate. Routine contract review? Mid-level. Bet-the-company litigation? Senior partner.
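A model router is the same idea in code. This is a minimal sketch where the tiers and the complexity thresholds are illustrative assumptions, not any firm's (or Val Town's) actual routing logic:

```typescript
type Tier = "junior associate" | "mid-level" | "senior partner";

// Route each request to the cheapest intelligence that can handle it.
// The 1-10 complexity score and the cutoffs are made up for illustration.
function routeRequest(complexity: number): Tier {
  if (complexity <= 3) return "junior associate"; // simple incorporation
  if (complexity <= 7) return "mid-level";        // routine contract review
  return "senior partner";                        // bet-the-company litigation
}
```

Swap the tiers for model names (a small OSS model, Sonnet, Opus) and this is exactly what an LLM model router does.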

The flippening

When a new tool is weaker than you, you use it like a full-time employee. You give it bounded tasks. You double-check its work. You keep the real decisions for yourself. That's how we use LLMs today, and it works great.

But this only makes sense in a world where intelligence is both cheaper and worse than yours. And that world is temporary.

Now imagine models + harnesses that are smarter than you. Not just faster, not just more encyclopedic, but actually better at reasoning, better at synthesis, better at seeing structure.

As discussed above, they've also become more expensive and slower than you. You won't treat them like interns anymore. You have to treat them like partners. You have to give them only the hardest problems, the ones that actually require intelligence, and let them do their thing.

I've been thinking about this as the Lawyer Flippening.

Right now most people use AI like middle managers. Tools like Devin spin up sub-agents in parallel. Gastown orchestrates 20-30 Claude Code instances on the same codebase. You assign tickets, let them grind, check their work, assign more tickets.

But the future looks different. The frontier model will be the most expensive "employee" in the room, even more expensive and slower than you, the human. This super expensive model will be sitting at the top of the pyramid, with a swarm of cheaper agents – and humans – beneath it.

This is kinda how companies already work. The CEO isn't even the top. The CEO is just the highest-paid full-time employee. Above them are all the sources of intelligence you can't afford at full-time rates, but that are still crucial: consultants, advisors, board members, investors.

As the CEO of Val Town, a small startup, I try to push work towards the intelligence that can do it most efficiently (and honestly, joyfully, because these are humans, not clankers), whether that's on my team or among our advisors, lawyers, investors, etc.

Structurally, the same will happen as LLM intelligence gets better. We'll all want partner-level intelligence, but because it's so expensive, we'll have to route work to it strategically. The model router will be more important than ever.

Compete on the edge, collaborate on everything else

Ok but here's the thing – sometimes the smartest move isn't to reach for more intelligence. It's to share intelligence. Standardize it, make it free. And it's not charity. It's economic self-interest.

Every dollar of intelligence you waste on solved problems is a dollar you can't spend on unsolved ones. The YC SAFE is the perfect example – it saves oh-so-expensive partner time at Gunderson for every startup that uses it. YC gave it away for free and everyone got richer, including YC. Same with C-corps as a legal structure, HTTP, TCP/IP, open source software. These are all cases where someone compressed expensive reasoning into a free protocol, and the whole economy benefited. Open protocols are intelligence savings that compound across an entire industry.

This gives you a sharp heuristic: compete on the intelligence that gives you an edge, collaborate on everything else.

It's the model router again, but at an industry level – route the solved problems to free protocols, save the expensive intelligence for what actually matters. This is actually a strong economic argument for supporting open source and open protocols. Not out of idealism, but because standardizing the plumbing frees up intelligence for the work that actually differentiates, making us all richer.

How much should you spend on thinking?

Here's a thought experiment I can't stop turning over. Imagine giving an LLM a $100k stock market account. Its only job: earn enough to pay for its own inference and generate a return. It's basically a hedge fund of one. And suddenly it's facing the exact optimization problem we all face: how much of your capital do you burn on intelligence vs. keep as returns?

Hedge funds charge "2 and 20" – a 2% management fee and 20% of profits. Should our LLM hedge fund do the same, allocating 2% of its assets under management to its own inference? Is that actually optimal, or just convention?
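To make the 2-and-20 framing concrete, here's the toy arithmetic; the $15-per-million-token price is purely an illustrative assumption, not any provider's actual rate:

```typescript
// Dollars per year the fund can spend on its own thinking,
// if inference is capped at the management fee.
function inferenceBudget(aum: number, feeRate = 0.02): number {
  return aum * feeRate;
}

// How many tokens that budget buys at a given price per million tokens.
function affordableTokens(
  aum: number,
  feeRate: number,
  pricePerMTok: number, // assumed, e.g. $15 per million output tokens
): number {
  return (inferenceBudget(aum, feeRate) / pricePerMTok) * 1_000_000;
}

// On the $100k account: a $2,000/year inference budget,
// or roughly 133M tokens/year at the assumed price.
const budget = inferenceBudget(100_000);
const tokens = affordableTokens(100_000, 0.02, 15);
```

The interesting question is whether 2% is the right `feeRate` at all, or whether a fund whose returns scale nonlinearly with intelligence should spend far more.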

What is the optimal ratio of sharpening the axe to chopping wood? How much should you spend on thinking vs. doing?

The returns to intelligence are so nonlinear that investing in meta-work – better tools, better systems, better agent harnesses – can feel incredibly justified. And sometimes it is justified. But if you take it too seriously, you enter a cognitive black hole where you spend all your time becoming smarter for an application stage that never comes. It's yak-shaving all the way down.

This is exactly why the hedge fund framing is useful. Our LLM hedge fund can't yak-shave forever, because its inference costs are real and its returns are measurable. Every hour it spends upgrading itself is an hour it's not earning. That pricing mechanism – having to weigh the cost of improving your own intelligence against the opportunity cost of earning with the intelligence you already have – is exactly what's missing when a human disappears into tool configuration for weeks. Nobody is billing you for your own thinking time, so there's no natural brake.

But here's the thing – sometimes sharpening the axe is the point. Douglas Engelbart spent his entire career on this. He called it "bootstrapping" – using your current tools to build better tools, which you then use to build even better tools. It's the original intelligence-buying-intelligence loop that kicked off the information age. The highest-leverage use of intelligence is building tools that make intelligence more effective.

This is what I've spent my career on. Val Town is a tool for making tools: a platform where you use code to build more code, where the infrastructure for thinking is itself the product. The whole "tools for thought" tradition is really just the tradition of intelligence investing in itself at the societal level. (Not every company in the world could or should be in the tools for thought space, but some meaningful percentage should!)

Here's the fun part: watch what this LLM hedge fund would independently re-invent.

It would want more money under management to amortize the fixed costs of its own thinking, so it reinvents fundraising: managing other people's money too and collecting more management fees.

It has to decide what insights to share vs. gate-keep – it reinvents proprietary research vs marketing.

It might want to recruit other LLMs or humans for specialized tasks – it reinvents the firm.

Intelligence generates capital. Capital buys more intelligence. More intelligence generates more capital. The loop compounds.

This is the big loop that Anthropic and OpenAI and all frontier labs are running right now: use intelligence to build more intelligence, use the resulting capital to buy more compute, build even more intelligence.

Sam Altman has said that intelligence and energy are the two assets that matter, and really intelligence is primary because intelligence is what gets you more energy. In this system, capital is just a proxy for energy. The more capital you have, the more energy you can buy, and the more intelligence you can build. The intelligence-capital loop is the fundamental economic engine of our time.

I love that with this single constraint – the LLMs gotta eat – you can see how all institutions we've already built (hedge funds, law firms, startups, venture capital) are just different shapes this loop naturally takes.

The bigger thing

I started this essay trying to explain a narrow personal puzzle: why do I keep reaching for Opus when Sonnet is right there? I did not expect to end up staring at a recursive loop between intelligence and capital that seems to explain... most of the institutions we've built? Law firms, hedge funds, venture capital, open source.

They're not arbitrary. They're the shapes that intelligence-capital optimization naturally takes. An LLM with a bank account would seemingly reinvent all of them from scratch.

But here's what I can't stop thinking about. "How much should you spend on intelligence?" is itself an intelligence question. You need intelligence to answer it. And the answer changes as you get smarter. The target moves every time you move toward it.
