Intelligence Buying Intelligence

Nnamdi Iregbulem recently wrote about how users stick with expensive and slow closed models (e.g., Opus 4.6) despite cheaper OSS alternatives.

Why aren't we using cheaper models?

Sonnet 4.6 is right there, faster, almost 2x cheaper, and about as good as Opus 4.5, which we were all obsessed with five minutes ago – not to mention OSS models which are 10x cheaper. It's crazy that I'm always reaching for the slowest and most expensive model!

For me, there's extra cognitive dissonance, because not only do I care (like everyone else) about saving money, but I care especially about being in a flow state. Waiting longer for a maybe-smarter response is the opposite of that. Back when Sonnet 3.5 was the best model, I couldn't wait to get my hands on something as smart as it but 10x faster and cheaper. Now we have that. Yet I never reach for it.

Now, I daydream of Opus 4.6 at Taalas's 17k tokens-per-second (Opus 4.6 currently runs at ~70 tps, roughly 250x slower), yet I have to wonder: by the time we get there – say in a year or two – will we even want to use it vs Opus 6.6?

Here's my best explanation for why we keep reaching for the most expensive and slow models, even when cheaper and faster ones are available. There are four forces at play that make intelligence a weird thing to buy:

  1. The returns to intelligence are nonlinear.
  2. Intelligence is hard to measure.
  3. Intelligence is competitive.
  4. Intelligence is a commodity. (You can always turn incremental capital into incremental intelligence.)

1. The returns to intelligence are nonlinear

There's a new question tripping up most LLMs right now. It's the new "how many r's in strawberry":

The car wash is 50 meters away. Do you walk or drive?

Sonnet 4.6 and ChatGPT 5.2 tell you to walk.

Opus 4.6 is the only model I know of that correctly tells you to drive ("because you'll need the car"). Once you've seen a model make a mistake like that, it's really hard to look it in the eyes and trust it with anything important.

This is because we know deep down that returns to intelligence are spiky and discontinuous. A tiny increase in intelligence can unlock a perspective that simply doesn't exist at the level below it. The losses from insufficient intelligence are basically unbounded: someone who is fundamentally confused in one seemingly small way can end up in the catastrophically wrong place. And the converse is true too. Intelligence builds on itself. One correct insight early on compounds into a totally different trajectory.

Consider Warren Buffett as the poster child for this. He has a few key insights that are basically invisible to most people, but those insights compound into a completely different trajectory. The difference between Buffett and the average investor may be a 0.001% difference in intelligence, but that can lead to massively different outcomes.

Intelligence isn't about exerting more effort. It's about seeing what's right in front of you more clearly: finding the shortcut, the hidden door, the structural simplification that was invisible a moment ago.

It's also about course-correcting as fast as possible when you realize you're on the wrong path. Feedback loops dominate. Learning compounds.

Intelligence might actually be unique in its nonlinearity. In a David Deutsch sense, it's the only process that creates new explanatory knowledge, the creation of which is inherently unpredictable.

2. Intelligence is hard to measure

And here's what makes this extra tricky: we don't have good evals.

This is true for human evals too: the SAT only goes up to 1600. If you had a room full of 1600s, you'd have future Nobel laureates sitting next to future deadbeats. And many Nobel laureates, future self-made billionaires, and generation-defining artists don't even get 1600s! Our tests literally don't measure the intelligence that matters most.

LLMs keep exceeding our benchmarks, so we keep making harder ones. We're finally taking the teaching and measuring of intelligence seriously. I wish we were similarly ambitious about human intelligence: "Oh damn, 40% of kids are getting 1600s these days, gotta make the SAT harder."

Without great evals, we default to vibes, reputation, brand loyalty, and price signals. Which, honestly, is rational when you can't measure what matters.

The best eval I heard about recently is Demis Hassabis's "Einstein test": train an LLM on all human knowledge up to 1911, then see if it can independently discover general relativity. We need more evals like that.

3. Intelligence is competitive

There's a third reason to always reach for the most expensive intelligence: in many domains, intelligence is a race.

I used to think that once we get models as smart as my CTO, Tom MacWright, we can basically stop there and then work on compressing that intelligence down to 8b parameters and making it faster and cheaper. Give me a country of Toms in a datacenter.

But in some domains, even if a model (or model + harness) is 0.001% smarter, that could be the difference between winning and losing. In a competitive domain, you don't just need to be good enough. You need to be better than everyone else.

My dad likes to tell the story of two hikers who see a bear. One hiker starts running, and the other says "you can't outrun a bear!" The first hiker says:

"I don't have to outrun the bear, I just have to outrun you."

If someone could give you a consistent edge in the stock market, even a small one, that edge could be worth a trillion dollars. But only if nobody else has it. The moment everyone has the same insight, it's priced in and worthless. For the stock market, you don't just need intelligence – you need more intelligence than everyone else.

I'm not sure to what extent this is true in startups – classic YC advice is to ignore competition. But in the attention economy, you definitely need to be better than everyone else. If your content is 0.01% better than the next best thing, that could be the difference between going viral and being ignored.

4. You can always spend incremental dollars to get incremental intelligence

Right now the most expensive models aren't actually that expensive. A few bucks per conversation. But that's changing fast.

We keep finding ways to spend more money for more intelligence: longer chains of thought, parallel reasoning, models that loop and self-correct for hours instead of seconds. Each of these multiplies the compute bill – on purpose.

There's a great example of this called the Ralph Wiggum loop, coined by Geoffrey Huntley. The idea is simple: keep running an AI agent in a loop, feeding failures back in, until it succeeds. You're literally turning compute dollars into results through brute-force persistence. It's not elegant, but it works – and it means that if the stakes are high enough, you can always throw more money at a problem.
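
To make the idea concrete, here's a minimal sketch of that kind of loop in TypeScript. The `agent` and `verify` functions are hypothetical stand-ins (not any particular API), and the attempt cap is the knob that turns dollars into persistence:

```ts
type Attempt = { passed: boolean; error?: string };

// A sketch of the Ralph-Wiggum-style loop: keep retrying, feeding each
// failure back into the prompt, until the work passes or the budget runs out.
async function ralphWiggumLoop(
  task: string,
  agent: (prompt: string) => Promise<string>,   // hypothetical: call whatever model/harness you use
  verify: (output: string) => Promise<Attempt>, // hypothetical: tests, a linter, a judge model, etc.
  maxAttempts = 50,                              // the budget – every retry is another inference bill
): Promise<string> {
  let feedback = "";
  for (let i = 0; i < maxAttempts; i++) {
    // Feed the previous failure back in so the agent can course-correct.
    const prompt = feedback ? `${task}\n\nPrevious attempt failed:\n${feedback}` : task;
    const output = await agent(prompt);
    const result = await verify(output);
    if (result.passed) return output; // brute-force persistence paid off
    feedback = result.error ?? "failed";
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
}
```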

We're not just going to make intelligence cheaper. We're going to make it more expensive on purpose – because we'll keep finding ways to spend more compute on harder problems. The ceiling on what you can pay for intelligence is going to go way up. And if the returns are nonlinear, people will pay it. Gladly.

Nobody ever went bankrupt spending too much on good counsel

A very well-paid consigliere once told me:

Nobody ever went bankrupt spending too much on good counsel, but plenty did trying to save on it.

Why are lawyers so expensive? Why doesn't competition drive down the price?

Because all four forces are at play.

  1. The returns are nonlinear: you're paying for the one sentence that changes the contract, the one clause that closes a loophole.

  2. It's hard to measure the quality of legal counsel until you end up in court, and by then it's too late. So you rely on vibes, reputation, and price signals.

  3. It's directly competitive – you need counsel at least as good as the other side's.

  4. Legal intelligence is a commodity – you can always throw more at the problem: more associates, more partners, outside specialists. There's basically no ceiling.

Put another way: you're not paying for the hours. You're paying for the possibility of unlocking a secret.

Once you have this perspective, you realize that law firms already run a model router internally. When you email your law firm, they route your request to the best-priced and fastest (i.e., someone with available capacity) intelligence for your problem. Simple incorporation? Junior associate. Routine contract review? Mid-level. Bet-the-company litigation? Senior partner.
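
Here's a rough sketch of that routing logic, with made-up model names and a hypothetical `classifyDifficulty` step standing in for however you'd actually triage requests:

```ts
type Tier = "junior" | "mid" | "partner";

// Hypothetical model names – stand-ins for whatever cheap/mid/frontier models you use.
const MODEL_FOR_TIER: Record<Tier, string> = {
  junior: "small-fast-model",  // simple incorporation
  mid: "mid-size-model",       // routine contract review
  partner: "frontier-model",   // bet-the-company litigation
};

// Route each request to the cheapest tier that can handle it.
function routeRequest(
  request: string,
  classifyDifficulty: (r: string) => Tier, // the hard part: knowing what's routine
): string {
  return MODEL_FOR_TIER[classifyDifficulty(request)];
}
```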

The flippening

When a new tool is weaker than you, you use it like a full-time employee. You give it bounded tasks. You double-check its work. You keep the real decisions for yourself. That's how we use LLMs today, and it works great.

But this only makes sense in a world where intelligence is both cheaper and worse than yours. And that world is temporary.

Now imagine models + harnesses that are smarter than you. Not just faster, not just more encyclopedic, but actually better at reasoning, better at synthesis, better at seeing structure.

As discussed above, they've also become more expensive and slower than you. You won't treat them like interns anymore. You have to treat them like partners. You have to give them only the hardest problems, the ones that actually require intelligence, and let them do their thing.

I've been thinking about this as the Lawyer Flippening.

Right now most people use AI like middle managers. Tools like Devin spin up sub-agents in parallel. Gastown orchestrates 20-30 Claude Code instances on the same codebase. You assign tickets, let them grind, check their work, assign more tickets.

But the future looks different. The frontier model will be the most expensive "employee" in the room, even more expensive and slower than you, the human. This super expensive model will be sitting at the top of the pyramid, with a swarm of cheaper agents – and humans – beneath it.

This is kinda how companies already work. The CEO isn't even the top. The CEO is just the highest-paid full-time employee. Above them are all the sources of intelligence you can't afford at full-time rates, but that are still crucial: consultants, advisors, board members, investors.

As the CEO of Val Town, a small startup, I try to push work towards the intelligence that can do it most efficiently (and honestly, joyfully, because these are humans, not clankers), whether that's someone on my team or one of our advisors, lawyers, investors, etc.

Structurally, this will be true as LLM intelligence gets better. We'll all want partner-level intelligence, but because it's so expensive, we'll have to route work to it strategically. The model router will be more important than ever.

The LLM that pays for its own inference: how much should you spend on intelligence?

Here's a thought experiment I can't stop turning over. Imagine giving an LLM a $100k bank account. Its only job: earn enough to pay for its own inference and generate a return. It's basically a hedge fund of one. And suddenly it's facing the exact optimization problem we all face: how much of your capital do you burn on intelligence vs. keep as returns?

Hedge funds charge "2 and 20" – a 2% management fee and 20% of profits. Should our LLM hedge fund do the same, allocating 2% of its capital under management to its own inference? But is that actually optimal, or just convention?
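
To make the numbers concrete, here's the back-of-the-envelope version, with a purely illustrative token price:

```ts
const capital = 100_000;                              // the $100k bank account
const managementFeeRate = 0.02;                       // the "2" in 2-and-20
const inferenceBudget = capital * managementFeeRate;  // $2,000/year of thinking

const assumedDollarsPerMTok = 15;                     // assumed price per million tokens – illustrative only
const tokensPerYear = (inferenceBudget / assumedDollarsPerMTok) * 1_000_000;
console.log(`~${Math.round(tokensPerYear / 1e6)}M tokens of thinking per year`);
// Whether 2% is actually the right ratio of thinking to doing is the open question.
```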

What is the optimal ratio of sharpening the axe to chopping wood? How much should you spend on thinking vs. doing?
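
One way to build intuition is a toy model: assume sharpening has diminishing returns and sweep over how long to sharpen. The numbers below are made up; the shape of the trade-off is the point:

```ts
const totalHours = 10;
const baseRate = 1;                                        // wood per hour with a dull axe
const rate = (s: number) => baseRate * (1 + Math.sqrt(s)); // diminishing returns to sharpening

let best = { sharpen: 0, wood: 0 };
for (let s = 0; s <= totalHours; s += 0.5) {
  const wood = rate(s) * (totalHours - s);                 // sharpen for s hours, chop the rest
  if (wood > best.wood) best = { sharpen: s, wood };
}
console.log(best); // under these assumptions, a couple of hours of sharpening is optimal
```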

The returns to intelligence are so nonlinear that investing in meta-work – better tools, better systems, better agent harnesses – can feel incredibly justified. And sometimes it is justified. But if you take it too seriously, you enter a cognitive black hole where you spend all your time becoming smarter for an application stage that never comes. It's yak-shaving all the way down.

Our self-supporting LLM can't yak-shave forever, because its inference costs are real and its returns are measurable. Every hour it spends upgrading itself is an hour it's not earning. That pricing mechanism – having to internalize the cost of your own intelligence, and to weigh improving it against the opportunity cost of earning more with the intelligence you already have – is exactly what's missing when a human disappears into tool configuration for weeks. Nobody is billing you for your own thinking time, so there's no natural brake.

But sometimes sharpening the axe is the point. Douglas Engelbart spent his entire career on this. He called it "bootstrapping": using your current tools to build better tools, which you then use to build even better tools. It's the original intelligence-buying-intelligence loop that kicked off the information age. The highest-leverage use of intelligence is building tools that make intelligence more effective.

This is what I've spent my career on. Like Notion, Val Town is a tool for making tools. The whole "tools for thought" tradition is really just the tradition of intelligence investing in itself at the societal level. (Not every company in the world could or should be in the tools for thought space, but some meaningful percentage should!)

Here's the fun part: watch what this self-supporting LLM would independently re-invent.

It would want more money under management to amortize the fixed costs of its own thinking, so it reinvents fundraising: managing other people's money too, and collecting more management fees.

It has to decide what insights to share vs. gatekeep – it reinvents marketing vs. proprietary research.

It might want to recruit other LLMs or humans for specialized tasks – it reinvents the firm.

Intelligence generates capital. Capital buys more intelligence. More intelligence generates more capital. The loop compounds.

This is the big loop that Anthropic and OpenAI and all the frontier labs are running right now: use capital to build intelligence, use the resulting capital and intelligence to buy more compute, build even more intelligence.

Sam Altman has said that intelligence and energy are the two assets that matter, and really intelligence is primary because intelligence is what gets you more energy. In this system, capital is just a proxy for energy. The more capital you have, the more energy you can buy, and the more intelligence you can build. The intelligence-capital loop is the fundamental economic engine of our time.

I love that with this single constraint – the LLM's gotta eat – you can see how all the institutions we've already built (hedge funds, law firms, startups, venture capital) are just different shapes this loop naturally takes.
