stevekrouse / microgpt

microGPT.ts

A TypeScript port of Karpathy's microGPT — but written in the style of Conal Elliott's denotational design: types first, meanings first, implementation as a consequence.

Karpathy's original microGPT is a single Python file with zero dependencies that trains a GPT and runs inference with it. As he put it: "This is the full algorithmic content of what is needed. Everything else is just efficiency." This port preserves that spirit but raises the level of abstraction, leaning into TypeScript's type system to make the structure of a language model legible, not just the math.

Denotational Design, Applied

Conal Elliott's core idea: give a simple mathematical meaning (denotation) for each type, then define operations as if they work on meanings, not representations. The implementation is free to differ for efficiency, but must be observationally equivalent to the denotation.

Here, the "meanings" are:

| Type | Denotation (meaning) |
| --- | --- |
| Tensor | A shaped array of scalars with attached gradient and backward function, i.e., a node in a computation graph |
| ModelSpec | The what of a transformer: vocab size, dimensions, heads, layers; a pure description with no behavior |
| Model | A triple of (spec, initParams, forward): a model is its specification plus two functions |
| Trained | A triple of (tokenizer, model, params): a frozen snapshot with everything needed to generate |
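
To make the Tensor denotation concrete, here is a minimal sketch of a computation-graph node over flat Float32Array data. The class layout and the names `backwardFn`, `add`, `mul`, and `sum` are assumptions for illustration, not the val's actual API, and only elementwise operations on same-shaped tensors are covered.

```typescript
// Hypothetical minimal Tensor: field and function names are illustrative only.
class Tensor {
  data: Float32Array;
  grad: Float32Array;
  shape: number[];
  parents: Tensor[];
  // Pushes this tensor's grad into its parents' grads; a no-op for leaves.
  backwardFn: () => void = () => {};

  constructor(data: Float32Array, shape: number[], parents: Tensor[] = []) {
    this.data = data;
    this.shape = shape;
    this.parents = parents;
    this.grad = new Float32Array(data.length);
  }

  // Reverse-mode AD: seed this node's grad with 1, then walk the graph
  // in reverse topological order so each node runs after all its consumers.
  backward(): void {
    const order: Tensor[] = [];
    const seen = new Set<Tensor>();
    const visit = (t: Tensor): void => {
      if (seen.has(t)) return;
      seen.add(t);
      t.parents.forEach(visit);
      order.push(t);
    };
    visit(this);
    this.grad.fill(1);
    for (const t of order.reverse()) t.backwardFn();
  }
}

function add(a: Tensor, b: Tensor): Tensor {
  const out = new Tensor(a.data.map((v, i) => v + b.data[i]), a.shape, [a, b]);
  out.backwardFn = () => {
    for (let i = 0; i < out.grad.length; i++) {
      a.grad[i] += out.grad[i];
      b.grad[i] += out.grad[i];
    }
  };
  return out;
}

function mul(a: Tensor, b: Tensor): Tensor {
  const out = new Tensor(a.data.map((v, i) => v * b.data[i]), a.shape, [a, b]);
  out.backwardFn = () => {
    for (let i = 0; i < out.grad.length; i++) {
      a.grad[i] += b.data[i] * out.grad[i];
      b.grad[i] += a.data[i] * out.grad[i];
    }
  };
  return out;
}

function sum(a: Tensor): Tensor {
  const total = a.data.reduce((s, v) => s + v, 0);
  const out = new Tensor(Float32Array.of(total), [], [a]);
  out.backwardFn = () => {
    for (let i = 0; i < a.grad.length; i++) a.grad[i] += out.grad[0];
  };
  return out;
}
```

Calling `sum(add(mul(a, b), a)).backward()` accumulates `b + 1` into `a.grad` and `a` into `b.grad`. Gradients are accumulated with `+=` so that a tensor used in several places receives the sum of its contributions.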

The key move: Model is not a class with hidden state. It's a plain record of functions. makeTransformerLanguageModel takes a ModelSpec and returns a Model — a function from specification to behavior. This is the denotational design pattern: separate the what (spec) from the how (init + forward), and make the connection between them explicit and total.
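
As a rough sketch of what "a plain record of functions" could look like in TypeScript, consider the interfaces below. The field names follow the hyperparameter table, but `vocabSize`, the `Params` type, the `Tokenizer` interface, and the exact signatures of `initParams` and `forward` are assumptions. `makeToyLanguageModel` is a hypothetical stand-in for `makeTransformerLanguageModel`: it has the same spec-in, Model-out shape, but its "network" is only a bigram lookup table.

```typescript
// A pure description: no methods, no hidden state.
interface ModelSpec {
  readonly vocabSize: number;
  readonly dModel: number;
  readonly nHeads: number;
  readonly nLayers: number;
  readonly dFF: number;
  readonly maxLen: number;
}

// Assumed parameter container: named flat arrays.
type Params = Record<string, Float32Array>;

// A model is its specification plus two functions.
interface Model {
  readonly spec: ModelSpec;
  readonly initParams: (seed: number) => Params;
  // Returns next-token logits for the given context (assumed signature).
  readonly forward: (params: Params, tokens: number[]) => Float32Array;
}

interface Tokenizer {
  encode(text: string): number[];
  decode(ids: number[]): string;
}

// A frozen snapshot: everything needed to generate.
interface Trained {
  readonly tokenizer: Tokenizer;
  readonly model: Model;
  readonly params: Params;
}

// Hypothetical stand-in for makeTransformerLanguageModel.
// The "network" here is a vocabSize x vocabSize bigram table, not a transformer.
function makeToyLanguageModel(spec: ModelSpec): Model {
  const v = spec.vocabSize;
  return {
    spec,
    initParams: (_seed: number): Params => ({ bigram: new Float32Array(v * v) }),
    forward: (params: Params, tokens: number[]): Float32Array => {
      const last = tokens[tokens.length - 1];
      return params["bigram"].slice(last * v, (last + 1) * v);
    },
  };
}
```

Because `Model` holds no mutable state, parameters are passed in explicitly on every call, which is what lets `Trained` be an ordinary immutable triple.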

Architecture

[Diagram: overall architecture (Mermaid diagram, not rendered)]

Forward Pass Detail

Each transformer layer follows the now-standard pre-norm pattern:

[Diagram: data flow through one transformer layer (Mermaid diagram, not rendered)]
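
The pre-norm pattern can be sketched in a few lines: each sublayer sees a normalized copy of its input, and its output is added back onto the un-normalized residual stream. This sketch assumes plain number arrays, an RMSNorm without a learned gain, and sublayers passed in as functions on a single position's vector (real attention also needs the other positions). The names `rmsNorm`, `residual`, and `transformerLayer` are illustrative.

```typescript
type Vec = number[];
type Sublayer = (x: Vec) => Vec;

// RMSNorm without a learned gain: x / sqrt(mean(x^2) + eps).
function rmsNorm(x: Vec, eps: number = 1e-5): Vec {
  const meanSq = x.reduce((s, v) => s + v * v, 0) / x.length;
  const scale = 1 / Math.sqrt(meanSq + eps);
  return x.map((v) => v * scale);
}

// Pre-norm residual: normalize first, apply the sublayer, then add the
// result to the original (un-normalized) input.
function residual(x: Vec, sublayer: Sublayer): Vec {
  const y = sublayer(rmsNorm(x));
  return x.map((v, i) => v + y[i]);
}

// One layer: attention sublayer, then MLP sublayer, each with its own residual.
function transformerLayer(x: Vec, attention: Sublayer, mlp: Sublayer): Vec {
  return residual(residual(x, attention), mlp);
}
```

In the post-norm alternative the normalization is applied after the addition instead, so the residual stream itself gets rescaled at every layer.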

The Seven Components

Following Karpathy's decomposition, a complete training and inference pipeline for a language model breaks down into these seven parts:

  1. Dataset — ~32k names fetched from Karpathy's makemore repo
  2. Tokenizer — Character-level: 26 letters + 1 BOS/EOS token (vocab size 27)
  3. Autograd — Micrograd-style reverse-mode AD on flat Float32Array tensors
  4. Architecture — 1-layer GPT-2-style transformer (RMSNorm, causal attention, ReLU MLP, weight tying)
  5. Loss — Cross-entropy over next-token predictions
  6. Optimizer — Adam with bias correction and linear learning rate decay
  7. Sampling — Temperature-controlled autoregressive generation
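
Two of these components are small enough to sketch in full. The tokenizer below maps `a` to `z` onto ids 1 to 26 and reserves id 0 for the shared BOS/EOS token. That id assignment is an assumption, and the val may order its vocabulary differently. The loss is the cross-entropy for a single position, computed with the usual max-subtraction for numerical stability.

```typescript
// Assumed id for the shared BOS/EOS token; letters a..z take ids 1..26.
const BOS = 0;

// Wrap a lowercase name in BOS/EOS markers and map each letter to its id.
function encode(word: string): number[] {
  const ids = Array.from(word).map((c) => c.charCodeAt(0) - 96);
  return [BOS, ...ids, BOS];
}

// Drop the markers and map ids back to letters.
function decode(ids: number[]): string {
  return ids
    .filter((id) => id !== BOS)
    .map((id) => String.fromCharCode(id + 96))
    .join("");
}

// Cross-entropy for one position: -log softmax(logits)[target].
// Subtracting the max before exponentiating avoids overflow.
function crossEntropy(logits: number[], target: number): number {
  const max = Math.max(...logits);
  const sumExp = logits.reduce((s, z) => s + Math.exp(z - max), 0);
  return max + Math.log(sumExp) - logits[target];
}
```

With 27 equally likely tokens the loss is ln 27, about 3.296, which is the value a model that predicts uniformly would score.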

Running

This is a Val Town script val. Run it directly — it will train for 1000 steps on CPU and generate 20 sample names:

num docs: 32033
vocab size: 27
num params: 4795
step    1 / 1000 | loss 3.5062 | 0.0s
step  101 / 1000 | loss 2.7573 | 1.2s
...
step 1000 / 1000 | loss 2.2891 | 11.8s

--- generation ---
sample  1: malede
sample  2: jara
sample  3: kaylin
...
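
The generation step that produces samples like these can be sketched as follows. This is a minimal version that assumes the model is exposed as a function from a token context to next-token logits. `sampleToken`, `generate`, and the injectable `rand` parameter are illustrative names, not the val's API.

```typescript
// Sample an index from softmax(logits / temperature).
// Lower temperatures sharpen the distribution; higher ones flatten it.
function sampleToken(
  logits: number[],
  temperature: number,
  rand: () => number = Math.random,
): number {
  const scaled = logits.map((z) => z / temperature);
  const max = Math.max(...scaled);
  const weights = scaled.map((z) => Math.exp(z - max));
  const total = weights.reduce((s, w) => s + w, 0);
  let r = rand() * total;
  for (let i = 0; i < weights.length; i++) {
    r -= weights[i];
    if (r < 0) return i;
  }
  return weights.length - 1; // guard against floating-point round-off
}

// Autoregressive loop: start from BOS, feed each sampled token back in,
// and stop when BOS/EOS is sampled again or maxLen tokens were produced.
function generate(
  nextLogits: (context: number[]) => number[],
  bos: number,
  maxLen: number,
  temperature: number,
  rand: () => number = Math.random,
): number[] {
  const context = [bos];
  const out: number[] = [];
  while (out.length < maxLen) {
    const next = sampleToken(nextLogits(context), temperature, rand);
    if (next === bos) break;
    out.push(next);
    context.push(next);
  }
  return out;
}
```

Passing a seeded generator as `rand` makes the samples reproducible, which is one way a fixed seed could yield the same names on every run.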

Hyperparameters

Matched to Karpathy's defaults:

| Parameter | Value | Notes |
| --- | --- | --- |
| dModel | 16 | Embedding dimension |
| nHeads | 4 | Attention heads (head dim = 4) |
| nLayers | 1 | Transformer blocks |
| dFF | 64 | FF hidden dim (4× dModel) |
| maxLen | 8 | Context window |
| steps | 1000 | Training iterations |
| learningRate | 0.01 | With linear decay to 0 |
| seed | 42 | Deterministic initialization |
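
To show how learningRate and steps interact, here is a sketch of one Adam update with bias correction and a learning rate that decays linearly toward 0. The `beta1`, `beta2`, and `eps` defaults are the commonly used Adam values and are assumptions here, since this README does not list them. `adamStep` and `AdamState` are illustrative names.

```typescript
// Running first and second moment estimates, one entry per parameter.
interface AdamState {
  m: Float32Array;
  v: Float32Array;
}

// One in-place Adam update. `step` is 0-based; beta1, beta2 and eps are
// assumed defaults, not values taken from the README.
function adamStep(
  params: Float32Array,
  grads: Float32Array,
  state: AdamState,
  step: number,
  totalSteps: number,
  baseLr: number = 0.01,
  beta1: number = 0.9,
  beta2: number = 0.999,
  eps: number = 1e-8,
): void {
  // Linear decay: full rate at step 0, approaching 0 at totalSteps.
  const lr = baseLr * (1 - step / totalSteps);
  const t = step + 1;
  for (let i = 0; i < params.length; i++) {
    state.m[i] = beta1 * state.m[i] + (1 - beta1) * grads[i];
    state.v[i] = beta2 * state.v[i] + (1 - beta2) * grads[i] * grads[i];
    // Bias correction compensates for m and v starting at zero.
    const mHat = state.m[i] / (1 - Math.pow(beta1, t));
    const vHat = state.v[i] / (1 - Math.pow(beta2, t));
    params[i] -= (lr * mHat) / (Math.sqrt(vHat) + eps);
  }
}
```

On the very first step the bias-corrected ratio `mHat / sqrt(vHat)` has magnitude close to 1 for any nonzero gradient, so each parameter moves by roughly the learning rate regardless of the gradient's scale.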

Why TypeScript?

Karpathy's Python version is the irreducible essence of a language model. This version asks: what if we took that essence and gave it more structure? TypeScript's interfaces (ModelSpec, Model, Trained) make the architecture of the architecture visible — you can see the separation of concerns that's implicit in the Python version.

In Conal Elliott's terms: the Python version is the implementation; this version also tries to show the denotation.
