Jagged Intelligence - DeveloPassion

# Jagged Intelligence

Jagged intelligence is the now-standard term for the discontinuous capability profile of [[Large Language Models (LLMs)|LLMs]]: the same model can perform at superhuman levels on one task and embarrassingly fail on an adjacent one. The canonical example, repeated by [[Andrej Karpathy]], is a model that coherently refactors a 100,000-line codebase or finds a zero-day vulnerability, then turns around and tells you to walk fifty meters to the car wash to wash your car.

The term sharpens what was previously called "uneven capability" or "spiky benchmarks". *Jagged* captures both the unpredictability and the *adjacency* of the failure: the failing task often looks indistinguishable, to a human, from the task the model just nailed.

## Why Models Are Jagged

Jaggedness is not a defect of any individual model; it is a structural feature of how frontier capability is manufactured. The mechanism is captured in [[AI Verifiability as a Capability Ceiling]]:

- **Verifiability** decides whether RL signal is even possible.
- **Training attention** decides whether the lab invested.
- **Data coverage** decides whether the latent capability exists in pretraining.
- **Economic value** decides whether the lab will keep investing.

Where all four align, capability climbs predictably. Where any one is missing, capability stalls or regresses. Two adjacent-looking tasks can sit on opposite sides of any of these factors, producing the cliff.

[[Reinforcement Learning From Human Feedback (RLHF)]] and (especially) RLVR (Reinforcement Learning from Verifiable Rewards) amplify the effect: the lab pours compute where the reward signal is cheap and clean, leaving everything else to whatever pretraining left behind.

## Why It Matters in Practice

- **Demos lie**: every demo selects an on-rails task. The off-rails task is one prompt away.
- **Capability transfer is weak**: strength on Task A says nothing about Task B if they sit on different sides of the four-factor grid.
- **Founder leverage is in the off-rails niches**: frontier labs will not enter low-economic-value, low-verifiability domains. That space is a real moat for vertical applications, but only if you build the verification scaffolding the lab won't.
- **Eval design must probe the cliffs**: a benchmark that only tests the on-rails case tells you nothing about real-world reliability.

## Mental Model

Treat the model not as a single capability surface but as a *terrain*. Some valleys are paved highways; others are jungle. The job of the engineer (and the user) is to learn the terrain map for their domain. There is no shortcut around this; the model will not warn you when you cross from highway to jungle, because it has no way of knowing.

## Related

- [[AI Verifiability as a Capability Ceiling]]
- [[AI Verifiability]]
- [[Andrej Karpathy]]
- [[Large Language Models (LLMs)]]
- [[Reinforcement Learning From Human Feedback (RLHF)]]
- [[AI Reasoning Models]]
- [[Agentic Engineering]]
- [[Understanding Bottleneck]]
- [[Agent Development Lifecycle (ADLC)]]
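
## Example: Probing a Cliff

The point that eval design must probe the cliffs can be made concrete with a small paired-task harness. What follows is a minimal sketch, not an established benchmark: `TaskPair`, `cliff_report`, `toy_model`, and the prompts are all hypothetical names and data chosen for illustration. The idea is to score an on-rails prompt and an adjacent-looking off-rails prompt side by side and report the gap between their pass rates. The toy model stands in for a real LLM call and deliberately exhibits the car wash failure described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TaskPair:
    """Two prompts that look adjacent to a human (hypothetical structure)."""
    name: str
    on_rails: str                       # prompt the model is expected to handle
    off_rails: str                      # adjacent prompt that may hit a cliff
    check_on: Callable[[str], bool]     # grader for the on-rails answer
    check_off: Callable[[str], bool]    # grader for the off-rails answer


def cliff_report(
    model: Callable[[str], str],
    pairs: List[TaskPair],
    trials: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Return per-pair pass rates and their gap (on_rails minus off_rails)."""
    report: Dict[str, Dict[str, float]] = {}
    for pair in pairs:
        on_hits = sum(pair.check_on(model(pair.on_rails)) for _ in range(trials))
        off_hits = sum(pair.check_off(model(pair.off_rails)) for _ in range(trials))
        on_rate = on_hits / trials
        off_rate = off_hits / trials
        report[pair.name] = {
            "on_rails": on_rate,
            "off_rails": off_rate,
            "gap": on_rate - off_rate,
        }
    return report


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: always recommends walking short distances."""
    return "Walk, it is only fifty meters."


# Illustrative pair: same surface form, different correct answer.
PAIRS = [
    TaskPair(
        name="short_trip",
        on_rails="The library is 50 meters away. I want to return a book. "
                 "Should I walk or drive?",
        off_rails="The car wash is 50 meters away. I want to wash my car. "
                  "Should I walk or drive?",
        check_on=lambda answer: "walk" in answer.lower(),
        check_off=lambda answer: "drive" in answer.lower(),
    ),
]

if __name__ == "__main__":
    for name, scores in cliff_report(toy_model, PAIRS, trials=3).items():
        print(name, scores)
```

A large gap on a pair flags a candidate cliff worth investigating; a benchmark that only contained the `on_rails` prompts would report a perfect score for this toy model. In practice the substring graders would need to be replaced with something more robust, and `model` would wrap a real API call, both of which are outside what this sketch assumes.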