# Chris Olah

Co-founder of [[Anthropic]] and one of the pioneers of **mechanistic interpretability**: the program of reverse-engineering neural networks into human-understandable algorithms. Self-described on X as: *"Reverse engineering neural networks at @AnthropicAI. Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account."* Based in San Francisco. Canadian, born 1992/1993; left university at 18 and pursued independent research on a **Thiel Fellowship**.

He's the closest thing the field has to a public-facing interpretability scholar; his explainer posts (notably on Distill) are the canonical entry points for anyone learning the topic.

## Career arc

- **Thiel Fellowship** (post-school): funded independent research after leaving university
- **Google Brain**: early neural-network visualization work; involved in DeepDream (2015)
- **Distill.pub**: co-founder and editor of the interactive ML research journal; raised the bar for how ML research can be communicated visually
- **OpenAI Clarity Team**: interpretability research before the founder-team split
- **Anthropic**: co-founded the company; runs the interpretability research direction

## Research focus

- **Mechanistic interpretability**: reverse-engineering trained networks into explicit circuits and features
- **Circuits research**: the line of work that decomposes vision models, then transformer models, into human-readable computational steps
- **Feature visualization**: methods for seeing what individual neurons and feature groups respond to
- **Superposition and features**: recent emphasis on how models pack many concepts into shared parameters

## Notable threads worth knowing

- The **Distill "Circuits"** series: multi-author project decomposing InceptionV1 into named features and motifs
- The **Anthropic Transformer Circuits** thread: extending the same program to language models
- His writing style (long-form, illustrated, mathematically precise) defined a generation of ML explainers

## Recognition

- **TIME100 AI** (2024)
- Billionaire status from Anthropic equity (Forbes, 2025)
- Frequently cited as one of the most influential interpretability researchers active today

## Why he matters (to track)

- One of the few founders publicly committed to interpretability as the path to safe AI, not just capabilities. Watching his research direction is a useful lens on how seriously [[Anthropic]] is treating its safety thesis
- His public writing remains the single best introduction to neural-network internals; pointing newcomers to colah.github.io and the Transformer Circuits work saves hours
- The Distill / Anthropic style of "research as interactive explainer" is something to study and adopt for [[PKM]] / teaching content

## Quotes

<!-- QueryToSerialize: LIST FROM #type/quote AND [[Chris Olah]] WHERE public_note = true SORT file.name ASC -->

## Books

<!-- QueryToSerialize: LIST FROM #type/book AND [[Chris Olah]] WHERE public_note = true SORT file.name ASC -->

## Related

- [[Anthropic]]
- [[OpenAI]]
- [[AI Safety]]

## References

- Personal site / blog: https://colah.github.io/
- About: https://colah.github.io/about.html
- X: https://x.com/ch402
- GitHub: https://github.com/colah
- Distill: https://distill.pub/
- Transformer Circuits Thread: https://transformer-circuits.pub/
- Wikipedia: https://en.wikipedia.org/wiki/Chris_Olah