# LLM Wiki

An LLM Wiki is a pattern for building persistent, compounding knowledge bases maintained entirely by [[Large Language Models (LLMs)]]. The concept was proposed by [[Andrej Karpathy]] in April 2026. Instead of humans manually curating notes, the LLM handles all the tedious bookkeeping: writing summaries, maintaining cross-references, resolving contradictions, and keeping an index up to date. The human's role shifts to source discovery and strategic questioning.

## Three-Layer Architecture

The system is organized in three layers:

- **Raw Sources**: an immutable collection of documents (articles, papers, repos, datasets, images). The LLM reads but never modifies these.
- **The Wiki**: LLM-generated [[Markdown]] files organized as entity pages, concept pages, summaries, and interconnected analyses. The LLM owns maintenance entirely.
- **The Schema**: a configuration document (e.g., CLAUDE.md, AGENTS.md) that defines wiki structure, conventions, and workflows, guiding disciplined LLM operation.

## Core Operations

- **Ingest**: process new sources by extracting takeaways, writing summaries, updating cross-references across existing pages, and appending to a chronological log.
- **Query**: search relevant wiki pages and synthesize answers with citations. Valuable outputs become new wiki pages, compounding knowledge over time.
- **Lint**: health-check the wiki for contradictions, stale claims, orphan pages, and missing cross-references.

## Special Files

The wiki relies on two key structural files:

- **index.md**: a content-organized catalog listing each page with one-line summaries and metadata, organized by category.
- **log.md**: a chronological append-only record of all operations, with parseable prefixes (e.g., `## [2026-04-02] ingest | Article Title`).

## Tooling

Karpathy uses [[Obsidian]] as the "IDE frontend" for viewing raw data, the compiled wiki, and derived visualizations.
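
The parseable `log.md` prefixes described under Special Files lend themselves to simple tooling. The following is a minimal sketch, not Karpathy's actual script: it assumes every entry heading follows the example shape `## [YYYY-MM-DD] operation | Title` exactly, and the names `LogEntry` and `parse_log`, along with the second sample entry, are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date

# Assumed heading shape, taken from the example prefix in this note:
#   ## [YYYY-MM-DD] operation | Title
LOG_HEADING = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


@dataclass(frozen=True)
class LogEntry:
    """One operation recorded in log.md (hypothetical structure)."""
    day: date
    operation: str
    title: str


def parse_log(text: str) -> list[LogEntry]:
    """Collect every heading that matches the assumed prefix; skip other lines."""
    entries = []
    for line in text.splitlines():
        match = LOG_HEADING.match(line.strip())
        if match is None:
            continue  # body text, blank lines, or headings in another shape
        day, operation, title = match.groups()
        entries.append(LogEntry(date.fromisoformat(day), operation, title.strip()))
    return entries


# Made-up sample log; only the first heading comes from the note itself.
sample = """\
## [2026-04-02] ingest | Article Title
Summary written, three pages updated.

## [2026-04-03] lint | Weekly health check
"""

entries = parse_log(sample)
for entry in entries:
    print(entry.day.isoformat(), entry.operation, entry.title)
```

Because the log is append-only, a parser like this can be rerun over the whole file at any time, for example to list everything ingested since a given date.
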
The [[Obsidian Web Clipper]] extension converts web articles into Markdown files for ingestion. Additional vibe-coded tools include a naive search engine (usable via web UI or CLI by the LLM).

## Key Insight

LLMs excel at the mechanical bookkeeping that humans reliably abandon: updating references, maintaining consistency across hundreds of pages, noting contradictions, and imputing missing data. This frees humans to focus on curation, analysis, and asking the right questions. The wiki becomes a compounding artifact where cross-references and synthesis accumulate without fatigue.

## Critique: Compilation vs Curation

Steven Thompson pushes back on the LLM Wiki pattern from a [[Personal Knowledge Management (PKM)]] standpoint: it risks confusing **compilation** with **curation**. A compiler can maintain structure; only a human can maintain significance: sensing the *weight* of an idea, recognizing when understanding shifts, identifying which fragments are formative versus merely interesting.

The reframe: a PKM system is a **biography of understanding**, not a database. The aim isn't a perfectly coherent repository of everything one has read; it's a record of how one's thinking changes over time. Automate the maintenance and one risks automating away the meaning-making itself. As Thompson puts it: *"I'm not building a repository of everything I know; I'm building a record of how my understanding changes."*

This raises the **delegation question**: how much of the bookkeeping can be handed to the LLM before the wiki stops being *mine* and becomes a tidy, generic compilation indistinguishable from anyone else's? The friction of manual curation (choosing, weighing, connecting) is part of the work, not an obstacle to it.

A useful heuristic when ingesting into an LLM Wiki: distinguish *information* from *intellectual lineage*. Information belongs in a compilation; only intellectual lineage belongs in a biography of understanding.
In practice, this argues for hybrid use: let the LLM compile the breadth (entity pages, summaries, cross-references) while reserving the [[Zettelkasten method]] core (permanent notes, folgezettel chains, weighted connections) for human curation.

## Relation to RAG

Karpathy notes that at the scale of ~100 articles / ~400K words, a well-maintained wiki with auto-maintained index files outperforms [[Retrieval-Augmented Generation (RAG)]] for Q&A. The LLM can navigate the file structure directly rather than relying on embedding-based retrieval.

## Farzapedia

Farza (FarzaTV) independently built a similar system called [[Farzapedia]]: a personal Wikipedia generated from 2,500 diary entries, Apple Notes, and iMessage conversations, producing 400 detailed articles with backlinks. His key insight is that the wiki is built for the agent, not the human. The file-system structure with backlinks is easily crawlable by any agent, making it a more effective [[Knowledge Graph (KG)]] than RAG-based approaches.

## Future Direction

Karpathy envisions synthetic data generation and fine-tuning to embed wiki knowledge directly into model weights rather than relying solely on context windows. He also sees room for a dedicated product rather than a collection of scripts.
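
As an illustration of the kind of script such a collection might contain, the sketch below builds a backlink index from `[[wikilinks]]` and reports orphan pages, which ties together the crawlable backlink structure noted under Farzapedia and the orphan check in the Lint operation. It is an assumption-laden example, not code from either project: pages are held in an in-memory dict rather than read from disk, the sample pages are made up, and the names `build_backlinks` and `find_orphans` are hypothetical.

```python
import re
from collections import defaultdict

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures Target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def build_backlinks(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each link target to the set of page titles that link to it."""
    backlinks: dict[str, set[str]] = defaultdict(set)
    for title, body in pages.items():
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target != title:  # ignore self-links
                backlinks[target].add(title)
    return dict(backlinks)


def find_orphans(pages: dict[str, str], backlinks: dict[str, set[str]]) -> list[str]:
    """Return titles of pages that no other page links to, sorted by title."""
    return sorted(title for title in pages if not backlinks.get(title))


# Made-up sample wiki: page title -> Markdown body.
pages = {
    "LLM Wiki": "Proposed by [[Andrej Karpathy]]; viewed in [[Obsidian|the Obsidian app]].",
    "Andrej Karpathy": "Author of the [[LLM Wiki]] pattern.",
    "Obsidian": "A Markdown editor.",
    "Scratch Note": "Mentions [[LLM Wiki#Tooling]] but nothing links here.",
}

backlinks = build_backlinks(pages)
orphans = find_orphans(pages, backlinks)
print(orphans)
```

A real lint pass would presumably load the pages from the wiki directory and might exempt `index.md` and `log.md`, since those structural files are not expected to receive inbound links.
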

## Quotes

- [[Tools to enhance understanding]] ([[Andrej Karpathy]])

## References

- https://x.com/karpathy/status/2039805659525644595
- https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- https://x.com/omarsar0/status/2040099881008652634
- https://x.com/FarzaTV/status/2040563939797504467
- https://academy.dair.ai/blog/llm-knowledge-bases-karpathy
- https://medium.com/a-voice-in-the-conversation/the-goal-is-curation-not-compilation-llm-wiki-6f90f829b15d (Steven Thompson, "The Goal is Curation, not Compilation")

## Related

- [[Andrej Karpathy]]
- [[Large Language Models (LLMs)]]
- [[LLM Knowledge Bases Over Unstructured Data]]
- [[Retrieval-Augmented Generation (RAG)]]
- [[Agentic Knowledge Management (AKM)]]
- [[Personal Knowledge Management (PKM)]]
- [[Obsidian]]
- [[Obsidian Web Clipper]]
- [[Knowledge Graph (KG)]]
- [[Markdown]]
- [[Farzapedia]]
- [[Marp]]
- [[Compounding Knowledge]]
- [[Graph Explorer Base View plugin for Obsidian]]
- [[AI Verifiability]]
- [[AI Verifiability as a Capability Ceiling]]
- [[Menugen Architecture Pattern]]
- [[Markdown-based Installation (MD Scripts)]]
- [[Agent-Native Product Decomposition]]