llms.txt convention - DeveloPassion

# llms.txt convention

`/llms.txt` is an emerging Web convention for serving **AI-readable summaries** of a website's content. It sits at the root of a domain (`https://example.com/llms.txt`) and points an LLM (or an agent crawling on its behalf) at the canonical, structured Markdown view of what's on the site, without the navigation chrome, ad rails, and HTML noise of the human-facing pages.

A companion file `/llms-full.txt` ships the *full* content concatenated into one Markdown document. Same idea, different scope: `llms.txt` is the index; `llms-full.txt` is the bundle.

Proposed by Jeremy Howard (Answer.AI) in September 2024, the convention spread through 2025–2026 and is now shipped by major doc sites: Anthropic, Vercel, Cloudflare, FastHTML, Hermes Agent (Nous Research), and many more. Tooling support exists in Mintlify, Docusaurus, VitePress, and Nextra. There is no W3C ratification; this is a community standard that earned adoption by being useful.

## Why it matters

LLMs ingest websites through one of two doors: a tool call (browser, fetch, scraper) or a pre-cached training corpus. Both pay an extraction tax: HTML to text, navigation to noise, rich layouts to flat prose. The result is fragile context and wasted tokens.

`/llms.txt` removes the tax. The site author publishes the canonical version once; every consuming LLM gets the same clean Markdown. Three benefits:

1. **Lower token cost.** A 5,000-word article rendered as HTML is often 30–60K tokens once navigation, ads, and footer markup are counted. The same article in `llms-full.txt` is 5–7K tokens. On those figures, that is roughly 4× to 12× cheaper to ingest.
2. **Higher fidelity.** The author controls what the LLM sees: no ambiguity from JavaScript-rendered content, no confusion from nav links, no missing content behind tabs or accordions.
3. **Better citations.** Models that consume the canonical Markdown can cite specific sections with stable anchors instead of paraphrasing.

## Format

Both files are plain Markdown, with no special schema.
The `llms.txt` index typically follows a soft convention:

```markdown
# Project Name

> One-line description of what the project / site is about.

## Docs

- [Quickstart](https://example.com/quickstart): how to get started
- [API Reference](https://example.com/api): full API surface
- [Concepts](https://example.com/concepts): mental models

## Optional

- [Changelog](https://example.com/changelog): release history
```

The `llms-full.txt` file holds the same docs concatenated into one Markdown stream, ordered for narrative flow.

## Where it fits in PKM

For the [[Personal Knowledge Management (PKM)|PKM]] practitioner publishing knowledge online, `llms.txt` is the **AI-readable layer** of the [[AI-Ready Second Brain]]. The HTML pages serve humans; `llms.txt` and `llms-full.txt` serve agents. Both are built from the same Markdown source, just rendered through different paths.

If your knowledge graph is already plain Markdown ([[Atomic notes]], [[Knowledge Graph (KG)]]), publishing `llms-full.txt` is mostly a build-time concatenation. Static-site generators with native support (Mintlify, Quartz plugins, etc.) make it a one-config-line change.

## Examples in the wild

- Anthropic Docs: <https://docs.claude.com/llms.txt>
- Hermes Agent: <https://hermes-agent.nousresearch.com/llms.txt> and `/llms-full.txt`
- Cloudflare Workers: <https://developers.cloudflare.com/llms.txt>
- Curated registry: <https://llmstxt.site/>

## Trade-offs

- **No enforcement.** Like `robots.txt`, this is a polite suggestion. Misbehaving crawlers will still scrape HTML.
- **Maintenance overhead.** Two surfaces to keep in sync if you don't auto-generate.
- **Spec drift.** Different sites use different shapes for the index. Until something canonical settles, expect minor inconsistencies.
- **Discoverability.** Most LLMs don't auto-fetch `/llms.txt`. Agents need to be taught to look for it (system-prompt rule, MCP tool, harness convention).
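
The discoverability point implies a small amount of glue code on the agent side. Below is a minimal Python sketch of that glue, assuming the index follows the soft convention shown under Format. The helper names `llms_txt_url` and `parse_llms_index` are hypothetical (they are not part of any specification or library), and the actual HTTP fetch is left to whatever client the harness already uses.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# One index entry: "- [Title](url)" with an optional ": description" tail.
_LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def llms_txt_url(page_url: str, filename: str = "llms.txt") -> str:
    """Return the root-level llms.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, drop the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/" + filename, "", ""))


def parse_llms_index(text: str) -> dict:
    """Parse an llms.txt index into a title, a summary, and per-section links.

    Assumes the soft convention: one H1, an optional blockquote summary,
    then H2 sections containing Markdown link bullets. Lines that do not
    fit that shape are skipped rather than treated as errors.
    """
    index = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and index["title"] is None:
            index["title"] = line[2:].strip()
        elif line.startswith("> ") and index["summary"] is None:
            index["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            index["sections"][current] = []
        elif current is not None:
            match = _LINK_RE.match(line)
            if match:
                index["sections"][current].append(
                    {
                        "title": match.group("title"),
                        "url": match.group("url"),
                        "description": match.group("desc") or "",
                    }
                )
    return index
```

An agent harness could call `llms_txt_url(current_page)` before falling back to HTML scraping, then pass the fetched text to `parse_llms_index` and decide which linked pages are worth loading. Because index shapes drift between sites, the parser is deliberately tolerant and ignores anything it does not recognize.
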
## References

- Original proposal: <https://llmstxt.org/>
- Jeremy Howard announcement (2024-09): <https://www.answer.ai/posts/2024-09-03-llmstxt.html>
- Live registry of sites that ship it: <https://llmstxt.site/>

## Related

- [[AI-Ready Second Brain]]
- [[Personal Knowledge Management (PKM)]]
- [[Atomic notes]]
- [[Knowledge Graph (KG)]]
- [[Context Engineering]]
- [[Context-as-Code]]
- [[Hermes Agent]]
- [[Model Context Protocol (MCP)]]