# AI Foundation Models

Large-scale AI models trained on broad data that can be adapted to many downstream tasks. The term was coined in 2021 by researchers at Stanford's Center for Research on Foundation Models (CRFM), part of Stanford HAI (the Institute for Human-Centered Artificial Intelligence).

Examples: GPT-4, Claude, Llama, Gemini, Stable Diffusion, DALL-E.

Key properties:

- **Trained at massive scale**: often billions of parameters, trained on internet-scale data
- **General-purpose**: not designed for a single task
- **Adaptable**: can be specialized via fine-tuning or prompting
- **Emergent capabilities**: abilities that appear at scale but weren't explicitly trained for (e.g., chain-of-thought reasoning, in-context learning)

Foundation models represent a paradigm shift from task-specific to general-purpose AI. Instead of training a separate model for each task, a single foundation model serves as the base layer that is then adapted to each task. [[Large Language Models (LLMs)]], image generators, and multimodal models are all built on this approach.

Foundation models are the core of [[Generative AI (Gen AI)]]. Their power comes from [[Deep Learning]] architectures trained at scale, and their practical utility comes from adaptation techniques like fine-tuning and prompting.

## References

- Bommasani et al., "On the Opportunities and Risks of Foundation Models" (2021), Stanford CRFM / Stanford HAI. This is the paper that introduced the term.

## Related

- [[Large Language Models (LLMs)]]
- [[Generative AI (Gen AI)]]
- [[Deep Learning]]
- [[AI Fine-Tuning]]