# AI Foundation Models
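Foundation models are large-scale AI models trained on broad data, generally with self-supervised learning, that can be adapted to many downstream tasks. The term was introduced in 2021 by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI).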
Examples: GPT-4, Claude, Llama, Gemini, Stable Diffusion, DALL-E.
Key properties:
- **Trained at massive scale**: often billions of parameters, trained on internet-scale datasets
- **General-purpose**: not designed for a single task
- **Adaptable**: can be specialized via fine-tuning or prompting
- **Emergent capabilities**: abilities that appear at scale but weren't explicitly trained for (e.g., chain-of-thought reasoning, in-context learning)
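Adaptation by prompting, which relies on in-context learning, can be illustrated with a small sketch. The helper below only assembles a few-shot prompt as text. The function name `build_few_shot_prompt` and the `Input:`/`Output:` layout are assumptions made for illustration, not a format required by any particular model, and no model is called here.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a few-shot prompt: instruction, worked examples, then the new input."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # End on an open "Output:" so a text-completion model would continue with the answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Two unrelated tasks expressed as prompts that could be sent to the same model.
sentiment_prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved this film.", "positive"), ("The plot was a mess.", "negative")],
    "The acting was superb.",
)
translation_prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage")],
    "bread",
)

print(sentiment_prompt)
```

The point of the sketch is that the task lives entirely in the prompt: switching from sentiment classification to translation changes the text, not the model's weights.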
Foundation models represent a paradigm shift from task-specific to general-purpose AI. Instead of training a separate model for each task, a single foundation model serves as the base layer. [[Large Language Models (LLMs)]], image generators, and multimodal models are all built on this approach.
Foundation models are the core of [[Generative AI (Gen AI)]]. Their power comes from [[Deep Learning]] architectures trained at scale, and their practical utility comes from adaptation techniques such as fine-tuning and prompting.
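The "one base, many tasks" idea can be sketched in plain Python. This is a toy under stated assumptions: `frozen_base` is a hypothetical stand-in for a pretrained model (a real one would return learned embeddings), and only a small logistic-regression head is trained per task. Training a head on frozen features is one lightweight form of adaptation, not full fine-tuning, which would also update the base weights.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def frozen_base(x: float) -> Vector:
    """Hypothetical stand-in for a pretrained model: fixed features for an input."""
    return [1.0, x, x * x]


def train_head(
    data: Sequence[Tuple[float, int]], epochs: int = 1000, lr: float = 0.1
) -> Vector:
    """Fit a logistic-regression head on top of the frozen base features."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, label in data:
            feats = frozen_base(x)  # the base is reused as-is and never updated
            z = sum(w * f for w, f in zip(weights, feats))
            pred = 1.0 / (1.0 + math.exp(-z))
            # Stochastic gradient step on the log loss, applied to the head only.
            weights = [w - lr * (pred - label) * f for w, f in zip(weights, feats)]
    return weights


def predict(weights: Vector, x: float) -> int:
    """Return 1 if the head scores the input above the decision boundary, else 0."""
    z = sum(w * f for w, f in zip(weights, frozen_base(x)))
    return int(z > 0)


# Task A: is the number positive?  Task B: is its magnitude greater than 1?
sign_data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]
magnitude_data = [(-2.0, 1), (-1.5, 1), (-0.5, 0), (0.0, 0), (0.5, 0), (1.5, 1), (2.0, 1)]

sign_head = train_head(sign_data)
magnitude_head = train_head(magnitude_data)

print(predict(sign_head, 1.5), predict(magnitude_head, 0.1))
```

Both heads share the same base features and differ only in three learned weights, which mirrors, at toy scale, how a single foundation model can back many specialized applications.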
## References
- Bommasani et al., "On the Opportunities and Risks of Foundation Models" (Stanford CRFM, 2021, arXiv:2108.07258), the paper that introduced the term
## Related
- [[Large Language Models (LLMs)]]
- [[Generative AI (Gen AI)]]
- [[Deep Learning]]
- [[AI Fine-Tuning]]