# Pydantic

Pydantic is a [[Python]] data-validation and settings-management library. Created by Samuel Colvin in 2017, with its validation core rewritten in Rust for v2 (2023), it has become the **de facto standard** for declaring typed data models in Python.

The headline number you'll see quoted: Pydantic is downloaded **hundreds of millions of times per month**. It's used by FastAPI, the AWS SDK (boto3-stubs), Hugging Face, OpenAI's SDK, and most modern Python LLM tooling.

The core idea: declare data models with Python type hints, and get runtime validation, JSON Schema generation, and serialization for free.

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int
    email: str | None = None


# Coerces, validates, raises descriptive errors
User.model_validate({"name": "Alice", "age": "30"})
```

That `"30"` (a string) becomes `30` (an int) because Pydantic coerces according to explicit rules; unparseable inputs raise a structured `ValidationError` that includes field paths.

## Why Pydantic Matters For AI

The AI tooling layer leans on Pydantic harder than on almost any other Python library:

- **Tool calling**: every major Python SDK ([[OpenAI SDK]], [[Anthropic SDK]], [[Google Gen AI SDK]], [[Mistral SDK]]) accepts Pydantic models, or the JSON Schema generated from them, as tool input schemas. The Pydantic model generates the JSON Schema the LLM needs, and the runtime validates the LLM's response against the same model. One source of truth.
- **Structured outputs**: pass a Pydantic model; the SDK constrains the LLM to produce JSON matching that schema and returns a typed instance.
- **[[Pydantic AI]]**: the agent framework built by the Pydantic team takes the same idea further: agents, tools, and dependencies are all typed end-to-end, with Pydantic as the contract.
- **MCP tool schemas**: many [[Model Context Protocol (MCP)]] server implementations use Pydantic to describe their tools.

## v1 vs v2

Pydantic v2 was a near-total rewrite with the validation core moved to Rust (`pydantic-core`).
Three practical consequences:

- **5-50× faster** validation on most workloads
- **Coercion semantics changed**: v2 is still lax by default, but some implicit v1 conversions were removed (for example, numbers are no longer coerced to `str`) and an opt-in strict mode was added; some v1 patterns need migration
- **Renamed methods**: `BaseModel.dict()` → `model_dump()`, `parse_obj` → `model_validate`, etc.

If you see code with `class Config:` inside a `BaseModel`, that's v1; v2 uses `model_config = ConfigDict(...)`.

## Beyond Models

Pydantic is more than `BaseModel`:

- **`Field`**: constraints (`gt`, `lt`, `min_length`, `pattern` for regexes), default factories, aliases, examples
- **Validators**: `@field_validator`, `@model_validator` for custom logic
- **`TypeAdapter`**: validate arbitrary types without wrapping them in a `BaseModel`
- **Pydantic Settings**: `BaseSettings` for env-var-driven config with type coercion (in v2 it lives in the separate `pydantic-settings` package)
- **JSON Schema generation**: `Model.model_json_schema()` returns a spec-compliant JSON Schema, ready to hand to an LLM

## Adoption In The Stack

- **FastAPI**: the request/response model layer is Pydantic end-to-end
- **LangChain, LlamaIndex, [[Mastra AI]]'s Python bridge, [[OpenAI Agents SDK]], [[Pydantic AI]]**: tool schemas, output parsers, configuration
- **Many CLI / config-file tools**: argparse + JSON / TOML loading via `BaseSettings`

## Trade-offs

- **Coercion can hide bugs**: the default behaviour of coercing the string `"30"` to the int `30` is great for HTTP boundaries, dangerous for internal-only models. Use `model_config = ConfigDict(strict=True)` when you don't want it.
- **Rust core means a binary wheel**: usually invisible, occasionally relevant on exotic targets
- **v1 → v2 migration friction** is real if you maintain old code

## License

MIT. Source at https://github.com/pydantic/pydantic.
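
## Worked Example

Several points above (JSON Schema generation for tool calling, `Field` constraints, lax versus strict coercion, and `TypeAdapter`) can be sketched together in one short script. This is a minimal illustration under stated assumptions rather than a recommended layout: the `GetWeather` and `StrictGetWeather` models and their fields are hypothetical, invented here only to have something to validate, and no real LLM SDK is called.

```python
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter, ValidationError


class GetWeather(BaseModel):
    """Hypothetical tool-input model; the name and fields are illustrative."""

    city: str = Field(min_length=1)
    days: int = Field(default=1, gt=0, lt=15)


class StrictGetWeather(GetWeather):
    # Same fields as GetWeather, but with type coercion disabled.
    model_config = ConfigDict(strict=True)


# 1. JSON Schema that could be handed to an LLM as the tool's input schema.
schema = GetWeather.model_json_schema()

# 2. Validate a JSON payload such as an LLM might return.
#    Lax mode coerces the string "3" to the int 3.
call = GetWeather.model_validate_json('{"city": "Oslo", "days": "3"}')

# 3. Invalid input raises ValidationError; each error carries a field path.
try:
    GetWeather.model_validate({"city": "", "days": 99})
    failed_fields = []
except ValidationError as exc:
    failed_fields = sorted(err["loc"][0] for err in exc.errors())

# 4. Strict mode rejects the string-for-int input that lax mode accepted.
try:
    StrictGetWeather.model_validate({"city": "Oslo", "days": "3"})
    strict_rejected = False
except ValidationError:
    strict_rejected = True

# 5. TypeAdapter validates a plain type without a wrapping BaseModel.
ints = TypeAdapter(list[int]).validate_python(["1", 2, "3"])

# 6. Serialize the validated instance back to plain Python data.
dumped = call.model_dump()
```

Putting strictness on a subclass keeps one set of field definitions while letting a boundary layer use the lax model and internal code use the strict one; whether that split suits a given codebase is a design choice, not something Pydantic requires.
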
## References

- Documentation: https://docs.pydantic.dev/
- GitHub: https://github.com/pydantic/pydantic
- v1 → v2 migration guide: https://docs.pydantic.dev/latest/migration/

## Related

- [[Pydantic AI]]
- [[Python]]
- [[OpenAI SDK]]
- [[Anthropic SDK]]
- [[Google Gen AI SDK]]
- [[Mistral SDK]]
- [[Model Context Protocol (MCP)]]