# Model routing
Model routing is the practice of directing different tasks to different AI models based on the task's requirements. Instead of using one model for everything, a routing layer selects the most appropriate model considering capability, cost, speed, and context needs.
This is the [[Receptionist AI Design Pattern]] applied at the model level. A simple classification (or even a smaller, faster model) decides whether a task needs a powerful reasoning model like [[Claude|Claude Opus]] or can be handled by a faster, cheaper model like Haiku. [[AI Subagents]] naturally implement this: the parent agent uses a premium model while subagents can run on lighter models for routine tasks like code review or file exploration.
Routing criteria:
- **Task complexity**: simple lookups vs. multi-step reasoning
- **Latency requirements**: real-time responses vs. background processing
- **Cost sensitivity**: high-volume tasks benefit from cheaper models
- **Context length**: some tasks need large [[Context Window|context windows]], others don't
- **Specialization**: some models excel at code, others at creative writing or analysis
[[OpenRouter]] and [[AI Gateway]] solutions provide infrastructure for model routing, offering a unified API across multiple providers with automatic fallback, load balancing, and cost optimization.
The trade-off is routing accuracy. Misrouting a complex task to a cheap model produces bad output. Misrouting a simple task to an expensive model wastes money. Getting this right requires [[AI Observability]] to track quality per model per task type.
## References
-
## Related
- [[Receptionist AI Design Pattern]]
- [[AI Subagents]]
- [[AI Agent Orchestration]]
- [[AI Gateway]]
- [[OpenRouter]]
- [[Large Language Models (LLMs)]]
- [[Context Window]]
- [[AI Observability]]