# Mistral Large 3
State-of-the-art open-weight general-purpose multimodal model by [[Mistral AI]], released December 2, 2025. First Mistral mixture-of-experts model since the Mixtral series. Released under the [[Apache 2.0 License]] (both base and instruct versions).
## Architecture
- Sparse Mixture of Experts (MoE): 41B active parameters, 675B total parameters
- 256k context window
- Multimodal: text and image inputs
- Trained from scratch on 3,000 NVIDIA H200 GPUs
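
The parameter counts above imply how sparse the model is per token and roughly how much memory the weights need. The following is a back-of-the-envelope sketch, not an official sizing guide. The bit widths and helper names are illustrative assumptions, and the estimate ignores KV cache, activations, and quantization scale overhead.

```python
# Rough sizing arithmetic from the published parameter counts.
# Assumption: every parameter is stored at the same bit width, with no
# allowance for KV cache, activations, or quantization scale factors.

TOTAL_PARAMS = 675e9   # total parameters (from the spec above)
ACTIVE_PARAMS = 41e9   # parameters active per token (from the spec above)


def active_fraction(active: float = ACTIVE_PARAMS, total: float = TOTAL_PARAMS) -> float:
    """Share of the weights used for any single token."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Weight storage in decimal gigabytes at a given precision."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    print(f"active fraction: {active_fraction():.1%}")
    # Hypothetical precisions chosen for illustration only.
    for name, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{name}: {weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
```

Only about 6% of the weights take part in each token, which is why compute cost tracks the 41B figure while memory cost tracks the 675B figure.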
## Performance
- Debuted at #2 among OSS non-reasoning models (#6 among OSS models overall) on the LMArena leaderboard
- Parity with the best instruction-tuned open-weight models on general prompts
- Best-in-class performance on multilingual conversations (languages other than English and Chinese)
- Image understanding capabilities
## Deployment
- Available as an NVFP4-quantized checkpoint (built with llm-compressor)
- The NVFP4 checkpoint runs on a single 8xA100 or 8xH100 node via vLLM
- Optimized for TensorRT-LLM and SGLang
- Support for prefill/decode disaggregated serving and speculative decoding on GB200 NVL72
- API pricing: $0.5/M input tokens, $1.5/M output tokens
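
The listed API prices translate into per-request cost with simple arithmetic. This is a minimal sketch under the assumption that billing is linear in token count with no minimums, discounts, or batch pricing. The function name and the example token counts are hypothetical.

```python
# Per-request cost from the listed prices.
# Assumption: cost is strictly linear in tokens; real billing may differ.

INPUT_USD_PER_M = 0.5    # USD per million input tokens (from the list above)
OUTPUT_USD_PER_M = 1.5   # USD per million output tokens (from the list above)


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a long document prompt with a short answer.
    print(f"${request_cost_usd(200_000, 4_000):.4f}")
```

Output tokens cost three times as much as input tokens, so long-context workloads with short answers are priced mostly by their prompts.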
## Features
- Chat completions, function calling, agents
- Structured outputs, predicted outputs
- OCR, document QnA, bounding box extraction
- Fill-in-the-middle (FIM), embeddings
- Audio transcription, moderation
- Batch inference
## Availability
- Mistral AI Studio
- Amazon Bedrock, Azure Foundry
- Hugging Face
- OpenRouter, Fireworks, Together AI, Modal, IBM WatsonX
## References
- https://mistral.ai/news/mistral-3
- https://docs.mistral.ai/models/mistral-large-3-25-12
- https://huggingface.co/collections/mistralai/mistral-large-3
## Related
- [[Mistral AI]]
- [[Large Language Models (LLMs)]]
- [[Mistral Small 4]]
- [[Mistral OCR]]
- [[Apache 2.0 License]]