# Mistral Large 3

State-of-the-art open-weight general-purpose multimodal model by [[Mistral AI]], released December 2, 2025. First Mistral mixture-of-experts model since the Mixtral series. Released under the [[Apache 2.0 License]] (both base and instruct versions).

## Architecture

- Sparse Mixture of Experts (MoE): 41B active parameters, 675B total parameters
- 256k context window
- Multimodal: text and image inputs
- Trained from scratch on 3,000 NVIDIA H200 GPUs

## Performance

- Debuted at #2 in OSS non-reasoning models (#6 overall) on the LMArena leaderboard
- Parity with the best instruction-tuned open-weight models on general prompts
- Best-in-class performance on multilingual conversations (non-English/Chinese)
- Image understanding capabilities

## Deployment

- Available in NVFP4 compressed format (built with llm-compressor)
- Runs on a single 8xA100 or 8xH100 node via vLLM
- Optimized for TensorRT-LLM and SGLang
- Support for prefill/decode disaggregated serving and speculative decoding on GB200 NVL72
- API pricing: $0.5/M input tokens, $1.5/M output tokens

## Features

- Chat completions, function calling, agents
- Structured outputs, predicted outputs
- OCR, document QnA, bounding box extraction
- Fill-in-the-middle (FIM), embeddings
- Audio transcription, moderation
- Batch inference

## Availability

- Mistral AI Studio
- Amazon Bedrock, Azure Foundry
- Hugging Face
- OpenRouter, Fireworks, Together AI, Modal, IBM WatsonX

## References

- https://mistral.ai/news/mistral-3
- https://docs.mistral.ai/models/mistral-large-3-25-12
- https://huggingface.co/collections/mistralai/mistral-large-3

## Related

- [[Mistral AI]]
- [[Large Language Models (LLMs)]]
- [[Mistral Small 4]]
- [[Apache 2.0 License]]
- [[Mistral OCR]]
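
The parameter counts under Architecture and the pricing under Deployment lend themselves to some quick back-of-the-envelope arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, the weight-memory estimate assumes a flat number of bits per parameter (it ignores quantization scale factors, any layers kept at higher precision, KV cache, and activations), and the 80 GB per GPU figure is an assumption that this note does not state.

```python
# Figures copied from this note; all helper names are hypothetical.
INPUT_USD_PER_M = 0.5    # USD per million input tokens
OUTPUT_USD_PER_M = 1.5   # USD per million output tokens
ACTIVE_PARAMS_B = 41     # active parameters per token, in billions
TOTAL_PARAMS_B = 675     # total parameters, in billions


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in USD for a request at the listed prices."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def active_fraction() -> float:
    """Share of the total parameters that are active for each token."""
    return ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def weight_memory_gb(bits_per_param: float) -> float:
    """Rough weight footprint in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width,
    with no overhead for scale factors or unquantized layers.
    """
    return TOTAL_PARAMS_B * bits_per_param / 8


if __name__ == "__main__":
    # A long-context request: 200k input tokens, 4k output tokens.
    print(f"cost: ${estimate_cost_usd(200_000, 4_000):.3f}")
    print(f"active share: {active_fraction():.1%}")

    # Assumption: 8 GPUs with 80 GB each (not stated in this note).
    node_gb = 8 * 80
    for bits in (16, 4):
        gb = weight_memory_gb(bits)
        print(f"{bits}-bit weights: {gb:.1f} GB, fits in {node_gb} GB: {gb < node_gb}")
```

Under these assumptions, roughly 6% of the parameters are active per token, and 4-bit weights come to about half the capacity of an 8x80 GB node while 16-bit weights would exceed it. That is consistent with the single-node claim for the NVFP4 checkpoint, though real deployments also need headroom for the KV cache and activations.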