# Liquid Foundation Model 2 (LFM2)
LFM2 is a family of hybrid language models developed by Liquid AI, designed specifically for on-device deployment. Released on July 10, 2025, it is positioned by Liquid AI as setting a new standard in quality, speed, and memory efficiency among small models.
## Architecture
LFM2 uses a hybrid design combining convolution and attention mechanisms, derived from Liquid Time-constant Networks (LTCs) and Linear Input-Varying (LIV) operators. The architecture was discovered using STAR, Liquid AI's neural architecture search engine.
Each dense checkpoint uses 16 blocks in total: 10 double-gated short-range convolution blocks and 6 grouped query attention (GQA) blocks. The convolution blocks rely on multiplicative gates and short convolutions, which act as linear first-order systems.
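The double-gated short convolution described above can be sketched in plain Python. This is a minimal illustration, not Liquid AI's implementation: it assumes the gates are linear projections of the input applied before and after a causal per-channel convolution, and the weight names (`w_b`, `w_c`, `w_h`, `w_out`), the kernel length, and the dimensions are all hypothetical.

```python
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for the demo."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def causal_depthwise_conv(seq, kernel):
    """Per-channel causal convolution over a sequence of vectors.

    kernel[j] holds the per-channel weights applied to the input j steps
    in the past, so step t only sees steps t - len(kernel) + 1 .. t.
    """
    out = []
    for t in range(len(seq)):
        acc = [0.0] * len(seq[0])
        for j, taps in enumerate(kernel):
            if t - j >= 0:
                acc = [a + w * v for a, w, v in zip(acc, taps, seq[t - j])]
        out.append(acc)
    return out


def gated_short_conv_block(seq, w_b, w_c, w_h, w_out, kernel):
    """Assumed structure: input gate -> short causal conv -> output gate -> projection."""
    b = [matvec(w_b, x) for x in seq]  # first multiplicative gate (input-dependent)
    c = [matvec(w_c, x) for x in seq]  # second multiplicative gate (input-dependent)
    h = [matvec(w_h, x) for x in seq]  # values to be mixed over time
    gated_in = [[bi * hi for bi, hi in zip(bt, ht)] for bt, ht in zip(b, h)]
    mixed = causal_depthwise_conv(gated_in, kernel)
    gated_out = [[ci * mi for ci, mi in zip(ct, mt)] for ct, mt in zip(c, mixed)]
    return [matvec(w_out, y) for y in gated_out]


rng = random.Random(0)
dim, kernel_len, steps = 4, 3, 6  # illustrative sizes only
weights = [random_matrix(dim, dim, rng) for _ in range(4)]
kernel = random_matrix(kernel_len, dim, rng)
sequence = random_matrix(steps, dim, rng)
output = gated_short_conv_block(sequence, *weights, kernel)
print(len(output), len(output[0]))
```

Because the convolution only looks a few steps back, each output depends on a fixed-size window of past inputs, which is what keeps per-token state small compared with an attention block's growing cache.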
## Model Sizes
Three dense checkpoints:
- **LFM2-350M**: 350M parameters
- **LFM2-700M**: 700M parameters
- **LFM2-1.2B**: 1.2B parameters
A larger Mixture of Experts variant also exists:
- **LFM2-24B-A2B**: 24B total parameters, only 2B active per token. Fits in 32 GB of RAM. Available on [[Ollama]].
All models support a 32K context window.
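The LFM2-24B-A2B figures above lend themselves to a quick back-of-the-envelope check. The sketch below estimates weights-only memory at a few precisions and the share of parameters active per token. The bit widths are assumptions chosen for illustration, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
TOTAL_PARAMS = 24e9   # total parameters, from the list above
ACTIVE_PARAMS = 2e9   # parameters active per token, from the list above


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Weights-only footprint in GB (10^9 bytes); excludes KV cache and overhead."""
    return num_params * bits_per_param / 8 / 1e9


# Assumed precisions for illustration; the source does not say which one applies.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active per token: {active_fraction:.1%}")
```

Under these assumptions, 16-bit weights alone would exceed 32 GB, so the 32 GB figure would be consistent with 8-bit or lower quantization, though the source does not state which precision it refers to.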
## Training
- Trained on 10 trillion tokens (~75% English, 20% multilingual, 5% code)
- Context length extended to 32K during pretraining
- Uses knowledge distillation with LFM1-7B as the teacher model
- Post-training: large-scale supervised fine-tuning + custom Direct Preference Optimization with length normalization
- 3x faster training compared to first-generation LFM
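The post-training step mentions Direct Preference Optimization with length normalization. The source does not give Liquid AI's custom formulation, so the sketch below shows one common length-normalized variant as an assumption: each response's policy-versus-reference log-probability ratio is divided by its token count before the usual DPO logistic loss. The function name, argument names, and the default `beta` are hypothetical.

```python
import math


def length_normalized_dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    chosen_len: int,
    rejected_len: int,
    beta: float = 0.1,  # assumed value, not from the source
) -> float:
    """Loss for one preference pair, given summed sequence log-probabilities."""
    # Per-token log-ratio of policy to reference for each response.
    chosen_ratio = (policy_chosen_logp - ref_chosen_logp) / chosen_len
    rejected_ratio = (policy_rejected_logp - ref_rejected_logp) / rejected_len
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# A policy identical to the reference gives a zero margin.
print(length_normalized_dpo_loss(-40.0, -90.0, -40.0, -90.0, 20, 60))
```

Dividing by length keeps a long response from dominating the margin simply because it has more tokens, which is the usual motivation for this kind of normalization.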
## Performance
- 2x faster decode and prefill vs Qwen3 and Gemma 3 on CPU
- LFM2-24B-A2B: 112 tokens/second on AMD CPU, 293 tokens/second on H100 GPU
- LFM2-1.2B performs competitively with Qwen3-1.7B, which has 47% more parameters
- LFM2-700M outperforms Gemma 3 1B IT
- Benchmarked on MMLU, GPQA, IFEval, IFBench, GSM8K, MGSM, MMMLU
- Multilingual support: Japanese, Arabic, Korean, Spanish, French, German
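To put the LFM2-24B-A2B throughput figures above in practical terms, the sketch below converts tokens per second into approximate generation time. It assumes a steady decode rate for a single stream and ignores prefill, so it is a rough illustration rather than a benchmark. The 1,000-token response length is an arbitrary example.

```python
# Decode throughput in tokens/second, taken from the list above.
THROUGHPUT = {"AMD CPU": 112.0, "H100 GPU": 293.0}


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a constant decode rate (prefill ignored)."""
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 1000  # arbitrary example length
for device, rate in THROUGHPUT.items():
    print(f"{device}: {decode_seconds(RESPONSE_TOKENS, rate):.1f} s")

gpu_over_cpu = THROUGHPUT["H100 GPU"] / THROUGHPUT["AMD CPU"]
print(f"GPU/CPU ratio: {gpu_over_cpu:.1f}x")
```

At these rates a 1,000-token response takes roughly 9 seconds on the CPU and a little over 3 seconds on the GPU.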
## Capabilities
- Instruction following
- Function calling (optimized for AI agents)
- Multilingual support across 7 languages (English plus the six listed under Performance)
## Deployment
Runs on CPU, GPU, and NPU. Supported inference frameworks:
- [[Ollama]]
- [[LM Studio]]
- llama.cpp
- ExecuTorch (PyTorch ecosystem)
Compatible with Qualcomm Snapdragon and AMD Ryzen processors. Target devices: smartphones, laptops, vehicles, robots, wearables.
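As an illustration of local deployment, the sketch below queries a model through Ollama's local HTTP API using only the Python standard library. It assumes an Ollama server running on its default port and that the model is available under the tag `lfm2`, which is inferred from the library URL in the references and may differ for specific sizes. The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
DEFAULT_MODEL = "lfm2"  # assumed tag, inferred from the Ollama library URL


def build_generate_request(prompt: str, model: str = DEFAULT_MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request without sending it."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = DEFAULT_MODEL, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running and the model pulled, `generate("Summarize LFM2 in one sentence.")` would return the completion as a string. Building the request separately makes it possible to inspect the payload without any network access.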
## Licensing
Released under an [[Open Source]] license based on Apache 2.0. Free for academic and research use, and for commercial use by companies with under $10M in revenue. Companies above that threshold need a separate license.
## References
- https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
- https://ollama.com/library/lfm2
## Related
- [[Ollama]]
- [[LM Studio]]
- [[Large Language Models (LLMs)]]
- [[Artificial Intelligence (AI)]]
- [[Machine Learning (ML)]]
- [[Open Source]]