Liquid Foundation Model 2 (LFM2)

# Liquid Foundation Model 2 (LFM2)

LFM2 is a family of hybrid language models developed by Liquid AI, designed specifically for on-device deployment. Released on July 10, 2025, it sets a new standard in quality, speed, and memory efficiency for small models.

## Architecture

LFM2 uses a hybrid design combining convolution and attention mechanisms, derived from Liquid Time-constant Networks (LTCs) and Linear Input-Varying (LIV) operators. The architecture was discovered using STAR, Liquid AI's neural architecture search engine.

Each model uses 16 total blocks: 10 double-gated short-range convolution blocks and 6 grouped query attention (GQA) blocks. The design uses multiplicative gates and short convolutions (linear first-order systems).

## Model Sizes

Three dense checkpoints:

- **LFM2-350M**: 350M parameters
- **LFM2-700M**: 700M parameters
- **LFM2-1.2B**: 1.2B parameters

A larger Mixture of Experts variant also exists:

- **LFM2-24B-A2B**: 24B total parameters, only 2B active per token. Fits in 32 GB of RAM. Available on [[Ollama]].

All models support a 32K context window.
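
The parameter counts above translate into a rough memory budget for the weights. The sketch below is a back-of-the-envelope estimate, not a published figure. The bit widths (16, 8, 4) are illustrative assumptions, since the note does not say which precision the 32 GB claim refers to. The helper name `weight_memory_gb` is hypothetical. The estimate ignores the KV cache, activations, and runtime overhead, and it assumes all experts of the MoE variant are kept in memory.

```python
# Rough weight-only memory estimate for the LFM2 checkpoints.
# Assumptions: nominal parameter counts from the list above, 1 GB = 1e9 bytes,
# and illustrative bit widths that are not taken from the source.
GB = 1e9

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Return the memory needed to store the weights alone, in GB."""
    return num_params * bits_per_weight / 8 / GB

MODELS = {
    "LFM2-350M": 350e6,
    "LFM2-700M": 700e6,
    "LFM2-1.2B": 1.2e9,
    "LFM2-24B-A2B": 24e9,  # total parameters, not the 2B active per token
}

for name, params in MODELS.items():
    sizes = ", ".join(
        f"{bits}-bit: {weight_memory_gb(params, bits):.2f} GB"
        for bits in (16, 8, 4)
    )
    print(f"{name}: {sizes}")
```

Under these assumptions, the 24B variant needs about 48 GB at 16 bits per weight, 24 GB at 8 bits, and 12 GB at 4 bits, so the 32 GB figure is only reachable with some form of reduced-precision weights.
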
## Training

- Trained on 10 trillion tokens (~75% English, 20% multilingual, 5% code)
- Context length extended to 32K during pretraining
- Uses knowledge distillation with LFM1-7B as teacher model
- Post-training: large-scale supervised fine-tuning + custom Direct Preference Optimization with length normalization
- 3x faster training compared to first-generation LFM

## Performance

- 2x faster decode and prefill vs Qwen3 and Gemma 3 on CPU
- LFM2-24B-A2B: 112 tokens/second on AMD CPU, 293 tokens/second on H100 GPU
- LFM2-1.2B performs competitively with Qwen3-1.7B (47% larger)
- LFM2-700M outperforms Gemma 3 1B IT
- Benchmarked on MMLU, GPQA, IFEval, IFBench, GSM8K, MGSM, MMMLU
- Multilingual support: Japanese, Arabic, Korean, Spanish, French, German

## Capabilities

- Instruction following
- Function calling (optimized for AI agents)
- Multilingual support across 7 languages

## Deployment

Runs on CPU, GPU, and NPU. Supported inference frameworks:

- [[Ollama]]
- [[LM Studio]]
- llama.cpp
- ExecuTorch (PyTorch ecosystem)

Compatible with Qualcomm Snapdragon and AMD Ryzen processors. Target devices: smartphones, laptops, vehicles, robots, wearables.

## Licensing

Apache 2.0-based [[Open Source]] license. Free for academic/research use and commercial use under $10M revenue. Licensing required above that threshold.

## References

- https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
- https://ollama.com/library/lfm2

## Related

- [[Ollama]]
- [[LM Studio]]
- [[Large Language Models (LLMs)]]
- [[Artificial Intelligence (AI)]]
- [[Machine Learning (ML)]]
- [[Open Source]]
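
## Example: Querying LFM2 through Ollama

The Deployment section lists [[Ollama]] as a supported runtime. As a minimal sketch, the script below sends one prompt to a locally running Ollama server through its `/api/generate` endpoint, using only the Python standard library. The model tag `lfm2` is an assumption taken from the library URL under References, and a specific size tag may be needed in practice. The helper names `build_generate_request` and `generate` are hypothetical, and the server is assumed to listen on Ollama's default port, 11434.

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(prompt: str, model: str = "lfm2") -> urllib.request.Request:
    """Build a non-streaming generate request (the model tag is an assumption)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str, model: str = "lfm2") -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]

# Example call; requires a running Ollama server with the model pulled:
# print(generate("Explain grouped query attention in one sentence."))
```

Splitting request construction from sending keeps the payload inspectable without a running server, which is convenient when checking the model tag or prompt before making a network call.
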