# Unsloth

Unsloth is an open-source optimization framework for fine-tuning, training, and running [[Large Language Models (LLMs)]] locally with far less compute. Its headline claim: train and run reinforcement learning (RL) on 500+ models up to 2x faster with up to 70% less VRAM and no accuracy loss. It comes in two parts: Unsloth Core (a code library) and Unsloth Studio (a web UI).

## What it does

- **Fine-tuning and training**: full fine-tuning, RL, and pretraining in 4-bit, 16-bit, and FP8, powered by custom Triton and math kernels
- **Long context RL**: claims 7x longer context than other setups via new batching; e.g. a 20B model with >500K context on an 80GB GPU
- **MoE optimization**: claims to train [[AI Mixture of Experts (MoE)]] models 12x faster with 35% less VRAM
- **Inference and export**: download and run models (GGUF, LoRA adapters, safetensors); export to GGUF and 16-bit safetensors
- **Dynamic quants**: well known for high-quality low-bit GGUF quantizations of new [[AI Open Weight Models|open-weight]] models (the same UD-IQ2/TQ1 quants referenced for running models like [[GLM-5.2]] locally)

## Stack and ecosystem

- Python (plus TypeScript for Studio); built on PyTorch, HuggingFace transformers/TRL, and llama.cpp
- Supports Gemma 4, Qwen3.5/3.6, gpt-oss, DeepSeek, Llama 3.x, Mistral, Phi-4, and embedding and TTS models
- Works alongside [[Ollama]] and [[vLLM]] for serving
- Dual-licensed: [[Apache 2.0 License|Apache 2.0]] (core) and AGPL-3.0 (Studio UI)

## References

- https://unsloth.ai
- https://unsloth.ai/docs
- https://github.com/unslothai/unsloth

## Related

- [[Large Language Models (LLMs)]]
- [[AI Open Weight Models]]
- [[AI Mixture of Experts (MoE)]]
- [[Ollama]]
- [[vLLM]]
- [[GLM-5.2]]
- [[Apache 2.0 License]]
- [[Embeddings]]
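
## Worked example: weight memory by precision

To put the 4-bit, 16-bit, and FP8 options in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone take at each precision. It is plain arithmetic (parameters × bits per weight ÷ 8), not an Unsloth API and not a measurement from Unsloth. The function and variable names are made up for illustration, and the estimate ignores quantization metadata, activations, gradients, optimizer state, and KV cache, so real VRAM use will be higher.

```python
# Rough estimate of weight memory at different precisions.
# Assumption: weights only, no quantization scales, activations, or optimizer state.

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Nominal bits per weight for the precisions mentioned in this note.
PRECISIONS = {"16-bit": 16, "FP8": 8, "4-bit": 4}

def weight_memory_table(num_params: float) -> dict:
    """Map each precision name to its estimated weight memory in GB."""
    return {name: weight_memory_gb(num_params, bits) for name, bits in PRECISIONS.items()}

# Example: a 20B-parameter model, the size used in the long-context claim.
table = weight_memory_table(20e9)
for name, gb in table.items():
    print(f"{name}: ~{gb:.0f} GB of weights")
```

For a 20B model this gives roughly 40 GB at 16-bit, 20 GB at FP8, and 10 GB at 4-bit, which is why lower precision leaves more of an 80GB GPU free for context and training state.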