# Unsloth
Unsloth is an open-source optimization framework for fine-tuning, training, and running [[Large Language Models (LLMs)]] locally with far less compute. Its headline claim: train and run reinforcement learning (RL) on 500+ models up to 2x faster, with up to 70% less VRAM and no accuracy loss. It comes in two parts: Unsloth Core (a code library) and Unsloth Studio (a web UI).
## What it does
- **Fine-tuning and training**: full fine-tuning, RL, pretraining, in 4-bit, 16-bit, and FP8, powered by custom Triton and math kernels
- **Long context RL**: claims 7x longer context than other setups via new batching; e.g. a 20B model with >500K context on an 80GB GPU
- **MoE optimization**: claims to train [[AI Mixture of Experts (MoE)]] models 12x faster with 35% less VRAM
- **Inference and export**: download and run models (GGUF, LoRA adapters, safetensors); export to GGUF and 16-bit safetensors
- **Dynamic quants**: well known for high-quality low-bit GGUF quantizations of new [[AI Open Weight Models|open-weight]] models (the same UD-IQ2/TQ1 quants referenced for running models like [[GLM-5.2]] locally)
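
To make the precision options above concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone take at each bit width. It is plain arithmetic, not Unsloth's API: the function name `weight_memory_gb` and the `BITS_PER_PARAM` table are made up for illustration, and the estimate ignores activations, gradients, optimizer state, KV cache, and the scaling metadata that quantized formats store, so real footprints will be larger.

```python
# Bits stored per parameter for the precisions named in the list above.
# Illustrative table only; real quantized formats add some metadata overhead.
BITS_PER_PARAM = {"4-bit": 4, "fp8": 8, "16-bit": 16}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    params = 20e9  # a 20B-parameter model, the size used in the long-context bullet
    for precision in BITS_PER_PARAM:
        print(f"{precision:>6}: {weight_memory_gb(params, precision):.1f} GB")
```

Under these assumptions a 20B model needs about 10 GB of weights at 4-bit versus 40 GB at 16-bit, which is the basic reason low-bit training and inference fit on smaller GPUs.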
## Stack and ecosystem
- Python (plus TypeScript for Studio); built on PyTorch, HuggingFace transformers/TRL, and llama.cpp
- Supports Gemma 4, Qwen3.5/3.6, gpt-oss, DeepSeek, Llama 3.x, Mistral, Phi-4, embedding and TTS models
- Works alongside [[Ollama]] and [[vLLM]] for serving
- Dual-licensed: [[Apache 2.0 License|Apache 2.0]] (core) and AGPL-3.0 (Studio UI)
## References
- https://unsloth.ai
- https://unsloth.ai/docs
- https://github.com/unslothai/unsloth
## Related
- [[Large Language Models (LLMs)]]
- [[AI Open Weight Models]]
- [[AI Mixture of Experts (MoE)]]
- [[Ollama]]
- [[vLLM]]
- [[GLM-5.2]]
- [[Apache 2.0 License]]
- [[Embeddings]]