# Georgi Gerganov Machine Learning (GGML)

GGML is the C/C++ tensor library that [[Georgi Gerganov]] built to run machine learning models efficiently on ordinary hardware, CPUs included. The name is just his initials plus "ML". It started as the engine behind [[llama.cpp]] and whisper.cpp, and for a while "GGML" also named the single-file model format those projects shipped, before [[GPT-Generated Unified Format (GGUF)|GGUF]] replaced it. So GGML is really two things people tend to conflate: the library (still very much alive) and the old file format (retired).

## Why it matters

- **It made local inference practical**: no third-party dependencies, no memory allocations at runtime, and built-in integer [[AI Quantization|quantization]] to shrink models enough to fit on a laptop
- **It seeded the ecosystem**: [[llama.cpp]], [[Ollama]], and the rest of the local stack all stand on it. The whole [[AI Open Weight Models|open-weight]] scene you run yourself traces back here
- **The format part is history**: the old GGML files couldn't carry enough metadata and kept breaking compatibility between versions, which is exactly the problem [[GPT-Generated Unified Format (GGUF)|GGUF]] was designed to fix

## References

- https://github.com/ggml-org/ggml

## Related

- [[GPT-Generated Unified Format (GGUF)]]
- [[Safetensors]]
- [[Georgi Gerganov]]
- [[llama.cpp]]
- [[AI Open Weight Models]]
- [[Ollama]]
- [[AI Quantization]]
- [[Large Language Models (LLMs)]]
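
## Sketch: block-wise integer quantization

The integer quantization mentioned under "Why it matters" can be illustrated with a minimal sketch. It splits the weights into fixed-size blocks and stores one scale plus 8-bit integers per block. This is not GGML's own code or on-disk layout: the names `block_q8`, `quantize_q8`, `dequantize_q8`, and `BLOCK_SIZE` are hypothetical, and the float scale and block size of 32 are simplifying assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 32 /* values per block; an illustrative assumption */

/* Hypothetical block layout: one scale plus BLOCK_SIZE signed 8-bit values. */
typedef struct {
    float  scale;
    int8_t q[BLOCK_SIZE];
} block_q8;

/* Quantize n floats into n / BLOCK_SIZE blocks (n must be a multiple of BLOCK_SIZE). */
void quantize_q8(const float *src, block_q8 *dst, size_t n) {
    assert(n % BLOCK_SIZE == 0);
    for (size_t b = 0; b < n / BLOCK_SIZE; b++) {
        const float *x = src + b * BLOCK_SIZE;

        /* Largest magnitude in the block decides the scale. */
        float amax = 0.0f;
        for (int i = 0; i < BLOCK_SIZE; i++) {
            float a = x[i] < 0.0f ? -x[i] : x[i];
            if (a > amax) amax = a;
        }

        float scale = amax / 127.0f;
        float inv   = scale > 0.0f ? 1.0f / scale : 0.0f; /* all-zero block */
        dst[b].scale = scale;

        /* Map each value into [-127, 127], rounding to the nearest integer. */
        for (int i = 0; i < BLOCK_SIZE; i++) {
            float v = x[i] * inv;
            dst[b].q[i] = (int8_t)(v + (v >= 0.0f ? 0.5f : -0.5f));
        }
    }
}

/* Reconstruct approximate floats from the quantized blocks. */
void dequantize_q8(const block_q8 *src, float *dst, size_t n) {
    assert(n % BLOCK_SIZE == 0);
    for (size_t b = 0; b < n / BLOCK_SIZE; b++) {
        for (int i = 0; i < BLOCK_SIZE; i++) {
            dst[b * BLOCK_SIZE + i] = src[b].scale * (float)src[b].q[i];
        }
    }
}
```

With a 4-byte `float`, each block in this sketch takes 36 bytes where the original 32 values took 128, and each reconstructed value is off by at most about half a scale step. Using one scale per block rather than one per tensor keeps a single outlier from costing precision everywhere else.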