# Georgi Gerganov Machine Learning (GGML)
GGML is the C/C++ tensor library that [[Georgi Gerganov]] built to run machine learning models efficiently on ordinary hardware, CPUs included. The name is just his initials plus "ML". It started as the engine behind [[llama.cpp]] and whisper.cpp, and for a while "GGML" also named the single-file model format those projects shipped, before [[GPT-Generated Unified Format (GGUF)|GGUF]] replaced it.
So GGML is really two things people tend to conflate: the library (still very much alive) and the old file format (retired).
## Why it matters
- **It made local inference practical**: no third-party dependencies, no memory allocations at runtime, and built-in integer [[AI Quantization|quantization]] that stores weights in a few bits each instead of 16 or 32, shrinking models enough to fit in a laptop's memory
- **It seeded the ecosystem**: [[llama.cpp]] and whisper.cpp are built directly on it, and tools like [[Ollama]] build on llama.cpp in turn. Much of the [[AI Open Weight Models|open-weight]] scene you run yourself traces back here
- **The format part is history**: the old GGML files couldn't carry enough metadata and kept breaking compatibility between versions, which is exactly the problem [[GPT-Generated Unified Format (GGUF)|GGUF]] was designed to fix
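
To make the quantization point concrete, here is a minimal Python sketch of block-wise 8-bit integer quantization. It is an illustration of the general idea, not GGML's actual code: the block size of 32, the single per-block scale, and every function name (`quantize_block`, `dequantize`, and so on) are assumptions made for this example.

```python
import math

BLOCK_SIZE = 32  # assumption: number of weights that share one scale factor


def quantize_block(values):
    """Map one block of floats to (scale, list of ints in [-127, 127])."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        # An all-zero block needs no scale.
        return 0.0, [0] * len(values)
    scale = amax / 127.0
    quants = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from one quantized block."""
    return [scale * q for q in quants]


def quantize(values):
    """Split a flat list of weights into blocks and quantize each one."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into a flat list."""
    out = []
    for scale, quants in blocks:
        out.extend(dequantize_block(scale, quants))
    return out


def bytes_per_block(bits_per_weight=8, scale_bytes=2):
    """Storage for one block: packed integers plus one 16-bit scale (assumed)."""
    return BLOCK_SIZE * bits_per_weight // 8 + scale_bytes


if __name__ == "__main__":
    weights = [math.sin(i) for i in range(64)]  # stand-in for real weights
    restored = dequantize(quantize(weights))
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("max round-trip error:", worst)
    print("bytes per block, float32:", BLOCK_SIZE * 4)
    print("bytes per block, 8-bit quantized:", bytes_per_block())
```

Under these assumptions a block of 32 weights drops from 128 bytes as 32-bit floats to 34 bytes, at the cost of a rounding error of at most half a scale step per weight. Lower bit widths trade more accuracy for more savings in the same way.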
## References
- https://github.com/ggml-org/ggml
## Related
- [[GPT-Generated Unified Format (GGUF)]]
- [[Safetensors]]
- [[Georgi Gerganov]]
- [[llama.cpp]]
- [[AI Open Weight Models]]
- [[Ollama]]
- [[AI Quantization]]
- [[Large Language Models (LLMs)]]