# ONNX
Open Neural Network Exchange (ONNX) is an open standard for representing machine learning models. It defines a common file format and operator set so that models trained in one framework (PyTorch, TensorFlow, scikit-learn, etc.) can be exported and run in any runtime that supports the operators the model uses.
Homepage: https://onnx.ai/
## Why It Exists
Before ONNX, every ML framework had its own model format, so deploying a PyTorch-trained model generally meant shipping PyTorch at inference time. ONNX decouples training from inference: train in the framework of your choice, export once, and deploy via ONNX Runtime, [[ONNX Runtime Web]], or another compatible runtime.
## What It Defines
- **File format**: `.onnx` files that serialize the model graph (architecture) and weights with Protocol Buffers (protobuf)
- **Operator set (opset)**: standardized math ops (Conv, MatMul, Softmax, Attention, etc.); models declare which opset version they target
- **Type system**: tensor shapes, dtypes, dynamic dimensions
- **Metadata**: producer info, IR version, custom annotations
## Ecosystem
| Tool | Purpose |
|---|---|
| ONNX Runtime | Cross-platform inference engine (server, edge, mobile, browser) |
| [[ONNX Runtime Web]] | Browser/Node.js inference |
| Optimum | Hugging Face's exporter from `transformers` to ONNX |
| Netron | Visualizer for ONNX model graphs |
| onnxruntime-genai | Specialized runtime for generative LLMs |
## Trade-offs
**Strengths:**
- True framework portability
- Heavy optimization opportunities at the runtime layer
- Standard target for hardware vendors (vendors ship ONNX execution providers)
**Weaknesses:**
- Opset support lags behind frameworks for very new operators
- Not all PyTorch/TensorFlow features export cleanly
- Model conversion can introduce bugs, so exported models should be validated against the original
## Relationship to Web ML
[[Transformers.js]], [[ONNX Runtime Web]], and many other browser ML libraries use ONNX as their model format. The W3C [[WebNN API]] could in principle execute ONNX graphs, and ONNX Runtime Web has a WebNN execution provider in development.
## References
- https://onnx.ai/
- https://github.com/onnx/onnx
## Related
- [[GPT-Generated Unified Format (GGUF)]]
- [[Safetensors]]
- [[Georgi Gerganov Machine Learning (GGML)]]
- [[ONNX Runtime Web]]
- [[Transformers.js]]
- [[Machine Learning (ML)]]
- [[Neural Networks (NNs)]]
- [[AI Inference]]
- [[AI Quantization]]
- [[WebNN API]]
- [[On-Device Machine Learning]]