# ONNX

Open Neural Network Exchange: an open standard for representing machine learning models. Defines a common file format and operator set so models trained in one framework (PyTorch, TensorFlow, scikit-learn, etc.) can be exported and run in any compatible runtime.

Homepage: https://onnx.ai/

## Why It Exists

Before ONNX, every ML framework had its own model format. Deploying a PyTorch-trained model required PyTorch at inference time. ONNX decouples training from inference: train once in your favorite framework, deploy anywhere via [[ONNX Runtime Web]] or other runtimes.

## What It Defines

- **File format**: `.onnx` files containing model architecture and weights as protobuf
- **Operator set (opset)**: standardized math ops (Conv, MatMul, Softmax, Attention, etc.); models declare which opset version they target
- **Type system**: tensor shapes, dtypes, dynamic dimensions
- **Metadata**: producer info, IR version, custom annotations

## Ecosystem

| Tool | Purpose |
|---|---|
| ONNX Runtime | Cross-platform inference engine (server, edge, mobile, browser) |
| [[ONNX Runtime Web]] | Browser/Node.js inference |
| Optimum | Hugging Face's exporter from `transformers` to ONNX |
| Netron | Visualizer for ONNX model graphs |
| onnxruntime-genai | Specialized runtime for generative LLMs |

## Trade-offs

**Strengths:**

- True framework portability
- Heavy optimization opportunities at the runtime layer
- Standard target for hardware vendors (vendors ship ONNX execution providers)

**Weaknesses:**

- Lags behind frameworks for very new operators
- Not all PyTorch/TF features exportable cleanly
- Model conversion can introduce bugs

## Relationship to Web ML

[[Transformers.js]], [[ONNX Runtime Web]], and many other browser ML libraries use ONNX as their model format. The W3C [[WebNN API]] could in principle execute ONNX graphs (and ONNX Runtime Web has a [[WebNN API]] execution provider in development).

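## Example: What a Model Declares

The items listed under What It Defines (opset version, typed tensors with dynamic dimensions, and metadata such as producer and IR version) can be illustrated with a small toy model. This is a sketch using only the Python standard library, not the real ONNX API: actual `.onnx` files are protobuf messages, and the class and function names below (`Model`, `Node`, `TensorType`, `runtime_can_load`, `shape_matches`) are hypothetical, as are the specific version numbers and the producer name. The compatibility check is deliberately simplified compared to what a real runtime does.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple, Union

# A dimension is either a fixed size (int) or a symbolic name (str)
# standing for a dynamic dimension such as "batch".
Dim = Union[int, str]


@dataclass
class TensorType:
    dtype: str               # e.g. "float32"
    shape: Tuple[Dim, ...]   # e.g. ("batch", 3, 224, 224)


@dataclass
class Node:
    op_type: str             # operator name from the opset, e.g. "Conv"
    inputs: List[str]
    outputs: List[str]


@dataclass
class Model:
    ir_version: int          # metadata: version of the file format itself
    opset_version: int       # opset the model targets
    producer_name: str       # metadata: which tool exported the model
    inputs: Dict[str, TensorType]
    nodes: List[Node] = field(default_factory=list)


def runtime_can_load(model: Model, max_opset: int, supported_ops: Set[str]) -> bool:
    """Toy check: the runtime must cover the declared opset and every operator used."""
    if model.opset_version > max_opset:
        return False
    return all(node.op_type in supported_ops for node in model.nodes)


def shape_matches(declared: Tuple[Dim, ...], actual: Tuple[int, ...]) -> bool:
    """A symbolic (str) dimension accepts any size; a fixed one must match exactly."""
    if len(declared) != len(actual):
        return False
    return all(isinstance(d, str) or d == a for d, a in zip(declared, actual))


# Illustrative values only; the producer name is made up.
model = Model(
    ir_version=8,
    opset_version=17,
    producer_name="example-exporter",
    inputs={"image": TensorType("float32", ("batch", 3, 224, 224))},
    nodes=[
        Node("Conv", ["image", "w"], ["conv_out"]),
        Node("Softmax", ["conv_out"], ["probs"]),
    ],
)

print(runtime_can_load(model, max_opset=21, supported_ops={"Conv", "MatMul", "Softmax"}))
print(shape_matches(model.inputs["image"].shape, (8, 3, 224, 224)))  # any batch size fits
print(shape_matches(model.inputs["image"].shape, (8, 1, 224, 224)))  # channel count is fixed
```

The toy check fails when a model targets a newer opset than the runtime covers, or uses an operator the runtime lacks, which mirrors the "lags behind frameworks for very new operators" weakness noted under Trade-offs.
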
## References

- https://onnx.ai/
- https://github.com/onnx/onnx

## Related

- [[GPT-Generated Unified Format (GGUF)]]
- [[Safetensors]]
- [[Georgi Gerganov Machine Learning (GGML)]]
- [[ONNX Runtime Web]]
- [[Transformers.js]]
- [[Machine Learning (ML)]]
- [[Neural Networks (NNs)]]
- [[AI Inference]]
- [[AI Quantization]]
- [[WebNN API]]
- [[On-Device Machine Learning]]