# NVIDIA AI-Q
NVIDIA AI-Q is an open-source blueprint for building enterprise-grade deep research [[AI Agents]], part of the [[NVIDIA Agent Toolkit]]. It is a [[LangGraph]]-based state machine that connects to enterprise data, reasons using state-of-the-art models, and delivers citation-backed business insights. It is built on top of the [[NVIDIA NeMo Agent Toolkit]].
## Architecture
Three core components form a pipeline:
1. **Orchestrator**: classifies intent, sets research depth (shallow vs deep), coordinates the research loop, and writes the final report
2. **Planner** (2-phase): a Scout subagent maps the information landscape, then an Architect subagent designs a research plan with outline, queries, and quality constraints
3. **Researcher**: dispatches 5 parallel specialist subagents (Evidence Gatherer, Mechanism Explorer, Comparator, Critic, Horizon Scanner)
Key design: each subagent works within its own context window and returns only synthesized output. The orchestrator never sees raw search results, preventing noisy data from degrading reasoning.
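The context-isolation idea above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the blueprint's actual code: `Finding`, `fake_search`, `run_specialist`, `research`, and `write_report` are hypothetical names, the search tool is a stub, and `asyncio` stands in for whatever dispatch mechanism the real LangGraph state machine uses. The point it shows is that raw search hits stay inside each subagent's scope and only a synthesized summary reaches the orchestrator.

```python
import asyncio
from dataclasses import dataclass

# Specialist roles named in the Researcher item above.
SPECIALISTS = [
    "Evidence Gatherer",
    "Mechanism Explorer",
    "Comparator",
    "Critic",
    "Horizon Scanner",
]


@dataclass
class Finding:
    """The only thing a subagent hands back: a synthesis plus citations."""
    role: str
    summary: str
    citations: list[str]


async def fake_search(query: str) -> list[str]:
    """Hypothetical stand-in for a search tool; returns noisy raw hits."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return [f"raw hit {i} for {query!r}" for i in range(3)]


async def run_specialist(role: str, query: str) -> Finding:
    """Run one subagent in its own scope; raw hits never leave this function."""
    raw_hits = await fake_search(f"{role}: {query}")
    # A real subagent would call a model here; this sketch only counts hits.
    summary = f"{role} reviewed {len(raw_hits)} sources on {query!r}"
    citation = f"source-{role.lower().replace(' ', '-')}"
    return Finding(role=role, summary=summary, citations=[citation])


async def research(query: str) -> list[Finding]:
    """Dispatch all specialists in parallel and collect their syntheses."""
    return await asyncio.gather(*(run_specialist(r, query) for r in SPECIALISTS))


def write_report(query: str, findings: list[Finding]) -> str:
    """Orchestrator step: builds the report from summaries only."""
    lines = [f"Report: {query}"]
    lines += [f"- {f.summary} [{', '.join(f.citations)}]" for f in findings]
    return "\n".join(lines)
```

Calling `asyncio.run(research("some topic"))` and passing the result to `write_report` yields one bullet per specialist. Because `write_report` only receives `Finding` objects, there is no code path by which raw hits could enter the orchestrator's context.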
## Hybrid model approach
Frontier models handle orchestration while [[NVIDIA Nemotron]] handles research tasks. This cuts query costs by more than 50% compared to frontier-only approaches. The default configuration uses Nemotron 3 Nano 30B for agents and GPT-OSS-120B for deep research orchestration. Nemotron 3 Super 120B is available as a higher-quality option.
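The cost argument behind per-role model routing can be illustrated with a small calculation. Everything numeric here is an assumption made up for the sketch: the prices, the token counts, and the choice of baseline. The model labels are copied from the text above and are not guaranteed to match real API model identifiers. `query_cost` is a hypothetical helper, not part of any NVIDIA toolkit.

```python
# Role -> model label, following the defaults described above.
# Labels are for illustration and may not match real API identifiers.
DEFAULT_MODELS = {
    "orchestrator": "GPT-OSS-120B",
    "agent": "Nemotron 3 Nano 30B",
}

# Hypothetical prices in USD per million tokens (invented for this sketch).
PRICE_PER_MTOK = {
    "GPT-OSS-120B": 10.0,
    "Nemotron 3 Nano 30B": 1.0,
}


def query_cost(tokens_by_role: dict[str, int], models: dict[str, str]) -> float:
    """Sum the cost of one query given token usage per role."""
    return sum(
        tokens / 1_000_000 * PRICE_PER_MTOK[models[role]]
        for role, tokens in tokens_by_role.items()
    )


# Hypothetical usage: research agents consume most of the tokens.
usage = {"orchestrator": 200_000, "agent": 800_000}

hybrid = query_cost(usage, DEFAULT_MODELS)
# Baseline: route every role to the larger model.
single = query_cost(usage, {role: "GPT-OSS-120B" for role in usage})
savings = 1 - hybrid / single

print(f"hybrid=${hybrid:.2f} single=${single:.2f} savings={savings:.0%}")
```

The result depends entirely on the assumed prices and token split, so it should not be read as reproducing the 50% figure. It only shows why savings grow when the bulk of tokens are spent by research agents running on the cheaper model.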
## Performance
AI-Q ranked #1 on both the DeepResearch Bench I and II leaderboards. Fine-tuning used ~67k trajectories, filtered down from ~80k with principle-based filtering. Training ran for 1 epoch on 128 NVIDIA H100 GPUs.
## Deployment
Deployment options: CLI, web UI (FastAPI + Next.js), Docker Compose, Kubernetes (Helm charts), and Jupyter notebooks. No local GPU is required when using the NVIDIA API Catalog.
## References
- https://build.nvidia.com/nvidia/aiq
- https://github.com/NVIDIA-AI-Blueprints/aiq
## Related
- [[NVIDIA Agent Toolkit]]
- [[NVIDIA NeMo Agent Toolkit]]
- [[NVIDIA Nemotron]]
- [[AI Agents]]
- [[AI Agent Orchestration]]
- [[Retrieval-Augmented Generation (RAG)]]
- [[LangGraph]]
- [[LangChain]]
- [[Large Language Models (LLMs)]]