
# NVIDIA AI-Q

NVIDIA AI-Q is an open-source blueprint for building enterprise-grade deep research [[AI Agents]], part of the [[NVIDIA Agent Toolkit]]. It is a [[LangGraph]]-based state machine that connects to enterprise data, reasons using state-of-the-art models, and delivers citation-backed business insights. Built on top of [[NVIDIA NeMo Agent Toolkit]].

## Architecture

Three core components forming a pipeline:

1. **Orchestrator**: classifies intent, sets research depth (shallow vs deep), coordinates the research loop, writes final report
2. **Planner** (2-phase): a Scout subagent maps the information landscape, then an Architect subagent designs a research plan with outline, queries, and quality constraints
3. **Researcher**: dispatches 5 parallel specialist subagents (Evidence Gatherer, Mechanism Explorer, Comparator, Critic, Horizon Scanner)

Key design: each subagent works within its own context window and returns only synthesized output. The orchestrator never sees raw search results, preventing noisy data from degrading reasoning.

## Hybrid model approach

Frontier models handle orchestration while [[NVIDIA Nemotron]] handles research tasks. This cuts query costs by 50%+ compared to frontier-only approaches. Default uses Nemotron 3 Nano 30B for agents and GPT-OSS-120B for deep research orchestration. Nemotron 3 Super 120B available as a higher-quality option.

## Performance

Ranked #1 on both DeepResearch Bench I and II leaderboards. Fine-tuned on ~67k trajectories (filtered from ~80k) using principle-based filtering. Training: 1 epoch on 128 NVIDIA H100 GPUs.

## Deployment

CLI, web UI (FastAPI + Next.js), Docker Compose, Kubernetes (Helm charts), and Jupyter notebooks. No local GPU required when using NVIDIA API Catalog.
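
To make the pipeline from the Architecture section more concrete, here is a minimal sketch of the context-isolation idea in plain Python. It is not the AI-Q code and does not use LangGraph. The function names (`scout`, `architect`, `run_specialist`, `orchestrate`, `fake_search`) and the `Plan` type are hypothetical, the search tool is a stub, and intent classification and research depth are left out. Only the five specialist role names come from the note.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Specialist roles named in the note; everything else here is hypothetical.
SPECIALISTS = [
    "Evidence Gatherer",
    "Mechanism Explorer",
    "Comparator",
    "Critic",
    "Horizon Scanner",
]


@dataclass
class Plan:
    outline: list[str]
    queries: list[str]


def fake_search(query: str) -> list[str]:
    """Stand-in for a real search tool; returns noisy raw hits."""
    return [f"raw hit {i} for {query!r}" for i in range(3)]


def scout(question: str) -> list[str]:
    """Planner phase 1: map the information landscape (stubbed)."""
    return [f"background on {question}", f"recent work on {question}"]


def architect(question: str, landscape: list[str]) -> Plan:
    """Planner phase 2: turn the landscape into an outline and queries."""
    return Plan(outline=["Summary", "Findings"], queries=landscape)


def run_specialist(role: str, plan: Plan) -> str:
    """Run one subagent in its own scope and return only a synthesis."""
    raw = [hit for q in plan.queries for hit in fake_search(q)]
    # The raw hits stay local to this function; only the summary escapes.
    return f"{role}: synthesized {len(raw)} sources"


def orchestrate(question: str) -> str:
    """Plan, fan out to the specialists in parallel, then write the report."""
    plan = architect(question, scout(question))
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        # pool.map preserves the input order of SPECIALISTS.
        summaries = list(
            pool.map(lambda role: run_specialist(role, plan), SPECIALISTS)
        )
    return "\n".join([f"Report: {question}", *summaries])
```

Calling `orchestrate("solid-state batteries")` yields a title line followed by one summary line per specialist. The orchestrator only ever handles those summary strings, which mirrors the point that raw search results never reach its context.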
## References

- https://build.nvidia.com/nvidia/aiq
- https://github.com/NVIDIA-AI-Blueprints/aiq

## Related

- [[NVIDIA Agent Toolkit]]
- [[NVIDIA NeMo Agent Toolkit]]
- [[NVIDIA Nemotron]]
- [[AI Agents]]
- [[AI Agent Orchestration]]
- [[Retrieval-Augmented Generation (RAG)]]
- [[LangGraph]]
- [[LangChain]]
- [[Large Language Models (LLMs)]]