# LLM Monitoring

LLM monitoring is the observability layer for production language-model applications. It covers per-request traces, token usage, latency, cost, output quality signals, and drift detection.

It differs from traditional application performance monitoring (APM) because the unit of interest is the generated text, not just the HTTP response. Monitoring therefore also includes evaluation: factuality, refusal rate, hallucination flags, and user feedback.

OpenTelemetry-based tracing is becoming standard. Tools such as [[MLflow]], [[LangSmith]], [[Langfuse]], and [[Helicone]] occupy this space.

## References

## Related

- [[MLflow]]
- [[LangSmith]]
- [[Langfuse]]
- [[Helicone]]
- [[Edgee]]
- [[MLOps]]
- [[AI Observability]]
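
## Example

The per-request signals described in this note (trace, token usage, latency, cost, and a simple quality signal such as refusal rate) can be sketched with the standard library alone. This is a minimal illustration, not the API of any tool listed here: `Monitor`, `LLMTrace`, `count_tokens`, `looks_like_refusal`, and the price constants are all hypothetical, token counts are approximated by whitespace splitting, and refusals are detected with a naive prefix check.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical prices in USD per 1,000 tokens; real prices depend on provider and model.
PRICE_PER_1K_INPUT = 0.5
PRICE_PER_1K_OUTPUT = 1.5

# Naive refusal markers, for illustration only.
REFUSAL_PREFIXES = ("i cannot", "i can't", "i'm sorry")


@dataclass
class LLMTrace:
    """One record per model call: the text plus usage, latency, cost, and a quality flag."""
    prompt: str
    output: str
    input_tokens: int
    output_tokens: int
    latency_s: float
    cost_usd: float
    refused: bool


def count_tokens(text: str) -> int:
    # Crude approximation; a real system would use the model's own tokenizer.
    return len(text.split())


def looks_like_refusal(text: str) -> bool:
    # Flags outputs that begin with a common refusal phrase.
    return text.strip().lower().startswith(REFUSAL_PREFIXES)


@dataclass
class Monitor:
    traces: List[LLMTrace] = field(default_factory=list)

    def call(self, model: Callable[[str], str], prompt: str) -> str:
        """Invoke the model, time the call, and store a trace of the request."""
        start = time.perf_counter()
        output = model(prompt)
        latency = time.perf_counter() - start

        n_in, n_out = count_tokens(prompt), count_tokens(output)
        cost = n_in / 1000 * PRICE_PER_1K_INPUT + n_out / 1000 * PRICE_PER_1K_OUTPUT
        self.traces.append(
            LLMTrace(prompt, output, n_in, n_out, latency, cost, looks_like_refusal(output))
        )
        return output

    def refusal_rate(self) -> float:
        # Fraction of recorded calls whose output was flagged as a refusal.
        if not self.traces:
            return 0.0
        return sum(t.refused for t in self.traces) / len(self.traces)

    def total_cost(self) -> float:
        # Sum of estimated cost across all recorded calls.
        return sum(t.cost_usd for t in self.traces)


if __name__ == "__main__":
    def fake_model(prompt: str) -> str:
        # Stand-in for a real LLM client.
        return "I cannot help with that." if "secret" in prompt else "Paris is the capital."

    monitor = Monitor()
    monitor.call(fake_model, "What is the capital of France?")
    monitor.call(fake_model, "Tell me a secret password")
    print(monitor.refusal_rate(), monitor.total_cost())
```

Wrapping the model call in one place keeps the text and its usage metrics in the same record, which is what lets quality signals be aggregated alongside latency and cost. In practice the trace would be exported to a tracing backend rather than kept in a list.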