# LLM Monitoring
LLM monitoring covers the observability layer for production language-model applications: per-request traces, token usage, latency, cost, output quality signals, and drift detection. It differs from traditional APM because the unit of interest is the generated text, not just the HTTP response, so monitoring also includes evaluation (factuality, refusal rate, hallucination flags, user feedback).
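As a rough illustration of the per-request signals listed above, the sketch below records latency, token usage, cost, and a refusal flag for each call, then aggregates a refusal rate. It uses only the Python standard library. The names `LLMTrace`, `traced_call`, `refusal_rate`, the model name `example-model`, and the prices in `PRICE_PER_1K` are all hypothetical and do not come from any of the tools named in this note.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass

# Hypothetical per-1K-token prices in USD; real prices depend on provider and model.
PRICE_PER_1K = {"example-model": {"input": 0.001, "output": 0.002}}


@dataclass
class LLMTrace:
    """One record per model request (an assumed schema, not a standard one)."""
    model: str
    prompt: str
    output: str = ""
    input_tokens: int = 0
    output_tokens: int = 0
    latency_s: float = 0.0
    refused: bool = False  # set by the caller or by a separate evaluator

    @property
    def cost_usd(self) -> float:
        price = PRICE_PER_1K[self.model]
        return (self.input_tokens * price["input"]
                + self.output_tokens * price["output"]) / 1000


@contextmanager
def traced_call(model: str, prompt: str, sink: list):
    """Time the wrapped block and append the finished trace to `sink`."""
    trace = LLMTrace(model=model, prompt=prompt)
    start = time.perf_counter()
    try:
        yield trace
    finally:
        trace.latency_s = time.perf_counter() - start
        sink.append(trace)


def refusal_rate(traces: list) -> float:
    """Fraction of traces flagged as refusals; 0.0 for an empty list."""
    return sum(t.refused for t in traces) / len(traces) if traces else 0.0


if __name__ == "__main__":
    traces = []
    with traced_call("example-model", "What is 2 + 2?", traces) as t:
        t.output = "4"  # stand-in for a real model call
        t.input_tokens, t.output_tokens = 8, 1
    print(traces[0].latency_s, traces[0].cost_usd, refusal_rate(traces))
```

In a real deployment the `sink` list would typically be replaced by an exporter to a tracing backend, and the `refused` flag would come from an evaluation step rather than being set by hand.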
OpenTelemetry-based tracing is becoming the standard way to capture these traces. Tools like [[MLflow]], [[LangSmith]], [[Langfuse]], and [[Helicone]] occupy this space.
## Related
- [[MLflow]]
- [[LangSmith]]
- [[Langfuse]]
- [[Helicone]]
- [[Edgee]]
- [[MLOps]]
- [[AI Observability]]