# OpenSRE

OpenSRE is an open-source ([[Apache 2.0 License]]) framework from Tracer Cloud for building [[AI Agents|AI agents]] that investigate and respond to production incidents.

The problem it tackles: when something breaks in production, the evidence is scattered across logs, metrics, traces, runbooks, and Slack threads. OpenSRE wires those together so an agent can do the first pass of incident response (the site-reliability-engineering work) instead of a human paging through dashboards at 3am.

## What it does

- Fetches alert context automatically, then correlates logs, metrics, and traces across systems
- Reasons across signals to spot anomalies and do root-cause analysis with evidence linking
- Generates investigation reports and suggests remediation
- Integrates with 60+ tools across observability, cloud, and incident management; delivers via Slack and PagerDuty
- Ships with end-to-end tests and synthetic incident simulations for realistic failure scenarios

## Stack

Python with FastAPI; supports multiple LLM providers ([[Anthropic]], [[OpenAI]], [[Ollama]], Gemini). Currently in public alpha.

## References

- https://www.opensre.com
- https://www.opensre.com/docs
- https://www.opensre.com/docs/quickstart
- https://github.com/Tracer-Cloud/opensre

## Related

- [[AI Agents]]
- [[Agentic Engineering]]
- [[Large Language Models (LLMs)]]
- [[Model Context Protocol (MCP)]]
- [[Apache 2.0 License]]
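
## Illustrative sketch

To make "correlates logs, metrics, and traces" with "evidence linking" concrete, here is a minimal sketch of the general idea: collect every signal for the alerting service that falls within a time window around the alert, then list it as evidence. This is not OpenSRE's API or implementation. `Signal`, `correlate`, `build_report`, and the 5-minute window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Signal:
    """One piece of evidence. Hypothetical fields, not OpenSRE's schema."""
    kind: str            # "log", "metric", or "trace"
    service: str
    timestamp: datetime
    summary: str


def correlate(alert_time, service, signals, window=timedelta(minutes=5)):
    """Return signals for `service` within `window` of the alert, oldest first."""
    nearby = [
        s for s in signals
        if s.service == service and abs(s.timestamp - alert_time) <= window
    ]
    return sorted(nearby, key=lambda s: s.timestamp)


def build_report(alert_time, service, signals):
    """Render a plain-text evidence list, one line per correlated signal."""
    lines = [f"Alert on {service} at {alert_time.isoformat()}"]
    for s in correlate(alert_time, service, signals):
        # Offset in whole seconds relative to the alert (negative = before it).
        offset = int((s.timestamp - alert_time).total_seconds())
        lines.append(f"  [{s.kind}] {offset:+d}s {s.summary}")
    return "\n".join(lines)
```

A real agent would pull these signals from observability backends and hand the correlated set to an LLM for reasoning; the sketch only shows the time-window join that makes each conclusion traceable to specific evidence.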