Running Azure SRE Agent for AKS and Drasi Operations
I have been spending time with Azure SRE Agent and wanted to see how far I could take it beyond the "click around the portal" experience.
The goal was simple: build a public, repeatable blueprint that deploys an Azure SRE Agent for AKS and Drasi operations with:
- infrastructure deployed through Azure Developer CLI
- custom SRE subagents
- skills and runbooks
- Azure Monitor response plans
- scheduled health checks
- MCP connectors for Microsoft Learn and Drasi docs
- fault-injection tests for AKS and Drasi failure modes
The result is an Azure SRE Agent with support for Drasi on AKS. It deploys with azd up, using PowerShell and a Bicep module written in the style of Azure Verified Modules (AVM).
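To make the deployment path concrete, here is a minimal sketch of the azd workflow. It assumes you run it from the root of a cloned copy of the blueprint (the folder containing azure.yaml). The environment name sre-agent-dev and the region placeholder are illustrative values, not names taken from the blueprint itself.

```shell
# Sign in to Azure for the Azure Developer CLI.
azd auth login

# Create a named azd environment (hypothetical name, pick your own).
azd env new sre-agent-dev

# Set the target region; replace <region> with one where the services you need are available.
azd env set AZURE_LOCATION <region>

# Provision the infrastructure and run any deployment hooks defined in azure.yaml.
azd up
```

When you are done experimenting, azd down removes the resources that the environment provisioned.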
