Incident Response Playbook

intermediate · ops · Min 32K context

Generates incident response procedures for production systems and assists with root cause analysis. Creates runbooks for common failure modes, builds communication templates for stakeholders, and produces post-incident review documents. Supports SRE and on-call workflows.

Use Cases

  • Creating incident runbooks for common failure scenarios
  • Generating postmortem documents from incident timelines
  • Building communication templates for status page updates
  • Root cause analysis from error logs and metrics
  • Designing on-call escalation procedures and rotation policies

Example Prompt

Generate an incident response runbook for the following scenario.

System: E-commerce platform
Incident type: Database connection pool exhaustion

Symptoms:
- API latency spikes above 5s
- 503 errors increasing on checkout endpoints
- PostgreSQL connection count at max (200/200)
- Worker threads waiting on DB connections

Please create:
1. Severity classification criteria
2. First responder checklist (first 5 minutes)
3. Diagnosis steps with specific commands/queries
4. Mitigation options (quick fixes vs proper resolution)
5. Rollback procedures if applicable
6. Communication templates (internal + external)
7. Prevention measures and long-term fixes
8. Post-incident review template
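
The example prompt above has a fixed layout (scenario fields, a symptom list, and eight requested deliverables), so it can be filled in from structured data when the same skill is reused for other incidents. The following is a minimal sketch in Python, not part of the skill itself: the `IncidentScenario` class and the `build_runbook_prompt` function are hypothetical names introduced for illustration, and the deliverables list is copied from the prompt above.

```python
from dataclasses import dataclass, field

# Deliverables copied verbatim from the example prompt.
DELIVERABLES = [
    "Severity classification criteria",
    "First responder checklist (first 5 minutes)",
    "Diagnosis steps with specific commands/queries",
    "Mitigation options (quick fixes vs proper resolution)",
    "Rollback procedures if applicable",
    "Communication templates (internal + external)",
    "Prevention measures and long-term fixes",
    "Post-incident review template",
]


@dataclass
class IncidentScenario:
    """Hypothetical container for the fields the prompt needs."""
    system: str
    incident_type: str
    symptoms: list[str] = field(default_factory=list)


def build_runbook_prompt(scenario: IncidentScenario) -> str:
    """Render a scenario in the same layout as the example prompt."""
    lines = [
        "Generate an incident response runbook for the following scenario.",
        "",
        f"System: {scenario.system}",
        f"Incident type: {scenario.incident_type}",
        "",
        "Symptoms:",
    ]
    lines += [f"- {symptom}" for symptom in scenario.symptoms]
    lines += ["", "Please create:"]
    lines += [f"{i}. {item}" for i, item in enumerate(DELIVERABLES, start=1)]
    return "\n".join(lines)


# The scenario from the example prompt, expressed as data.
scenario = IncidentScenario(
    system="E-commerce platform",
    incident_type="Database connection pool exhaustion",
    symptoms=[
        "API latency spikes above 5s",
        "503 errors increasing on checkout endpoints",
        "PostgreSQL connection count at max (200/200)",
        "Worker threads waiting on DB connections",
    ],
)
prompt = build_runbook_prompt(scenario)
print(prompt)
```

Swapping in a different system, incident type, and symptom list yields a prompt with the same eight deliverables, which keeps generated runbooks comparable across failure modes.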

Compatible Tools

claude-code · cursor · kiro · any

Modalities

Input: text, code
Output: text

Author

OpenModels Community

@openmodelsrun