Incident Management for CSPs
Overview
Cloud service providers (CSPs) must maintain documented, tested incident management processes to detect, respond to, and recover from security incidents.
Learning Objectives
- Understand CSP incident response requirements
- Implement incident detection mechanisms
- Define escalation procedures
- Manage customer communication
- Apply ISO 27017 incident controls
ISO 27017 Incident Controls
A.16.1.1 - Responsibilities and Procedures
CSP Incident Response Framework:
┌─────────────────────────────────────┐
│ 1. Detection & Identification       │
│    - SIEM monitoring                │
│    - Automated alerts               │
│    - Customer reports               │
├─────────────────────────────────────┤
│ 2. Triage & Classification          │
│    - Severity assessment            │
│    - Impact analysis                │
│    - Assignment to team             │
├─────────────────────────────────────┤
│ 3. Containment                      │
│    - Isolate affected systems       │
│    - Prevent spread                 │
│    - Preserve evidence              │
├─────────────────────────────────────┤
│ 4. Eradication                      │
│    - Remove threat                  │
│    - Patch vulnerabilities          │
│    - Restore integrity              │
├─────────────────────────────────────┤
│ 5. Recovery                         │
│    - Restore services               │
│    - Verify functionality           │
│    - Monitor for recurrence         │
├─────────────────────────────────────┤
│ 6. Post-Incident Review             │
│    - Root cause analysis            │
│    - Lessons learned                │
│    - Process improvement            │
└─────────────────────────────────────┘
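The six phases above can be modeled as an ordered lifecycle so that tooling can record which phase an incident is in. The following is a minimal sketch, not a prescribed implementation: the `Incident` class and its `advance` method are hypothetical names, and the strictly forward-only progression is a simplifying assumption (real incidents often loop back from recovery to containment).

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six phases of the framework, in order."""
    DETECTION = 1
    TRIAGE = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    POST_INCIDENT_REVIEW = 6


class Incident:
    """Hypothetical incident record that moves forward one phase at a time."""

    def __init__(self, incident_id: str) -> None:
        self.incident_id = incident_id
        self.phase = Phase.DETECTION
        self.history = [Phase.DETECTION]  # audit trail of phases entered

    def advance(self) -> Phase:
        """Move to the next phase; raise if the final phase was already reached."""
        if self.phase is Phase.POST_INCIDENT_REVIEW:
            raise ValueError("incident has already reached post-incident review")
        self.phase = Phase(self.phase + 1)
        self.history.append(self.phase)
        return self.phase
```

Keeping the phase history on the record gives the post-incident review a simple timeline to start from.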
Incident Classification
Severity Levels
| Level | Criteria | Response Time | Customer Notification |
|---|---|---|---|
| P1 - Critical | Service down, data breach | 15 min | Immediate |
| P2 - High | Major degradation | 1 hour | 4 hours |
| P3 - Medium | Minor impact | 4 hours | 24 hours |
| P4 - Low | No customer impact | 24 hours | As needed |
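To make the table actionable, the targets can be encoded as data and turned into concrete deadlines. This is a sketch under stated assumptions: the clock is assumed to start at detection time (the table does not say), "Immediate" is modeled as a zero delay, "As needed" as no deadline, and the names `Sla`, `SLAS`, and `deadlines` are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional, Tuple


class Sla(NamedTuple):
    response: timedelta
    # None means no fixed deadline ("As needed" in the table).
    notify_customers: Optional[timedelta]


# Values copied from the severity table; "Immediate" is modeled as zero delay.
SLAS = {
    "P1": Sla(timedelta(minutes=15), timedelta(0)),
    "P2": Sla(timedelta(hours=1), timedelta(hours=4)),
    "P3": Sla(timedelta(hours=4), timedelta(hours=24)),
    "P4": Sla(timedelta(hours=24), None),
}


def deadlines(severity: str, detected_at: datetime) -> Tuple[datetime, Optional[datetime]]:
    """Return (respond_by, notify_by), assuming the clock starts at detection."""
    sla = SLAS[severity]
    respond_by = detected_at + sla.response
    notify_by = None
    if sla.notify_customers is not None:
        notify_by = detected_at + sla.notify_customers
    return respond_by, notify_by


detected = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
respond_by, notify_by = deadlines("P2", detected)
print(f"Respond by {respond_by.isoformat()}, notify customers by {notify_by.isoformat()}")
```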
Customer Communication
Notification Template
SECURITY INCIDENT NOTIFICATION
Incident ID: INC-2024-001
Severity: P1 - Critical
Status: Investigating
Date Detected: 2024-01-15 14:30 UTC
SUMMARY:
We are investigating a potential unauthorized access
attempt to infrastructure in the US-East region.
CUSTOMER IMPACT:
- Services remain operational
- No evidence of data access
- Investigation ongoing
ACTIONS TAKEN:
- Isolated affected systems
- Enhanced monitoring activated
- Security team engaged
NEXT UPDATE:
Within 2 hours or sooner if status changes
CONTACT:
[email protected]
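Generating the notification from structured fields helps keep every update in the same shape. The sketch below mirrors the section order of the template; the `Notification` dataclass and `render` function are hypothetical names, and the contact address is a placeholder, not a real mailbox.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Notification:
    incident_id: str
    severity: str
    status: str
    detected: str          # e.g. "2024-01-15 14:30 UTC"
    summary: str
    impact: List[str]
    actions: List[str]
    next_update: str
    contact: str


def _bullets(items: List[str]) -> str:
    """Format a list of strings as '- item' lines."""
    return "\n".join(f"- {item}" for item in items)


def render(n: Notification) -> str:
    """Render the notification in the same section order as the template."""
    return "\n".join([
        "SECURITY INCIDENT NOTIFICATION",
        f"Incident ID: {n.incident_id}",
        f"Severity: {n.severity}",
        f"Status: {n.status}",
        f"Date Detected: {n.detected}",
        "SUMMARY:",
        n.summary,
        "CUSTOMER IMPACT:",
        _bullets(n.impact),
        "ACTIONS TAKEN:",
        _bullets(n.actions),
        "NEXT UPDATE:",
        n.next_update,
        "CONTACT:",
        n.contact,
    ])


example = Notification(
    incident_id="INC-2024-001",
    severity="P1 - Critical",
    status="Investigating",
    detected="2024-01-15 14:30 UTC",
    summary="We are investigating a potential unauthorized access attempt.",
    impact=["Services remain operational", "No evidence of data access"],
    actions=["Isolated affected systems", "Security team engaged"],
    next_update="Within 2 hours or sooner if status changes",
    contact="security@example.com",  # placeholder address (assumption)
)
print(render(example))
```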
Incident Response Team Structure
    ┌───────────────────────────────────┐
    │        Incident Commander         │
    │      (Overall coordination)       │
    └─────────────────┬─────────────────┘
                      │
      ┌───────────────┼───────────────┐
      │               │               │
┌─────▼──────┐  ┌─────▼──────┐  ┌─────▼──────┐
│  Security  │  │ Operations │  │ Communic.  │
│    Team    │  │    Team    │  │    Team    │
└─────┬──────┘  └─────┬──────┘  └─────┬──────┘
      │               │               │
      └───────────────┼───────────────┘
                      │
           ┌──────────▼──────────┐
           │ Legal / Compliance  │
           └─────────────────────┘
Detection Mechanisms
Automated Monitoring
Security Events:
- Failed authentication attempts (threshold: 10/min)
- Privilege escalation attempts
- Unusual data access patterns
- Configuration changes
- Network anomalies
SIEM Integration:
Example alert rule (illustrative schema, not tied to a specific SIEM product):
{
  "rule": "Multiple failed logins",
  "condition": "failed_logins > 10 in 5 minutes",
  "action": "create_incident",
  "severity": "high",
  "notify": ["[email protected]"]
}
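The rule's condition ("more than 10 failed logins in 5 minutes") is a sliding-window count. A minimal sketch of that logic follows, under these assumptions: failures are counted per account (the rule does not say how events are grouped), timestamps are Unix seconds arriving in non-decreasing order, and `FailedLoginDetector` is a hypothetical name rather than part of any SIEM product.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

THRESHOLD = 10            # alert on MORE than 10 failures...
WINDOW_SECONDS = 5 * 60   # ...within a 5-minute sliding window


class FailedLoginDetector:
    """Sliding-window counter of failed logins, grouped per account (assumption)."""

    def __init__(self, threshold: int = THRESHOLD, window: int = WINDOW_SECONDS) -> None:
        self.threshold = threshold
        self.window = window
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record_failure(self, account: str, timestamp: float) -> Optional[dict]:
        """Record one failure; return an alert dict if the rule fires, else None."""
        events = self._events[account]
        events.append(timestamp)
        # Drop failures that have aged out of the window.
        while events and events[0] <= timestamp - self.window:
            events.popleft()
        if len(events) > self.threshold:
            return {
                "rule": "Multiple failed logins",
                "action": "create_incident",
                "severity": "high",
                "account": account,
                "count": len(events),
            }
        return None
```

A production rule would typically also suppress repeat alerts for the same account while an incident is already open; that is left out here to keep the window logic visible.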
Evidence Collection
A.16.1.7 - Collection of Evidence
Forensic Procedures:
- Preserve logs and system state
- Create forensic images (if applicable)
- Document timeline
- Chain of custody
- Legal hold procedures
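Preserving evidence and maintaining chain of custody usually starts with hashing each collected artifact and logging who collected it and when. The following sketch shows one way to do that with the Python standard library; the function names, the log fields, and the JSON-lines log format are assumptions for illustration, not a forensic standard.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: Path, incident_id: str, collected_by: str) -> dict:
    """Build one chain-of-custody record for a collected artifact."""
    return {
        "incident_id": incident_id,
        "file": str(path),
        "sha256": sha256_of(path),
        "size_bytes": path.stat().st_size,
        "collected_by": collected_by,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }


def append_to_log(entry: dict, log_path: Path) -> None:
    """Append the record as one JSON line; existing entries are never rewritten."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
```

Re-hashing an artifact later and comparing against the logged `sha256` value is a simple way to show it has not changed since collection.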
Post-Incident Activities
Root Cause Analysis
5 Whys Analysis Example
Incident: Unauthorized API access
Why 1: API key was compromised
Why 2: Key was committed to public GitHub repo
Why 3: Developer wasn't aware of best practices
Why 4: Security training was outdated
Why 5: Training program lacked cloud-specific content
ROOT CAUSE: Inadequate cloud security training
CORRECTIVE ACTIONS:
1. Update security training (immediate)
2. Implement secret scanning in CI/CD
3. Rotate all API keys
4. Conduct security awareness campaign
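Corrective action 2 (secret scanning in CI/CD) can be illustrated with a small pattern-based scanner. This is a deliberately simplified sketch: the three patterns and the `scan_text` function are assumptions chosen for illustration, and dedicated scanning tools use much larger, tuned rule sets.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumption); real scanners ship far more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?:api[_-]?key|secret|token)\w*\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]",
        re.IGNORECASE,
    ),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings
```

In a pipeline, a step like this would typically run against the changed files and fail the build when `scan_text` returns any findings, so a key is caught before it reaches a public repository.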
Key Takeaways
- Rapid detection and response are critical
- Clear severity classification guides response
- Customer communication must be timely
- Evidence collection supports investigation
- Post-incident review drives improvement
- Automation enhances detection capabilities
Self-Assessment
- What are the six phases of incident response?
- What is a P1 incident?
- When should customers be notified?
- What is the purpose of root cause analysis?
- What evidence should be collected?