Module 2: Cloud Service Provider Controls

Incident Management for CSPs

Overview

Cloud service providers (CSPs) need documented, tested incident management processes to detect, respond to, and recover from security incidents, and to keep affected customers informed throughout.

Learning Objectives

  • Understand CSP incident response requirements
  • Implement incident detection mechanisms
  • Define escalation procedures
  • Manage customer communication
  • Apply ISO 27017 incident controls

ISO 27017 Incident Controls

A.16.1.1 - Responsibilities and Procedures

CSP Incident Response Framework:

┌─────────────────────────────────────┐
│    1. Detection & Identification    │
│    - SIEM monitoring                │
│    - Automated alerts               │
│    - Customer reports               │
├─────────────────────────────────────┤
│    2. Triage & Classification       │
│    - Severity assessment            │
│    - Impact analysis                │
│    - Assignment to team             │
├─────────────────────────────────────┤
│    3. Containment                   │
│    - Isolate affected systems       │
│    - Prevent spread                 │
│    - Preserve evidence              │
├─────────────────────────────────────┤
│    4. Eradication                   │
│    - Remove threat                  │
│    - Patch vulnerabilities          │
│    - Restore integrity              │
├─────────────────────────────────────┤
│    5. Recovery                      │
│    - Restore services               │
│    - Verify functionality           │
│    - Monitor for recurrence         │
├─────────────────────────────────────┤
│    6. Post-Incident Review          │
│    - Root cause analysis            │
│    - Lessons learned                │
│    - Process improvement            │
└─────────────────────────────────────┘
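
The six phases above run in a fixed order, which can be sketched as a small state machine. This is a minimal illustration only: the `Phase` and `Incident` names are hypothetical, and real incidents sometimes loop back (for example from recovery to containment), which this sketch does not model.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six phases from the framework above, in order."""
    DETECTION = 1
    TRIAGE = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    POST_INCIDENT_REVIEW = 6


class Incident:
    """Hypothetical incident record that moves forward one phase at a time."""

    def __init__(self, incident_id: str) -> None:
        self.incident_id = incident_id
        self.phase = Phase.DETECTION
        self.history = [Phase.DETECTION]  # audit trail of phases entered

    def advance(self) -> Phase:
        # Refuse to move past the final phase.
        if self.phase is Phase.POST_INCIDENT_REVIEW:
            raise ValueError("incident is already in post-incident review")
        self.phase = Phase(self.phase + 1)
        self.history.append(self.phase)
        return self.phase
```

Keeping the phase history on the record gives the post-incident review a simple timeline to start from.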

Incident Classification

Severity Levels

Level          Criteria                    Response Time   Customer Notification
P1 - Critical  Service down, data breach   15 min          Immediate
P2 - High      Major degradation           1 hour          4 hours
P3 - Medium    Minor impact                4 hours         24 hours
P4 - Low       No customer impact          24 hours        As needed
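
As a rough sketch, the table can be turned into deadlines computed from the detection time. The `SLA` mapping and `deadlines` function are hypothetical names; treating "Immediate" as a zero delay and "As needed" as no fixed deadline is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# (response target, notification target) per severity, from the table above.
# Assumption: timedelta(0) stands for "Immediate", None stands for "As needed".
SLA = {
    "P1": (timedelta(minutes=15), timedelta(0)),
    "P2": (timedelta(hours=1), timedelta(hours=4)),
    "P3": (timedelta(hours=4), timedelta(hours=24)),
    "P4": (timedelta(hours=24), None),
}


def deadlines(severity: str, detected_at: datetime) -> tuple[datetime, Optional[datetime]]:
    """Return (respond_by, notify_by); notify_by is None when there is no fixed deadline."""
    response, notify = SLA[severity]
    respond_by = detected_at + response
    notify_by = None if notify is None else detected_at + notify
    return respond_by, notify_by


detected = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
respond_by, notify_by = deadlines("P2", detected)
```

For the P2 example, the response deadline falls one hour after detection and the customer notification deadline four hours after detection.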

Customer Communication

Notification Template

SECURITY INCIDENT NOTIFICATION

Incident ID: INC-2024-001
Severity: P1 - Critical
Status: Investigating
Date Detected: 2024-01-15 14:30 UTC

SUMMARY:
We are investigating a potential unauthorized access
attempt to infrastructure in the US-East region.

CUSTOMER IMPACT:
- Services remain operational
- No evidence of data access
- Investigation ongoing

ACTIONS TAKEN:
- Isolated affected systems
- Enhanced monitoring activated
- Security team engaged

NEXT UPDATE:
Within 2 hours or sooner if status changes

CONTACT:
[email protected]

Incident Response Team Structure

┌──────────────────────────────────┐
│   Incident Commander             │
│   (Overall coordination)         │
└────────┬─────────────────────────┘
         │
    ┌────┴────────────┬──────────┐
    │                 │          │
┌───▼────┐  ┌────────▼───┐  ┌───▼────────┐
│Security│  │ Operations │  │ Communic.  │
│ Team   │  │   Team     │  │   Team     │
└───┬────┘  └────────┬───┘  └───┬────────┘
    │                │          │
    └────────┬───────┴──────────┘
             │
    ┌────────▼────────────┐
    │  Legal / Compliance │
    └─────────────────────┘

Detection Mechanisms

Automated Monitoring

Security Events:

  • Failed authentication attempts (threshold: 10/min)
  • Privilege escalation attempts
  • Unusual data access patterns
  • Configuration changes
  • Network anomalies

SIEM Integration:

// Example alert rule
{
  "rule": "Multiple failed logins",
  "condition": "failed_logins > 10 in 5 minutes",
  "action": "create_incident",
  "severity": "high",
  "notify": ["[email protected]"]
}
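
To show what a condition like the one in the rule above involves, here is a minimal sliding-window counter. It is a sketch, not how any particular SIEM evaluates rules: the `FailedLoginDetector` class is hypothetical, counting per account is an assumption (the rule does not say how events are grouped), and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60  # "in 5 minutes" from the example rule
THRESHOLD = 10           # "failed_logins > 10" from the example rule


class FailedLoginDetector:
    """Counts failed logins per account inside a sliding time window."""

    def __init__(self, window: float = WINDOW_SECONDS, threshold: int = THRESHOLD) -> None:
        self.window = window
        self.threshold = threshold
        self._events = defaultdict(deque)  # account -> timestamps of recent failures

    def record_failure(self, account: str, timestamp: float) -> bool:
        """Record one failure; return True when the rule condition is met."""
        events = self._events[account]
        events.append(timestamp)
        # Drop failures that have aged out of the window.
        while events and events[0] <= timestamp - self.window:
            events.popleft()
        return len(events) > self.threshold
```

A caller would create an incident with severity "high" whenever `record_failure` returns True, mirroring the rule's `action` and `severity` fields.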

Evidence Collection

A.16.1.7 - Collection of Evidence

Forensic Procedures:

  1. Preserve logs and system state
  2. Create forensic images (if applicable)
  3. Document timeline
  4. Maintain chain of custody
  5. Apply legal hold procedures
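
One common way to support steps 1 and 4 is to hash each evidence file at collection time and record who handled it, so later tampering can be detected. The sketch below assumes SHA-256 and an illustrative record layout; the function and field names are hypothetical and not prescribed by ISO 27017.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large logs or disk images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: Path, handler: str, action: str) -> dict:
    """Build one chain-of-custody record (field names are illustrative)."""
    return {
        "file": str(path),
        "sha256": sha256_of(path),
        "handler": handler,
        "action": action,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def is_unaltered(path: Path, entry: dict) -> bool:
    """Re-hash the file and compare against the digest recorded at collection."""
    return sha256_of(path) == entry["sha256"]
```

Custody records like these are only useful if they are stored separately from the evidence itself, ideally in an append-only location.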

Post-Incident Activities

Root Cause Analysis

5 Whys Analysis Example

Incident: Unauthorized API access

Why 1: API key was compromised
Why 2: Key was committed to public GitHub repo
Why 3: Developer wasn't aware of best practices
Why 4: Security training was outdated
Why 5: Training program lacked cloud-specific content

ROOT CAUSE: Inadequate cloud security training

CORRECTIVE ACTIONS:
1. Update security training (immediate)
2. Implement secret scanning in CI/CD
3. Rotate all API keys
4. Conduct security awareness campaign
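
Corrective action 2 can be illustrated with a very small secret scanner. This is a sketch under stated assumptions: the two patterns are illustrative (the first follows the widely documented `AKIA` prefix format of AWS access key IDs, the second is a generic heuristic), and `scan_text` is a hypothetical helper. Dedicated scanning tools ship far larger rule sets and should be preferred in practice.

```python
import re

# Illustrative patterns only; not a complete rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?:api[_-]?key|secret|token)\s*[:=]\s*['\"][A-Za-z0-9_\-]{20,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that looks like a secret."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

In a CI/CD job, a wrapper could run `scan_text` over each changed file and fail the build when the returned list is non-empty, which blocks the commit before the key reaches a public repository.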

Key Takeaways

  1. Rapid detection and response are critical
  2. Clear severity classification guides response
  3. Customer communication must be timely
  4. Evidence collection supports investigation
  5. Post-incident review drives improvement
  6. Automation enhances detection capabilities

Self-Assessment

  1. What are the six phases of incident response?
  2. What is a P1 incident?
  3. When should customers be notified?
  4. What is the purpose of root cause analysis?
  5. What evidence should be collected?
