Data Classification in the Cloud
Overview
Data classification is the foundation for protecting information in cloud environments: the classification assigned to a data set determines which security controls apply to it.
Learning Objectives
- Understand data classification principles
- Implement classification schemes
- Apply appropriate controls by classification
- Manage classified data in the cloud
- Apply ISO 27017 data classification guidance
ISO 27017 Control A.8.2.1
Classification of Information
Requirements:
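- Information shall be classified in terms of legal requirements, value, criticality, and sensitivity
- The classification reflects the information's importance and sensitivity to the organization
- Protection is applied in proportion to the classification
- Labeling and handling procedures are defined for each level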
Classification Levels
Standard Classification Scheme
| Level | Definition | Examples | Protection |
|---|---|---|---|
| Public | Can be freely disclosed | Marketing materials, public website | Minimal |
| Internal | Internal use only | Policies, internal memos | Basic access control |
| Confidential | Sensitive business info | Financial data, contracts | Encryption, restricted access |
| Restricted | Highly sensitive | PII, trade secrets, health data | Strong encryption, strict access, logging |
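The four levels form an ordered scale, which can be sketched in code as an ordered enumeration. This is a minimal illustration, not a prescribed implementation: the names `Classification`, `parse_level`, and `highest` are hypothetical, and handling a mixed data set at its most sensitive level is an assumption that many schemes adopt but the table itself does not state.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels from the table, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def parse_level(label: str) -> Classification:
    """Convert a label such as 'confidential' into a Classification.

    Raises KeyError for labels outside the four-level scheme.
    """
    return Classification[label.strip().upper()]


def highest(levels):
    """Return the most sensitive level in a non-empty collection.

    Assumption: a data set that mixes levels is handled at the highest one.
    Raises ValueError if the collection is empty.
    """
    return max(levels)
```

For example, `highest([parse_level("internal"), parse_level("restricted")])` evaluates to `Classification.RESTRICTED`.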
Alternative Schemes
By Regulatory Requirement:
- PHI (HIPAA Protected Health Information)
- PII (Personally Identifiable Information)
- PCI (cardholder data in scope for the Payment Card Industry Data Security Standard, PCI DSS)
- Export-Controlled
- Attorney-Client Privileged
By Business Impact:
- Critical (business-threatening if disclosed)
- High (significant business impact)
- Medium (moderate impact)
- Low (minimal impact)
Classification Process
Step-by-Step Approach
1. Inventory Data
├─ Identify all data sources
├─ Document data types
└─ Map data locations
2. Assign Classification
├─ Apply classification criteria
├─ Involve data owners
└─ Document classifications
3. Label Data
├─ Apply metadata tags
├─ Use naming conventions
└─ Implement visual labels (if applicable)
4. Implement Controls
├─ Apply encryption
├─ Configure access controls
├─ Enable logging
└─ Set retention policies
5. Monitor and Review
├─ Regular classification reviews
├─ Handle reclassification
└─ Audit compliance
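Steps 1, 2, 3, and 5 of the process can be sketched as a small inventory record plus a classification function. Everything here is illustrative: the `DataAsset` fields, the `TYPE_TO_LEVEL` mapping (derived from the examples in the Standard Classification Scheme table), the fallback to Internal for unknown data types, and the 365-day review interval are all assumptions. Step 4 (controls) is left out.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

LEVEL_ORDER = ["Public", "Internal", "Confidential", "Restricted"]

# Hypothetical mapping from data type to level, based on the table examples.
TYPE_TO_LEVEL = {
    "marketing": "Public",
    "policy": "Internal",
    "financial": "Confidential",
    "contract": "Confidential",
    "pii": "Restricted",
    "health": "Restricted",
}


@dataclass
class DataAsset:
    # Step 1: inventory (source, data types, location) plus the data owner
    name: str
    location: str
    data_types: list
    owner: str
    classification: str = ""
    tags: dict = field(default_factory=dict)
    next_review: Optional[date] = None


def classify(asset: DataAsset, today: date, review_days: int = 365) -> DataAsset:
    # Step 2: the most sensitive data type decides the level.
    # Unknown types fall back to Internal (an assumption, not a rule).
    levels = [TYPE_TO_LEVEL.get(t, "Internal") for t in asset.data_types]
    asset.classification = max(levels, key=LEVEL_ORDER.index, default="Internal")
    # Step 3: label the asset with metadata tags.
    asset.tags = {
        "DataClassification": asset.classification,
        "DataOwner": asset.owner,
    }
    # Step 5: schedule the next classification review.
    asset.next_review = today + timedelta(days=review_days)
    return asset
```

In practice the data owner would confirm or override the proposed level before it is recorded.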
Cloud-Specific Classification
Metadata Tagging
// Example: AWS S3 Object Tags
{
  "TagSet": [
    {
      "Key": "DataClassification",
      "Value": "Restricted"
    },
    {
      "Key": "DataOwner",
      "Value": "finance-team"
    },
    {
      "Key": "RetentionPeriod",
      "Value": "7years"
    },
    {
      "Key": "ComplianceRequirement",
      "Value": "PCI-DSS"
    }
  ]
}
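A tag set like this can be generated and validated in code before it is applied. The sketch below only builds the payload; `build_tag_set` and `ALLOWED_LEVELS` are hypothetical names, and rejecting unknown levels is an assumed policy.

```python
ALLOWED_LEVELS = {"Public", "Internal", "Confidential", "Restricted"}


def build_tag_set(classification, owner, retention, compliance=None):
    """Build an S3-style tagging payload for one object."""
    if classification not in ALLOWED_LEVELS:
        raise ValueError(f"unknown classification: {classification!r}")
    tags = {
        "DataClassification": classification,
        "DataOwner": owner,
        "RetentionPeriod": retention,
    }
    if compliance:
        # Optional tag, only added when a regulation applies
        tags["ComplianceRequirement"] = compliance
    return {"TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]}
```

The returned dictionary has the shape that the boto3 S3 client expects for the `Tagging` argument of `put_object_tagging`, so it could be passed along with a bucket name and object key.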
Database Classification
-- Example: Column-Level Classification
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    email       VARCHAR(255),  -- PII/Confidential
    phone       VARCHAR(20),   -- PII/Confidential
    ssn         VARCHAR(11),   -- PII/Restricted - Encrypt
    credit_card VARCHAR(19),   -- PCI/Restricted - Tokenize (card numbers can be up to 19 digits)
    address     TEXT,          -- PII/Confidential
    created_at  TIMESTAMP      -- Internal
);
-- Apply encryption/masking based on classification
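One way to apply masking based on these column labels is a lookup table consulted when rows are read. This is a rough sketch under several assumptions: the masking formats are illustrative, `customer_id` is treated as Internal (the table does not label it), and unlabeled columns are handled as Restricted. Names such as `COLUMN_LABELS` and `mask_row` are hypothetical.

```python
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]

# Labels copied from the column comments; customer_id is an assumption.
COLUMN_LABELS = {
    "customer_id": "Internal",
    "email": "Confidential",
    "phone": "Confidential",
    "ssn": "Restricted",
    "credit_card": "Restricted",
    "address": "Confidential",
    "created_at": "Internal",
}


def mask_value(value, level):
    """Mask a value: Restricted keeps the last 4 characters, others the first."""
    text = str(value)
    if level == "Restricted":
        visible = text[-4:] if len(text) > 4 else ""
        return "*" * (len(text) - len(visible)) + visible
    return text[:1] + "*" * max(len(text) - 1, 0)


def mask_row(row, viewer_level):
    """Mask every column labeled above the viewer's clearance level."""
    clearance = LEVELS.index(viewer_level)
    masked = {}
    for column, value in row.items():
        # Unlabeled columns get the strictest handling
        level = COLUMN_LABELS.get(column, "Restricted")
        if value is not None and LEVELS.index(level) > clearance:
            masked[column] = mask_value(value, level)
        else:
            masked[column] = value
    return masked
```

Masking at read time complements, but does not replace, the encryption and tokenization noted in the column comments.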
Controls by Classification
Protection Requirements Matrix
| Classification | Encryption | Access | Monitoring | Retention | Backup |
|---|---|---|---|---|---|
| Public | Optional | Open | Basic | Standard | Standard |
| Internal | In transit | Authenticated users | Standard | Standard | Standard |
| Confidential | At rest + transit | Authorized users | Enhanced | Extended | Encrypted |
| Restricted | Strong (at rest + transit + client-side) | Need-to-know, MFA | Comprehensive | Regulatory-driven | Encrypted, tested |
Encryption Requirements
Public: No specific requirement
Internal: TLS 1.2+ in transit
Confidential:
├─ AES-256 at rest
└─ TLS 1.2+ in transit
Restricted:
├─ AES-256 at rest (customer-managed keys)
├─ TLS 1.3 in transit
└─ Client-side encryption for highest sensitivity
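These requirements can be expressed as data and checked against a resource's configuration. The sketch below is one possible encoding: `REQUIREMENTS` and `encryption_gaps` are hypothetical names, TLS versions are modeled as tuples such as `(1, 2)`, and the client-side encryption rule for the most sensitive data is not checked.

```python
# Requirements per level, transcribed from the list above.
REQUIREMENTS = {
    "Public": {"min_tls": None, "at_rest": False, "customer_key": False},
    "Internal": {"min_tls": (1, 2), "at_rest": False, "customer_key": False},
    "Confidential": {"min_tls": (1, 2), "at_rest": True, "customer_key": False},
    "Restricted": {"min_tls": (1, 3), "at_rest": True, "customer_key": True},
}


def encryption_gaps(level, tls_version=None, at_rest_cipher=None, customer_key=False):
    """Return a list of unmet encryption requirements for the given level."""
    req = REQUIREMENTS[level]
    gaps = []
    min_tls = req["min_tls"]
    if min_tls is not None and (tls_version is None or tls_version < min_tls):
        wanted = ".".join(str(part) for part in min_tls)
        gaps.append(f"TLS {wanted} or later required in transit")
    if req["at_rest"] and at_rest_cipher != "AES-256":
        gaps.append("AES-256 required at rest")
    if req["customer_key"] and not customer_key:
        gaps.append("customer-managed key required")
    return gaps
```

An empty list means the configuration meets the level's encryption requirements as modeled here.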
Access Control by Classification
RBAC Implementation
Public Data
└─ All users (read)
Internal Data
├─ All employees (read)
└─ Data owner (write)
Confidential Data
├─ Department members (read)
├─ Department managers (read/write)
└─ Data owner (full control)
Restricted Data
├─ Named individuals only (read)
├─ Data owner (read/write)
├─ Requires MFA
├─ Requires justification/approval
└─ Time-limited access
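The read rules above can be sketched as a single decision function. This is an interpretation rather than a definitive model: it assumes the MFA, approval, and time-limit conditions apply to everyone reading Restricted data, including the data owner, and it covers read access only. The `Resource` and `Request` types are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Resource:
    classification: str  # Public, Internal, Confidential, or Restricted
    owner: str
    department: str = ""
    named_readers: frozenset = frozenset()


@dataclass
class Request:
    user: str
    is_employee: bool = False
    department: str = ""
    mfa_verified: bool = False
    approval_expires: Optional[datetime] = None  # set when access is approved


def can_read(req: Request, res: Resource, now: datetime) -> bool:
    if res.classification == "Public":
        return True
    if res.classification == "Internal":
        return req.is_employee
    if res.classification == "Confidential":
        in_scope = req.user == res.owner or req.department == res.department
        return req.is_employee and in_scope
    if res.classification == "Restricted":
        listed = req.user == res.owner or req.user in res.named_readers
        approved = req.approval_expires is not None and now < req.approval_expires
        return listed and req.mfa_verified and approved
    return False  # unknown labels are denied
```

Denying unknown labels by default keeps a mislabeled resource from becoming readable by accident.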
Data Loss Prevention (DLP)
DLP Policies by Classification
Policy: Prevent Restricted Data Sharing
Conditions:
├─ Data classification = Restricted
├─ Action = Email send OR File upload
└─ Destination = External
Actions:
├─ Block transmission
├─ Alert security team
├─ Log incident
└─ Notify user
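The policy above is a conjunction of three conditions followed by a fixed set of actions, which can be sketched as follows. The `Event` type and the string identifiers for actions and destinations are hypothetical; a real DLP product would define its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    classification: str  # e.g. "Restricted"
    action: str          # e.g. "email_send" or "file_upload"
    destination: str     # "internal" or "external"


COVERED_ACTIONS = {"email_send", "file_upload"}


def evaluate(event: Event) -> list:
    """Return the policy actions to take, or an empty list if no match."""
    matches = (
        event.classification == "Restricted"
        and event.action in COVERED_ACTIONS
        and event.destination == "external"
    )
    if not matches:
        return []
    return ["block_transmission", "alert_security_team", "log_incident", "notify_user"]
```

An empty result means this particular policy does not apply; other policies might still match the event.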
Cloud DLP Implementation
SaaS (Microsoft 365, Google Workspace):
- Built-in DLP policies
- Pattern matching (SSN, credit cards)
- Custom sensitive info types
- Automatic classification
IaaS/PaaS:
- CASB for DLP
- Network DLP appliances
- Application-level DLP
- API-based scanning
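Pattern matching for SSNs and card numbers is commonly done with regular expressions, with a Luhn checksum to cut false positives on card numbers. The sketch below illustrates that general approach, not any vendor's detector. The patterns are deliberately simple (US-style SSNs with hyphens, 13 to 19 digit card numbers with optional spaces or hyphens), and the function names are hypothetical.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, each optionally followed by a space or hyphen
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Check the Luhn checksum used by payment card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_sensitive(text: str) -> list:
    """Return (kind, match) pairs for SSN-like and card-like strings."""
    findings = [("SSN", m.group()) for m in SSN_PATTERN.finditer(text)]
    for m in CARD_PATTERN.finditer(text):
        if luhn_valid(m.group()):
            findings.append(("CARD", m.group()))
    return findings
```

Regex-based detection still produces false positives and misses unusual formats, which is one reason the built-in services combine it with custom sensitive information types and automatic classification.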
Handling and Storage
A.8.2.3 - Handling of Assets
By Classification Level:
Restricted Data Handling Rules:
├─ Storage
│ ├─ Encrypted volumes only
│ ├─ Customer-managed keys
│ ├─ Specific geographic regions
│ └─ No personal devices
│
├─ Transmission
│ ├─ Encrypted channels (TLS 1.3)
│ ├─ Approved services only
│ ├─ No public email
│ └─ Logged transfers
│
├─ Processing
│ ├─ Isolated environments
│ ├─ Approved applications
│ ├─ Audit logging enabled
│ └─ Screen privacy filters
│
└─ Disposal
    ├─ Secure deletion (multi-pass overwrite, or cryptographic erasure where the provider controls the media)
    ├─ Verify deletion
    ├─ Certificate of destruction
    └─ Retention period compliance
Practical Implementation
Classification Workflow in Cloud
Document Upload to Cloud Storage
1. User uploads file
↓
2. Automatic scanning
├─ Content inspection
├─ Pattern matching (SSN, CC#)
└─ ML-based classification
↓
3. Classification assigned
├─ User confirmation (if needed)
└─ Metadata tagging
↓
4. Controls applied
├─ Encryption (based on classification)
├─ Access control (based on classification)
├─ DLP policies (based on classification)
└─ Logging (based on classification)
↓
5. Monitoring
├─ Access monitoring
├─ Sharing monitoring
└─ Usage analytics
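Steps 2 through 4 of this workflow can be sketched as a function that scans content, assigns a level, and looks up the controls to apply. This is a simplified illustration: the scanner uses two basic patterns in place of content inspection and ML-based classification, the `CONTROLS` values are taken from the Protection Requirements Matrix, and falling back to Internal with a user confirmation when nothing is detected is an assumption.

```python
import re

# Controls per level, taken from the Protection Requirements Matrix.
CONTROLS = {
    "Public": {"encryption": "Optional", "access": "Open", "monitoring": "Basic"},
    "Internal": {"encryption": "In transit", "access": "Authenticated users",
                 "monitoring": "Standard"},
    "Confidential": {"encryption": "At rest + transit", "access": "Authorized users",
                     "monitoring": "Enhanced"},
    "Restricted": {"encryption": "Strong (at rest + transit + client-side)",
                   "access": "Need-to-know, MFA", "monitoring": "Comprehensive"},
}

DETECTORS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def scan(content: str) -> list:
    """Step 2: return the names of the detectors that matched."""
    return sorted(name for name, pattern in DETECTORS.items() if pattern.search(content))


def process_upload(filename: str, content: str, default_level: str = "Internal") -> dict:
    findings = scan(content)
    # Step 3: any SSN or card number makes the file Restricted
    level = "Restricted" if findings else default_level
    return {
        "file": filename,
        "findings": findings,
        "tags": {"DataClassification": level},
        # Ask the uploader to confirm when the scan found nothing (assumption)
        "needs_user_confirmation": not findings,
        # Step 4: controls follow from the assigned level
        "controls": CONTROLS[level],
    }
```

Step 5 (monitoring) would consume the returned tags and controls; it is outside this sketch.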
Example: Cloud Storage Bucket Policy
// AWS S3 Bucket Policy - Restricted Data
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::restricted-data-bucket",
        "arn:aws:s3:::restricted-data-bucket/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::restricted-data-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}
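When several buckets hold Restricted data, a policy of this kind can be generated per bucket instead of being copied by hand. The sketch below is a minimal example with a hypothetical function name. It lists the bucket ARN alongside the object ARN in the transport statement so that bucket-level actions such as listing are covered too, which is a design choice of this sketch.

```python
import json


def restricted_bucket_policy(bucket: str) -> str:
    """Return a bucket policy JSON string for a bucket holding Restricted data."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/*"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Deny every S3 action that does not use TLS
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, object_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Deny uploads that do not request SSE-KMS encryption
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": object_arn,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
        ],
    }
    return json.dumps(policy, indent=2)
```

The returned string is in the form that the boto3 S3 client's `put_bucket_policy` takes as its `Policy` argument.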
Key Takeaways
- Classification drives appropriate protection
- Cloud platforms enable automated classification
- Metadata tagging is essential
- Controls must match classification
- Regular review maintains accuracy
- DLP enforces classification policies
Self-Assessment
- What are the four standard classification levels?
- Why is data classification important?
- How can data be classified in cloud storage?
- What controls apply to Restricted data?
- What is DLP and how does it relate to classification?