SYSTEM_CONSOLE v2.4.0

Security & Governance

LAST_UPDATED: 2025-10

Security and governance are not optional add-ons. They must be built into the platform from the start: data classified at ingestion, access enforced by IAM policy, and every sensitive field masked before it lands in the lake. Retrofitting any of this later is far harder, because unclassified and unmasked data will already have been copied into downstream tables.

Key Takeaways

  • Least-privilege access via group-based identities.
  • Mandatory data classification tagging.
  • Encryption at rest and in transit as a global standard.
  • Federated governance: standards set centrally, enforced locally.

Checklist

  • Access model defined and documented per data product.
  • Classification tags applied to all datasets and columns.
  • Audit logs enabled and retained according to policy.
  • Service accounts used for all pipeline operations.

Identity and access

Access is managed at the domain level, following the principle of least privilege. Group-based identities simplify management and mean permissions are tied to roles, not individuals. When someone changes teams, one group change handles the revocation.
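
As a minimal sketch of why group-based identities make revocation a single change, the example below models permissions as attached to groups only. The `Directory` class, the group names, and the permission strings are all hypothetical stand-ins for a real identity provider, not part of any platform API.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical in-memory stand-in for an identity provider."""

    # group name -> permissions granted to that group
    group_permissions: dict[str, set[str]] = field(default_factory=dict)
    # user name -> groups the user belongs to
    memberships: dict[str, set[str]] = field(default_factory=dict)

    def permissions_for(self, user: str) -> set[str]:
        """Union of the permissions of every group the user is in."""
        perms: set[str] = set()
        for group in self.memberships.get(user, set()):
            perms |= self.group_permissions.get(group, set())
        return perms

    def is_allowed(self, user: str, permission: str) -> bool:
        return permission in self.permissions_for(user)

    def move_user(self, user: str, old_group: str, new_group: str) -> None:
        """One membership change both revokes old access and grants new."""
        groups = self.memberships.setdefault(user, set())
        groups.discard(old_group)
        groups.add(new_group)


directory = Directory(
    group_permissions={
        "sales-analysts": {"sales.orders:read"},
        "finance-analysts": {"finance.margins:read"},
    },
    memberships={"alice": {"sales-analysts"}},
)

# Alice changes teams: no per-user permission cleanup is needed.
directory.move_user("alice", "sales-analysts", "finance-analysts")
```

Because no permission is ever attached to `alice` directly, nothing is left over to clean up after the move.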

Service Identities

Pipelines use dedicated service accounts, never personal credentials.

Row-Level Security

Restrict data access based on user attributes (e.g., country code).
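
The idea can be sketched as a filter keyed on a user attribute. This is illustrative only: in practice the rule would normally be enforced by the query engine rather than application code, and the `User` type, the `country_code` column, and the sample rows are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    country_code: str  # attribute the row filter is keyed on


def visible_rows(rows: list[dict], user: User) -> list[dict]:
    """Return only the rows whose country matches the user's attribute."""
    return [row for row in rows if row.get("country_code") == user.country_code]


# Hypothetical sample data.
orders = [
    {"order_id": 1, "country_code": "DE", "amount": 120},
    {"order_id": 2, "country_code": "FR", "amount": 80},
    {"order_id": 3, "country_code": "DE", "amount": 45},
]

german_view = visible_rows(orders, User(name="dana", country_code="DE"))
```

Rows without a `country_code` are hidden from everyone in this sketch, which keeps the filter fail-closed.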

Just-In-Time Access

Elevated permissions are granted only when needed for debugging.
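
One way to picture just-in-time access is as a grant that carries its own expiry and a mandatory justification. The sketch below assumes a one-hour default lifetime; the `Grant` type, the role name, and the `grant_temporary_access` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    user: str
    role: str
    reason: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        """A grant is usable only before its expiry time."""
        return now < self.expires_at


def grant_temporary_access(
    user: str,
    role: str,
    reason: str,
    now: datetime,
    ttl: timedelta = timedelta(hours=1),  # assumed default lifetime
) -> Grant:
    """Issue a time-boxed grant; a written justification is mandatory."""
    if not reason.strip():
        raise ValueError("a justification is required for elevated access")
    return Grant(user=user, role=role, reason=reason, expires_at=now + ttl)


issued_at = datetime(2025, 10, 1, 9, 0, tzinfo=timezone.utc)
grant = grant_temporary_access(
    "alice", "pipeline-debugger", "investigate failed nightly load", issued_at
)
```

The grant expires on its own, so nobody has to remember to revoke it once debugging is finished.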

Data classification

Level            | Definition                                            | Example
PII / Sensitive  | Identifiable personal info. Requires strict masking.  | Email, Home Address
Restricted       | Business-sensitive data. Needs 'need-to-know'.        | Profit Margins, Vendor Contracts
Public           | Safe for all employees to view.                       | Product Catalog, Store Locations
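
The levels above can be expressed as column tags and checked mechanically before a dataset is published. The sketch below assumes a simple rule: a table may be exposed as Public only if every column is tagged and every tag is Public. The enum, the handling labels, and the function names are hypothetical.

```python
from enum import Enum


class Classification(Enum):
    PII = "pii"
    RESTRICTED = "restricted"
    PUBLIC = "public"


# Assumed handling rule per level, mirroring the table above.
HANDLING = {
    Classification.PII: "mask",
    Classification.RESTRICTED: "need-to-know",
    Classification.PUBLIC: "open",
}


def unclassified_columns(
    schema: list[str], tags: dict[str, Classification]
) -> list[str]:
    """Columns that carry no classification tag at all."""
    return [column for column in schema if column not in tags]


def can_publish_as_public(
    schema: list[str], tags: dict[str, Classification]
) -> bool:
    """Allow publication only if every column is tagged and tagged PUBLIC."""
    if unclassified_columns(schema, tags):
        return False
    return all(tags[column] is Classification.PUBLIC for column in schema)


schema = ["sku", "store_location", "email"]
tags = {
    "sku": Classification.PUBLIC,
    "store_location": Classification.PUBLIC,
    # "email" was never tagged, so publication is blocked.
}
```

Treating an untagged column as a blocker, rather than assuming it is Public, is what makes the check fail closed.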

Sensitive data processing

When processing highly sensitive data, extra isolation is required. Three mechanisms apply here:

  • Tokenization: Replacing sensitive values with non-sensitive tokens before they land in the lake.
  • Confidential Computing: Processing data in encrypted memory enclaves (where available).
  • Audit Logging: Every access to sensitive data is logged and periodically reviewed.
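
As a rough illustration of tokenization before data lands in the lake, the sketch below replaces sensitive fields with keyed HMAC-SHA256 digests. This is a swapped-in, deterministic, non-reversible variant; a vault-based tokenization service that can map tokens back to values is a different design. The field names, the `tok_` prefix, and the demo key are assumptions, and a real key would come from a key management service rather than source code.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes) -> str:
    """Deterministic, non-reversible token: HMAC-SHA256 of the value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"tok_{digest[:32]}"


def tokenize_record(record: dict, sensitive_fields: set[str], key: bytes) -> dict:
    """Return a copy of the record with the sensitive fields tokenized."""
    return {
        name: tokenize(str(value), key)
        if name in sensitive_fields and value is not None
        else value
        for name, value in record.items()
    }


# Demo key only (assumption): never hard-code a real key.
DEMO_KEY = b"demo-key-not-for-production"

raw = {"customer_id": 42, "email": "ana@example.com", "country_code": "PT"}
safe = tokenize_record(raw, {"email"}, DEMO_KEY)
```

Because the same input always yields the same token under a given key, tokenized columns can still be joined and counted without exposing the underlying value.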

GCP mapping

  • IAM: Access control.
  • Cloud KMS: Encryption keys.
  • Cloud DLP: Data discovery and masking.
  • Dataplex: Governance and catalog.
  • VPC Service Controls: Network isolation.

Failure modes

  • "Everyone gets access": To avoid friction, teams grant broad permissions, and the over-exposure leads to data leaks.
  • No Audit Trail: A data breach occurs, and there is no record of who accessed the data or when.
  • PII Leak: Raw personal data accidentally lands in a 'Public' Gold table because it wasn't classified at source.
  • Key Loss: Encryption keys are managed poorly (for example, kept in a single region with no backup), so encrypted data becomes permanently unreadable after a region failover.