RFC-002

Global Multi-Cloud Governance

A Reference Architecture for Azure, AWS, and GCP

A reference architecture for operating globally distributed infrastructure across Azure, AWS, and GCP — with compliance built into the platform layer, not bolted on after the fact.


Motivation

Organizations at global scale face compounding challenges: regulatory fragmentation across jurisdictions, provider concentration risk, M&A integration pressure, and best-of-breed requirements that no single provider satisfies. The cost of not standardizing is inconsistent security posture, audit fatigue, and ungoverned shadow IT.

This RFC proposes a unified governance model built on seven principles:

#  | Principle                                | Rationale
P1 | Primary + secondary model                | One cloud is the default; others justified by workload
P2 | Policy as code                           | Governance rules are version-controlled and auto-enforced
P3 | Identity is the perimeter                | Zero-trust, identity-centric security across all providers
P4 | Data sovereignty by design               | Residency constraints encoded in the platform
P5 | Automate everything                      | No manual provisioning, no ClickOps
P6 | Least privilege, just-in-time            | No standing privileged access
P7 | Centralize observe, decentralize execute | Central visibility; federated workload ownership

Organizational Structure

Each cloud has its own hierarchy model, but the concept is identical: platform resources separated from landing zones, with sandbox and quarantine boundaries. The separation matters because platform teams and workload teams operate at different cadences — platform changes are slow, deliberate, and high-blast-radius. Landing zone changes are fast and scoped.

Azure — Management Group Hierarchy

  • Platform — Identity (Entra ID), Management (Sentinel, Log Analytics), Connectivity (Hub vNets, ExpressRoute)
  • Landing Zones — Corp, Online, Regulated (PCI/HIPAA/FedRAMP), Confidential (sovereign)
  • Sandbox — Experimentation, no prod connectivity
  • Quarantine — Non-compliant subscriptions are automatically moved here

AWS — Organization OU Structure

  • Security OU — Log Archive, Security Tooling (GuardDuty, Security Hub), Audit
  • Infrastructure OU — Network Hub (Transit Gateway), Shared Services
  • Workloads OU — Corp / Online / Regulated with Prod, Staging, Dev per OU
  • Sandbox / Quarantine — Deny-all SCP

GCP — Resource Hierarchy

  • Platform — Networking (Shared VPC), Logging (centralized sink), Security (SCC, KMS)
  • Landing Zones — Corp, Analytics (BigQuery, Vertex AI), Regulated (Assured Workloads)
  • Sandbox / Quarantine — Deny-all org policy

Cross-Cloud Mapping

Despite the naming differences, every concept maps 1:1 across providers. This is what makes a unified governance model possible.

Concept             | Azure            | AWS                 | GCP
Top-level container | Management Group | Organizational Unit | Folder
Billing boundary    | Subscription     | Account             | Project
Policy engine       | Azure Policy     | SCPs                | Org Policies
Identity            | Entra ID         | IAM Identity Center | Cloud Identity
Network hub         | Hub vNet         | Transit Gateway     | Shared VPC

Use subscriptions/accounts/projects as the unit of scale. One per workload per environment.
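
As a minimal sketch of what "one per workload per environment" looks like in practice, the example below lays out hypothetical billing boundaries for a single workload; the names are illustrative, not a naming standard defined by this RFC.

# Hypothetical example: one billing boundary per workload per environment
workload: payment-service
billing_boundaries:
  prod:
    azure: sub-payment-service-prod      # Azure subscription
    aws: payment-service-prod            # AWS account
    gcp: payment-service-prod            # GCP project
  staging:
    azure: sub-payment-service-staging
    aws: payment-service-staging
    gcp: payment-service-staging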


Identity & Access

Identity is the most critical layer. Get it wrong and every other control is compromised. The approach here is a single Identity Provider (Entra ID) federated into all three clouds — authentication is centralized, authorization is per-cloud. This avoids the alternative of managing three separate identity systems with inevitable configuration drift.

Layer              | Mechanism
Authentication     | Entra ID with phishing-resistant MFA (FIDO2 / passkeys)
Authorization      | RBAC via group-to-role mappings, per cloud, per scope
Privileged access  | PIM (Azure) / temporary elevated access (AWS, GCP)
Service-to-service | Workload Identity Federation — no long-lived keys
Break-glass        | Sealed emergency accounts, hardware tokens in safe

Long-lived service account keys are prohibited. Workload Identity Federation for all service-to-service and CI/CD authentication eliminates the largest class of credential exposure.
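
As a concrete sketch of key-less CI/CD authentication, the snippet below shows a GitHub Actions job assuming an AWS role via OIDC federation. GitHub Actions, the role ARN, and the region are illustrative assumptions; the same federation pattern applies to Azure and GCP.

# Illustrative GitHub Actions job: OIDC federation into AWS, no stored access keys.
# The role ARN and region are placeholders, not values defined by this RFC.
name: deploy
on: [push]

permissions:
  id-token: write   # lets the job request a short-lived OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Authenticate to AWS via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy   # hypothetical role
          aws-region: us-east-1
      - name: Deploy
        run: terraform apply -auto-approve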


Networking

VPN overlays between clouds are fragile, bandwidth-limited, and hard to troubleshoot at scale. Cross-cloud connectivity instead runs through a colocation fabric (Megaport or Equinix) — dedicated, high-throughput, and provider-neutral. Each cloud has a hub network that peers to landing zone spokes.

IP space is pre-allocated across all four domains to prevent overlap, which is the single most painful networking problem to fix retroactively:

Provider | CIDR          | Range
Azure    | 10.0.0.0/10   | 10.0 – 10.63
AWS      | 10.64.0.0/10  | 10.64 – 10.127
GCP      | 10.128.0.0/10 | 10.128 – 10.191
On-prem  | 10.192.0.0/10 | 10.192 – 10.255

Within each /10: /14 per region, /16 per environment, /20 per landing zone.
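
To make the carving concrete, the sketch below walks the Azure /10 down to individual landing zones; the region, environment, and zone names are illustrative only.

# Illustrative carving of the Azure 10.0.0.0/10 block (names are examples)
azure:
  cidr: 10.0.0.0/10            # 10.0.0.0 – 10.63.255.255
  regions:                     # one /14 per region (16 possible)
    westeurope: 10.0.0.0/14    # 10.0.0.0 – 10.3.255.255
    eastus:     10.4.0.0/14    # 10.4.0.0 – 10.7.255.255
  environments:                # one /16 per environment, within the westeurope /14
    prod:    10.0.0.0/16
    staging: 10.1.0.0/16
    dev:     10.2.0.0/16
  landing_zones:               # one /20 per landing zone, within the prod /16
    corp:      10.0.0.0/20     # 4,096 addresses each
    online:    10.0.16.0/20
    regulated: 10.0.32.0/20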

Sovereign regions (China, GovCloud) are isolated by design — no cross-cloud mesh. Fully separate identity and networking.


Compliance

Running workloads across three clouds means satisfying overlapping regulatory frameworks simultaneously. The key insight is that compliance at this scale cannot rely on manual review — it has to be enforced through preventive controls (deny non-compliant actions before they happen) and detective controls (alert on drift after the fact).

Framework     | Scope
SOC 2 Type II | All production workloads
ISO 27001     | Entire organization
PCI DSS v4.0  | Payment processing (regulated zones)
HIPAA         | Healthcare data (regulated zones)
FedRAMP High  | US government (gov regions)
GDPR          | EU personal data (residency enforced)

Preventive Controls

Each cloud has its own policy engine, but the rules are equivalent. The same intent — "deny unapproved regions" — is expressed differently in each provider:

Control                    | Azure             | AWS                           | GCP
Deny unapproved regions    | allowedLocations  | aws:RequestedRegion           | gcp.resourceLocations
Require encryption at rest | Policy deny       | Config Rule                   | Org policy
Deny public storage        | Deny public blob  | s3:PutBucketPublicAccessBlock | uniformBucketLevelAccess
Enforce TLS 1.2+           | MinimumTlsVersion | Config Rule                   | Org policy
Deny long-lived keys       | Deny keys > 90d   | Deny iam:CreateAccessKey      | disableServiceAccountKeyCreation
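
As one example, the AWS expression of "deny unapproved regions" is a service control policy keyed on aws:RequestedRegion. A minimal sketch follows, written in YAML for readability (AWS consumes SCPs as JSON); the approved region list is an assumption.

# Minimal SCP sketch: deny actions outside approved regions (illustrative region list)
Version: "2012-10-17"
Statement:
  - Sid: DenyUnapprovedRegions
    Effect: Deny
    Action: "*"
    Resource: "*"
    Condition:
      StringNotEquals:
        aws:RequestedRegion:
          - us-east-1
          - us-west-2
          - eu-west-1

In practice the statement also needs a NotAction carve-out for global services (IAM, Route 53, CloudFront, and similar), which is omitted here for brevity.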

Data Classification

Every resource gets a classification tag at creation. This isn't optional — untagged resources are denied by policy. The classification determines where data can live, how it's encrypted, and how long it's retained.

Level        | Residency        | Encryption           | Retention
Public       | None             | In-transit           | Per policy
Internal     | Preferred region | At-rest + in-transit | 3 years
Confidential | Country-level    | CMK + in-transit     | 7 years
Restricted   | Specific region  | HSM-backed CMK       | Per regulation

Resources tagged "Restricted" can only be created in approved regions — preventive policy, not documentation.
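
A sketch of the "untagged resources are denied" rule is shown below as an Azure Policy rule that denies creation of any resource missing a classification tag. Azure Policy definitions are JSON; YAML is used here for readability, and the tag name is an assumption.

# Illustrative Azure Policy rule: deny resources missing a data-classification tag
properties:
  displayName: require-data-classification-tag
  mode: Indexed
  policyRule:
    if:
      field: "tags['data-classification']"
      exists: false
    then:
      effect: deny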


Security Operations

All three clouds feed logs into a central SIEM for cross-cloud correlation. Without this, a compromise that spans two providers looks like two unrelated events. Automated response handles containment for known patterns — analysts focus on novel threats.

Severity | Scope                                       | Response SLA
SEV-1    | Data breach, active intrusion               | 15 min engage, 1 hr contain
SEV-2    | Policy violation with data exposure risk    | 1 hr engage, 4 hr contain
SEV-3    | Policy drift, non-critical misconfiguration | 24 hr
SEV-4    | Informational                               | Next business day
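
The shape of an automated containment playbook for a known pattern might look like the sketch below. The schema is hypothetical and only illustrates the detection-to-containment mapping; it is not tied to a specific SOAR product.

# Hypothetical containment playbook (schema is illustrative)
playbook: public-storage-exposure
trigger:
  source: cspm                    # finding forwarded from the CSPM into the SIEM
  finding: storage_bucket_public
  severity: SEV-2
actions:
  - revoke_public_access          # roll back the exposure via the IaC pipeline
  - snapshot_resource_config      # preserve evidence for the investigation
  - notify: ["workload-owner", "security-oncall"]
  - open_incident:
      sla_engage: 1h
      sla_contain: 4h
escalation:
  if_not_contained_within: 4h
  escalate_to: SEV-1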

Secrets Management

Secrets are centralized in HashiCorp Vault rather than spread across three native secret stores. The reason is rotation — Vault can issue ephemeral database credentials and short-lived CI/CD tokens that native stores can't. Every secret has a maximum lifetime.

Secret Type          | Store                        | Rotation
Application secrets  | HashiCorp Vault              | 30 days
Database credentials | Vault dynamic secrets        | Per-session (ephemeral)
Encryption keys      | Azure KV / AWS KMS / GCP KMS | Annual
Service account keys | Prohibited                   | Workload identity
CI/CD tokens         | Vault + OIDC federation      | Per-pipeline-run

Database credentials are ephemeral. CI/CD tokens live for minutes. Long-lived credentials are the single largest attack surface — eliminate them entirely.


Vending & Cost

Provisioning a new workload environment should take minutes, not weeks. A vending machine automates the full lifecycle: request, approval, provisioning, baseline policies, RBAC, budget alerts, and network peering — all via IaC pipelines. A developer submits a structured request and the platform handles everything else:

request:
  workload_name: "payment-service"
  business_unit: "fintech"
  environment: "prod"
  cloud_provider: "aws"
  landing_zone: "regulated"
  compliance_scope: ["pci-dss", "sox"]
  data_classification: "restricted"
  regions:
    primary: "us-east-1"
    dr: "us-west-2"
  budget_monthly_usd: 15000

FinOps

Cost visibility is only useful if it's actionable. Every resource is tagged with cost-center, owner, and environment — enforced by policy, not convention. Chargeback happens automatically.

Pillar       | Implementation
Visibility   | All resources tagged with cost-center, owner, environment (policy-enforced)
Allocation   | Chargeback to BU based on actual usage
Optimization | RIs, Savings Plans, CUDs reviewed quarterly — 70% coverage target
Governance   | Budget alerts at 50/75/90/100%. Auto-notify owner + manager
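
One way to encode these tagging and budget rules so the vending pipeline can enforce them is sketched below; the schema and file layout are hypothetical.

# Hypothetical FinOps guardrail definition consumed by the vending pipeline
required_tags:                       # creation is denied if any of these are missing
  - cost-center
  - owner
  - environment
budget:
  alerts_percent: [50, 75, 90, 100]
  notify: ["owner", "manager"]
chargeback:
  granularity: business_unit
  basis: actual_usage
commitment_coverage_target: 0.70     # RIs / Savings Plans / CUDs, reviewed quarterly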

Disaster Recovery

DR tiers are assigned per workload based on business impact, not technical preference. Tier 0 is reserved for systems where even minutes of downtime have material consequences.

Tier   | RPO     | RTO      | Strategy
Tier 0 | 0       | < 15 min | Active-active, multi-region
Tier 1 | < 1 hr  | < 1 hr   | Warm standby
Tier 2 | < 24 hr | < 4 hr   | Pilot light
Tier 3 | < 72 hr | < 24 hr  | Backup & restore

Cost anomaly detection runs on all three clouds. A >20% day-over-day increase triggers automatic investigation.


Implementation

This architecture is delivered in phases over 12 months. The sequence matters — identity and networking must be in place before landing zones make sense, and compliance tooling is validated against real workloads during migration.

Phase         | Duration    | Scope
Foundation    | Months 1–3  | Identity federation, networking hubs, central logging, baseline policies
Landing Zones | Months 3–6  | Vending automation, RBAC, CI/CD, first migrations
Compliance    | Months 6–9  | CSPM, GRC integration, audit preparation, regulated zones
Optimization  | Months 9–12 | FinOps maturity, commitment purchases, DR testing
Continuous    | Ongoing     | Policy iteration, new frameworks, sovereign regions, M&A

Team

Building this requires 5–7 engineers during the 12-month build phase, scaling down to 3–4 for steady-state operations. These aren't generalists — each role maps to a specific domain of the architecture.

Role                         | Count | Scope
Platform Architect           | 1     | Overall design, cross-cloud standards, stakeholder alignment
Cloud Engineers              | 2–3   | IaC modules, vending pipelines, landing zone provisioning (ideally one per primary cloud)
Identity & Security Engineer | 1     | Entra ID federation, RBAC design, PIM/JIT, Vault integration
Network Engineer             | 1     | Hub-spoke topology, colocation fabric, IP allocation, firewall rules
Compliance / GRC Lead        | 1     | Policy-as-code authoring, audit prep, framework mapping
FinOps Analyst               | 0–1   | Tagging enforcement, chargeback, commitment optimization (can be part-time or shared)

After the build phase, the Platform Architect role transitions to part-time oversight. Cloud Engineers rotate into an on-call model. The steady-state team of 3–4 handles policy updates, new landing zone requests, incident response, and FinOps reviews.

The most common mistake is understaffing identity and networking. These two domains block everything else — if they slip, the entire timeline shifts.

Tooling

Function         | Tool
IaC              | Terraform (OpenTofu)
Secrets          | HashiCorp Vault
CSPM             | Wiz / Prisma Cloud
SIEM             | Sentinel / Splunk
IdP              | Entra ID
Observability    | Datadog / Grafana Cloud
GRC              | Drata / Vanta
FinOps           | CloudHealth / Apptio
Developer Portal | Backstage

This architecture maps directly to Microsoft's Cloud Adoption Framework, AWS Control Tower, and Google Cloud Architecture Framework. The value is in the cross-cloud unification — same principles, same controls, one operating model.
