Since before the plague a number of reference architectures for Azure Landing Zones have emerged. From a Microsoft perspective it seems to have started with the North Star project, which eventually became Azure Landing Zones (Enterprise-Scale) - Reference Implementation (first commit May 2020) using ARM templates. A Terraform version (Azure landing zones Terraform module) and a Bicep version (Azure Landing Zones (ALZ) - Bicep) soon followed.

The current guidance is to use Azure Verified Modules (AVM) to deploy an Azure Landing Zones implementation.

To monitor your Azure platform, the official recommendation seems to be to deploy an additional project: Azure Monitor Baseline Alerts (AMBA).

Complexity

All the reference implementations above suffer from the authors’ incessant need to continuously add more stuff. The implementations have very large and daunting code bases, which means that they are almost impossible to get a grip on - let alone understand how to extend.

To remedy these challenges we introduce a simplified implementation which should allow platform teams to much more easily reason about and understand what they are trying to build.

The simplified version can be found at Azure Landing Zones Demo.

To compare the complexity and maintainability of the solutions mentioned, we can use cloc to get an overall idea of the number of files and lines of code in each implementation. We will only count infrastructure as code and scripts: JSON, HCL, Bicep, YAML, Bash, and PowerShell. In the command below, --force-lang maps the .bicep extension onto cloc's Standard ML counter so that Bicep files are counted:

cloc --include-lang=JSON,PowerShell,Bourne\ Shell,HCL,Standard\ ML,YAML --force-lang="Standard ML,bicep" [path]
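
| Language   | ARM     | Terraform | Bicep  | AVM     | AMBA    | Simplified |
|------------|---------|-----------|--------|---------|---------|------------|
| Bicep      | 420     |           | 11,692 | 172,920 | 103,483 | 1,150      |
| HCL        |         | 9,142     |        |         |         |            |
| JSON       | 119,990 | 41,072    | 75,722 | 593,406 | 738,156 | 1,328      |
| YAML       | 740     | 801       | 3,063  | 17,225  | 18,120  | 886        |
| PowerShell | 5,431   | 685       | 1,439  | 12,912  | 2,030   | 455        |
| Bash       |         | 406       | 13     | 331     |         |            |
| SUM        | 126,581 | 52,106    | 91,929 | 796,794 | 861,789 | 3,819      |
| Files      | 441     | 442       | 690    | 2,475   | 3,066   | 82         |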

Assuming you prefer Terraform, you need to inherit, support, understand, and reason about at least 52,106 lines of code across 442 files! Then you need to extend the code with your own requirements. This is going to be really hard, even with a reasonably sized team (4-6 people).

Worst case scenario: You have deployed the original Enterprise Scale version using the Portal Experience [read: ClickOps] and added Baseline Alerts. You now need to somehow reverse engineer your setup into Infrastructure as Code while trying to support, understand, and reason about an alert framework consisting of 861,789 lines of code across 3,066 files (the AMBA column above)! This is not hard. This is completely impossible regardless of team size.

Compare this to the simplified version with 3,819 lines of code across 82 files.

Which version would you rather start with?

What does simplified mean here?

To quote the docs:

The conceptual architecture is greatly simplified compared to the official one, as we empower DevOps teams to build and run their own thing.

We do not want to manage network from a centralized perspective. All applications will be deployed as islands with no inter-network connectivity.

We adopt a Zero Trust approach where identity and encryption trumps and often replaces Network Security.

We do not require nor encourage the use of Azure Private Link.

We allow most services to have Public Network Access: Enabled because we rely on enforcing Entra ID authentication and HTTPS/TLS 1.2+.

Online Landing Zones

These are the most important landing zones - all newer applications should be deployed here - even if data resides on-premises.

Connection to on-premises resources should be managed using zero-trust approaches with resources like:

Corp Landing Zones

Corp landing zones should exclusively be used for lift-and-shift scenarios (and avoided altogether if possible). This is reserved for applications which do not support modern authentication and rely on Kerberos (Windows Active Directory).

Azure Landing Zones Demo

Comparing policy-driven governance to verified modules

Using Azure Policy we supply a number of policies for popular resources: Web Apps, Blob Storage, Key Vault, and SQL.

Having deployed these policies we enforce the following security defaults on storage accounts:

  • HTTPS only (supportsHttpsTrafficOnly)
  • TLS 1.2 (minimumTlsVersion)
  • Disallow blob public access (allowBlobPublicAccess)
  • Disallow cross tenant replication (allowCrossTenantReplication)
  • Disallow shared key access (allowSharedKeyAccess)
  • Default to OAuth (defaultToOAuthAuthentication)
  • Enable Defender for Storage

NB: We use the modify and deployIfNotExists policy effects to ensure that issues with existing storage accounts are automatically remediated.

NBB: Security relies on zero trust principles of identity-based security (disabling keys) and encryption in transit (HTTPS/TLS 1.2).
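
To make the modify effect concrete, here is a minimal sketch of what a policy for one of the properties above (allowBlobPublicAccess) could look like in Bicep. It is not copied from the repository: the resource names and display names are hypothetical, and it assumes a subscription-scoped deployment. The role ID is the built-in Storage Account Contributor role, which the policy's managed identity needs in order to change storage accounts.

```bicep
// Sketch only: names are hypothetical and not taken from the repository.
targetScope = 'subscription'

// Built-in "Storage Account Contributor" role.
var storageAccountContributorRoleId = '17d1049b-9a84-46fb-8f53-869881c3d3ab'

resource disallowBlobPublicAccess 'Microsoft.Authorization/policyDefinitions@2021-06-01' = {
  name: 'storage-disallow-blob-public-access' // hypothetical name
  properties: {
    displayName: 'Disallow blob public access on storage accounts'
    policyType: 'Custom'
    mode: 'Indexed'
    policyRule: {
      if: {
        allOf: [
          {
            field: 'type'
            equals: 'Microsoft.Storage/storageAccounts'
          }
          {
            // Matches accounts where the property is true or not set.
            not: {
              field: 'Microsoft.Storage/storageAccounts/allowBlobPublicAccess'
              equals: 'false'
            }
          }
        ]
      }
      then: {
        effect: 'modify'
        details: {
          roleDefinitionIds: [
            '/providers/Microsoft.Authorization/roleDefinitions/${storageAccountContributorRoleId}'
          ]
          conflictEffect: 'audit'
          operations: [
            {
              // Only apply on API versions that know the property.
              condition: '[greaterOrEquals(requestContext().apiVersion, \'2019-04-01\')]'
              operation: 'addOrReplace'
              field: 'Microsoft.Storage/storageAccounts/allowBlobPublicAccess'
              value: false
            }
          ]
        }
      }
    }
  }
}

// A modify policy needs an assignment with a managed identity ...
resource assignment 'Microsoft.Authorization/policyAssignments@2022-06-01' = {
  name: 'storage-no-public-blob' // hypothetical name
  location: deployment().location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    displayName: 'Disallow blob public access on storage accounts'
    policyDefinitionId: disallowBlobPublicAccess.id
  }
}

// ... and that identity needs permission to change storage accounts.
resource remediationRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(subscription().id, assignment.name, storageAccountContributorRoleId)
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', storageAccountContributorRoleId)
    principalId: assignment.identity.principalId
    principalType: 'ServicePrincipal'
  }
}
```

With a policy like this assigned, new and updated storage accounts have the property corrected at request time, while accounts that already exist are brought into line through a remediation task.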

Having done this, a storage account can be deployed with a very simple Bicep template:

param location string = resourceGroup().location
param storageAccountName string

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = {
  name: storageAccountName
  location: location
  kind: 'StorageV2'
  sku: {
    name: 'Standard_LRS'
  }
  properties: {}
}

or using Azure CLI:

az storage account create -n storage42 -g group -l swedencentral --sku Standard_LRS

The policies ensure that the platform enforces a reasonable set of security defaults, relieving developers from the task.

Compare the 12 lines of code in Bicep above to the Azure Verified Module version which contains 3,531 lines of Bicep across 29 files (738 lines in the root file).

Yes, the official module can do more stuff (mostly YAGNI), but we must ask the question: Which implementation would you rather reason about and support going forward?

The same principles apply for web apps, key vaults, and SQL. This can be extended quite easily but we deliberately want to keep the reference implementation simple. Pull requests are welcome, though.
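
As an illustration of the same idea for another resource type, a Key Vault template could stay roughly this small. This is a sketch rather than something taken from the repository: it assumes the platform policies take care of the remaining security defaults, and the parameter names are hypothetical.

```bicep
// Sketch only: assumes platform policies enforce the security defaults.
param location string = resourceGroup().location
param keyVaultName string

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: keyVaultName
  location: location
  properties: {
    // The tenant and SKU are mandatory for this resource type.
    tenantId: subscription().tenantId
    sku: {
      family: 'A'
      name: 'standard'
    }
    // Assumption: identity-based access through Azure RBAC, in line with
    // the zero trust approach, so no access policies are defined.
    enableRbacAuthorization: true
    accessPolicies: []
  }
}
```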

What about the corporate network?

Cloud applications should never be connected to the on-premises network on the network layer. Doing so adds an unnecessary dependency and makes things less secure. Even for lift-and-shift of legacy applications, where a connection to the on-premises network seems like the only option, there are often more secure alternatives like Microsoft Entra Domain Services. If all else fails and you must connect to on-premises with IPv4, this will be equal parts expensive and complex while relying on your organisation’s existing network setup. Because of this we do not want to provide or mandate a reference architecture. This must be done bespoke every time.

Once again, we still recommend not connecting the corporate network at all and relying on Azure Relay and Azure Service Bus instead.
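
For readers unfamiliar with Azure Relay, the sketch below shows roughly what the cloud side of such a setup could look like: a Relay namespace with a single hybrid connection. It is an illustration under assumptions, not part of the reference implementation. The parameter names and the hybrid connection name are hypothetical.

```bicep
// Sketch only: names are hypothetical and not taken from the repository.
param location string = resourceGroup().location
param relayNamespaceName string

resource relayNamespace 'Microsoft.Relay/namespaces@2021-11-01' = {
  name: relayNamespaceName
  location: location
  sku: {
    name: 'Standard'
    tier: 'Standard'
  }
}

resource hybridConnection 'Microsoft.Relay/namespaces/hybridConnections@2021-11-01' = {
  parent: relayNamespace
  name: 'onprem-api' // hypothetical connection name
  properties: {
    // Senders must present a valid token before they can connect.
    requiresClientAuthorization: true
  }
}
```

The on-premises listener connects outbound to the relay and the cloud application connects to the same endpoint, so no inbound firewall ports or network-layer connection between the two environments are needed.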

Conclusion

We hope this project can serve as a reminder that often less is more and getting started should never require you to deploy almost a million lines of code you don’t understand.

Check out Azure Landing Zones Demo and let us know what you think using Issues, Stars, and Pull Requests.