DataPartner365 · Step-by-step guide · 2026

Setting up IaC for Databricks
in an Azure Tenant

A complete step-by-step guide to setting up Infrastructure as Code (Terraform) for a Databricks workspace in Azure, from Service Principal to Unity Catalog and CI/CD pipeline.

1 — Prerequisites & tooling

What you need before you start
| Tool                     | Version | Required? | Purpose                           |
|--------------------------|---------|-----------|-----------------------------------|
| Terraform                | ≥ 1.6   | Required  | IaC engine                        |
| Azure CLI                | ≥ 2.55  | Required  | Azure authentication & management |
| Git                      | ≥ 2.40  | Required  | Version control                   |
| VS Code + Terraform ext. | Latest  | Optional  | IDE with syntax highlighting      |
| Databricks CLI           | ≥ 0.200 | Optional  | Manual checks and debugging       |
ℹ️ You need an Azure subscription with at least Contributor rights at the resource group level, and the Application Administrator role in Entra ID to create a Service Principal.
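To confirm your account actually has these rights before you start, you can list your current role assignments. A quick check, assuming you are already logged in with az login:

```shell
# List the roles assigned to the signed-in user, with their scope
az role assignment list \
  --assignee "$(az ad signed-in-user show --query id -o tsv)" \
  --all \
  --query "[].{role:roleDefinitionName, scope:scope}" \
  -o table
```

Look for a Contributor entry scoped to your subscription or resource group.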

Verify your installation

bash
# Check versions
terraform --version     # → Terraform v1.6+
az --version            # → azure-cli 2.55+
git --version           # → git version 2.40+

# Log in to Azure
az login
az account show         # verify the correct subscription
az account set --subscription "<your-subscription-id>"
2 — Preparing the Azure environment

Resource Group, Service Principal, Key Vault and Terraform state storage

2.1 — Create the Resource Group

bash — azure cli
RG_NAME="rg-databricks-prod"
LOCATION="westeurope"

az group create \
  --name $RG_NAME \
  --location $LOCATION \
  --tags Environment=Production Project=DataPlatform ManagedBy=Terraform

2.2 — Service Principal for Terraform

bash — azure cli
SP_NAME="sp-terraform-databricks"
SUBSCRIPTION_ID=$(az account show --query id -o tsv)

# Create with Contributor rights on the resource group
# (note: the deprecated --sdk-auth flag is omitted; it also renames the output keys)
az ad sp create-for-rbac \
  --name $SP_NAME \
  --role Contributor \
  --scopes /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RG_NAME

# Save the output; you will need these values:
# appId                         → ARM_CLIENT_ID
# password                      → ARM_CLIENT_SECRET
# tenant                        → ARM_TENANT_ID
# $SUBSCRIPTION_ID (set above)  → ARM_SUBSCRIPTION_ID
⚠️ Save the password immediately; it is shown only once. Store it in Azure Key Vault or GitHub Secrets, never in code.

2.3 — Key Vault for secrets

bash
az keyvault create \
  --name "kv-databricks-prod" \
  --resource-group $RG_NAME \
  --location $LOCATION \
  --enable-rbac-authorization true

# Store the SP secret in Key Vault
az keyvault secret set \
  --vault-name "kv-databricks-prod" \
  --name "terraform-sp-secret" \
  --value "<sp-password>"

2.4 — Remote state storage (Terraform backend)

bash
SA_NAME="stterraformstate001"   # globally unique name

az storage account create \
  --name $SA_NAME \
  --resource-group $RG_NAME \
  --location $LOCATION \
  --sku Standard_LRS \
  --min-tls-version TLS1_2

az storage container create \
  --name "tfstate" \
  --account-name $SA_NAME
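Optionally, you can enable blob versioning on the state storage account, so an accidentally corrupted or mangled state file can be rolled back to an earlier version. A sketch using the same variables as above:

```shell
# Enable blob versioning on the state storage account (optional safety net)
az storage account blob-service-properties update \
  --account-name $SA_NAME \
  --resource-group $RG_NAME \
  --enable-versioning true
```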
3 — Setting up the Terraform project

Folder structure, providers and backend configuration

Recommended folder structure

file structure
databricks-iac/
├── main.tf               # Main resources
├── variables.tf          # Input variables
├── outputs.tf            # Output values
├── providers.tf          # Provider configuration
├── backend.tf            # Remote state
├── terraform.tfvars      # Values (do not commit!)
├── .gitignore
└── modules/
    ├── workspace/        # Databricks workspace module
    ├── unity-catalog/    # Unity Catalog module
    └── clusters/         # Cluster policies module

providers.tf

terraform hcl
terraform {
  required_version = ">= 1.6"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.90"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "~> 1.38"
    }
  }
}

provider "azurerm" {
  features {}
  # Authentication via env vars:
  # ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, ARM_SUBSCRIPTION_ID
}

provider "databricks" {
  host                        = azurerm_databricks_workspace.main.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.main.id
}

backend.tf

terraform hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-databricks-prod"
    storage_account_name = "stterraformstate001"
    container_name       = "tfstate"
    key                  = "databricks.terraform.tfstate"
  }
}

.gitignore

.gitignore
.terraform/
*.tfstate
*.tfstate.backup
# terraform.tfvars contains secrets; never commit it
*.tfvars
# Note: do commit .terraform.lock.hcl, so provider versions stay pinned for the whole team
Use a terraform.tfvars.example as a template (without real values) that you do commit, so colleagues know which variables are required.
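Such a template could look like this; the values mirror the defaults used later in this guide, so adjust them to your own environment:

```hcl
# terraform.tfvars.example: commit this file, then copy it to
# terraform.tfvars and fill in your real values
location       = "westeurope"
resource_group = "rg-databricks-prod"
environment    = "prod"
workspace_name = "dbw-dataplatform-prod"
sku            = "premium"
```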
4 — Deploying the Databricks workspace

VNet injection, private endpoints and workspace creation

variables.tf

terraform hcl
variable "location" {
  default = "westeurope"
}

variable "resource_group" {
  default = "rg-databricks-prod"
}

variable "environment" {
  default = "prod"
}

variable "workspace_name" {
  default = "dbw-dataplatform-prod"
}

# Premium is required for Unity Catalog
variable "sku" {
  default = "premium"
}

main.tf — Workspace resource

terraform hcl
resource "azurerm_databricks_workspace" "main" {
  name                = var.workspace_name
  resource_group_name = var.resource_group
  location            = var.location
  sku                 = var.sku # "premium" for Unity Catalog

  # Managed resource group for Databricks-managed resources
  managed_resource_group_name = "rg-databricks-managed-prod"

  tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
    Project     = "DataPlatform"
  }
}

output "workspace_url" {
  value = azurerm_databricks_workspace.main.workspace_url
}

output "workspace_id" {
  value = azurerm_databricks_workspace.main.id
}

Deploy

bash
# Export the SP credentials as env vars
export ARM_CLIENT_ID="<appId>"
export ARM_CLIENT_SECRET="<password>"
export ARM_TENANT_ID="<tenant>"
export ARM_SUBSCRIPTION_ID="<subscriptionId>"

terraform init    # download providers + initialize the backend
terraform plan    # review what will be created
terraform apply   # execute (confirm with 'yes')
5 — Configuring Unity Catalog

Metastore, catalogs, schemas and permissions via Terraform
ℹ️ Unity Catalog requires a Premium SKU workspace. The metastore is created at the account level (one per region). If you already have a metastore, skip step 5.1.
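If you are unsure whether a metastore already exists for your region, the Databricks CLI can list them. This assumes CLI ≥ 0.200, a configured authentication profile, and sufficient (account admin) permissions:

```shell
# List existing Unity Catalog metastores
databricks metastores list
```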

5.1 — Storage for Unity Catalog (ADLS Gen2)

terraform hcl
resource "azurerm_storage_account" "unity" {
  name                     = "stunity001"
  resource_group_name      = var.resource_group
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  is_hns_enabled           = true # required for ADLS Gen2
  min_tls_version          = "TLS1_2"
}

resource "azurerm_storage_container" "unity" {
  name                  = "unity-catalog"
  storage_account_name  = azurerm_storage_account.unity.name
  container_access_type = "private"
}

resource "databricks_metastore" "main" {
  name          = "metastore-prod"
  storage_root  = "abfss://unity-catalog@${azurerm_storage_account.unity.name}.dfs.core.windows.net/"
  region        = var.location
  force_destroy = false
}

resource "databricks_metastore_assignment" "main" {
  metastore_id = databricks_metastore.main.id
  workspace_id = azurerm_databricks_workspace.main.workspace_id
}

5.2 — Create catalogs and schemas

terraform hcl
resource "databricks_catalog" "bronze" {
  name       = "bronze"
  comment    = "Raw incoming data, unmodified"
  depends_on = [databricks_metastore_assignment.main]
}

resource "databricks_catalog" "silver" {
  name    = "silver"
  comment = "Cleaned and validated data"
}

resource "databricks_catalog" "gold" {
  name    = "gold"
  comment = "Business-ready data for reporting"
}

# Schemas per catalog
resource "databricks_schema" "bronze_raw" {
  catalog_name = databricks_catalog.bronze.name
  name         = "raw"
}

resource "databricks_schema" "silver_clean" {
  catalog_name = databricks_catalog.silver.name
  name         = "clean"
}

resource "databricks_schema" "gold_reporting" {
  catalog_name = databricks_catalog.gold.name
  name         = "reporting"
}
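The checklist at the end of this guide also asks for permissions to be set; with the Databricks Terraform provider these can be managed through the databricks_grants resource. A minimal sketch, assuming a data_analysts group already exists in your account:

```hcl
# Grant read access on the gold catalog to a (hypothetical) analysts group
resource "databricks_grants" "gold_read" {
  catalog = databricks_catalog.gold.name

  grant {
    principal  = "data_analysts" # assumed group name; replace with your own
    privileges = ["USE_CATALOG", "USE_SCHEMA", "SELECT"]
  }
}
```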
6 — Clusters & policies via Terraform

Cluster policies, instance pools and service principals

6.1 — Cluster policy for jobs

terraform hcl
resource "databricks_cluster_policy" "jobs" {
  name = "Jobs Cluster Policy"

  definition = jsonencode({
    "spark_version" : {
      "type" : "allowlist",
      "values" : ["14.3.x-scala2.12", "15.4.x-scala2.12"],
      "defaultValue" : "15.4.x-scala2.12"
    },
    "node_type_id" : {
      "type" : "allowlist",
      "values" : ["Standard_DS3_v2", "Standard_DS4_v2"]
    },
    "autotermination_minutes" : {
      "type" : "fixed",
      "value" : 30,
      "hidden" : true
    },
    "data_security_mode" : {
      "type" : "fixed",
      "value" : "SINGLE_USER"
    }
  })
}
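The section title also mentions instance pools, which reduce cluster start-up time by keeping pre-provisioned VMs ready. A minimal sketch; the pool name and capacity figures are assumptions to tune for your workload:

```hcl
resource "databricks_instance_pool" "jobs" {
  instance_pool_name                    = "pool-jobs"
  node_type_id                          = "Standard_DS3_v2"
  min_idle_instances                    = 0
  max_capacity                          = 10
  idle_instance_autotermination_minutes = 15
}
```

Job clusters can then reference the pool via its id, so they start from warm instances instead of provisioning fresh VMs.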

6.2 — Service Principal for pipelines

terraform hcl
resource "databricks_service_principal" "pipeline_sp" {
  application_id = "<aad-app-id>"
  display_name   = "sp-databricks-pipelines"
  active         = true
}

resource "databricks_group_member" "pipeline_sp_admin" {
  group_id  = databricks_group.data_engineers.id
  member_id = databricks_service_principal.pipeline_sp.id
}
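The group membership above references databricks_group.data_engineers, which must be defined somewhere in your configuration. A minimal definition could look like this (the display name is an assumption):

```hcl
resource "databricks_group" "data_engineers" {
  display_name = "data-engineers"
}
```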
7 — CI/CD pipeline with GitHub Actions

Plan automatically on PRs, apply on merge to main

Setting up GitHub Secrets

Add these secrets under GitHub → Settings → Secrets and variables → Actions:

| Secret name         | Value                      |
|---------------------|----------------------------|
| ARM_CLIENT_ID       | Service Principal appId    |
| ARM_CLIENT_SECRET   | Service Principal password |
| ARM_TENANT_ID       | Azure tenant ID            |
| ARM_SUBSCRIPTION_ID | Azure subscription ID      |

.github/workflows/terraform.yml

yaml — github actions
name: Terraform Databricks IaC

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
  ARM_CLIENT_SECRET: ${{ secrets.ARM_CLIENT_SECRET }}
  ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
  ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.6

      - name: Terraform Init
        run: terraform init

      - name: Terraform Format Check
        run: terraform fmt -check

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        run: terraform plan -no-color
        if: github.event_name == 'pull_request'

      - name: Terraform Apply
        run: terraform apply -auto-approve
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
Add branch protection rules on main so that the Terraform Plan check must pass before a merge. This prevents a faulty configuration from going live.
8 — Full checklist

Print it out and tick items off as you implement
Azure Environment
  • Resource Group created
  • Service Principal created with Contributor rights
  • SP credentials stored in Key Vault
  • Storage Account for Terraform state created
  • ADLS Gen2 storage created for Unity Catalog
Terraform Project
  • providers.tf configured (azurerm + databricks provider)
  • backend.tf configured with remote state
  • .gitignore set up (no .tfvars or state files in Git)
  • terraform init completed successfully
  • terraform plan shows the expected resources
Databricks Workspace
  • Workspace created with Premium SKU
  • terraform apply completed successfully
  • Workspace reachable via its URL
Unity Catalog
  • Metastore created (or existing one reused)
  • Metastore assigned to the workspace
  • Bronze / Silver / Gold catalogs created
  • Schemas created per catalog
  • Permissions set correctly
Clusters & Security
  • Cluster policies created for jobs and interactive use
  • Service Principal for pipelines configured
  • Autotermination set (max 30 min)
CI/CD Pipeline
  • GitHub Secrets set (ARM_* variables)
  • GitHub Actions workflow created
  • Branch protection set on main
  • Test PR created; plan runs successfully
  • Merged to main; apply runs successfully