What is YAML? A Complete Guide to Configuration as Code (2025)
Learn YAML from start to finish: syntax, best practices, and practical applications for Kubernetes, CI/CD pipelines, data engineering, and modern DevOps workflows.
1. What is YAML and why does it matter?
YAML Definition
YAML (YAML Ain't Markup Language, originally Yet Another Markup Language) is a human-readable data serialization format. It is widely used for configuration files, data exchange, and Infrastructure as Code.
Human Readable
Easy to read for both people and machines
Simple Syntax
Minimal syntax: block style needs no curly braces or brackets
Rich Data Types
Supports lists, dictionaries, scalars, and nested structures
Cloud Native
The standard format for Kubernetes manifests, Docker Compose files, and cloud configuration
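To make "data serialization format" concrete, the sketch below builds an ordinary Python dictionary and serializes it with the standard-library json module. YAML 1.2 was designed so that JSON documents are also valid YAML, so the printed text can be handed to a YAML 1.2 parser unchanged. The contents of config are invented for illustration.

```python
import json

# Hypothetical configuration, expressed as native Python data.
config = {
    "naam": "Jan Jansen",
    "leeftijd": 30,
    "actief": True,
    "skills": ["Python", "SQL"],
}

# JSON text is also valid YAML 1.2 (flow style), so this output is a YAML document.
document = json.dumps(config, indent=2)
print(document)

# Round trip: parsing the text gives back the same data structure.
restored = json.loads(document)
```

The same data written in YAML block style would drop the braces, brackets, and most quotes, which is where the readability advantage comes from.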
| Use Case | Why YAML? | Examples |
|---|---|---|
| Kubernetes | Standard configuration format for all resources | Deployments, Services, ConfigMaps, Secrets |
| CI/CD Pipelines | Declarative pipeline definitions | GitHub Actions, GitLab CI, Azure DevOps |
| Data Engineering | Configuration for data pipelines and workflows | dbt project config, Docker Compose files for Apache Airflow |
| Infrastructure as Code | Resource definitions and templates | Ansible Playbooks, CloudFormation |
| Application Config | Environment-specific settings | Spring Boot (application.yml) |
Why YAML is essential for modern development
Team Collaboration
- Readable for non-technical stakeholders
- Easy code reviews
- Version control friendly
DevOps Integration
- Supported by most major DevOps tools
- Compatible with GitOps workflows
- Automation ready
Safety & Validation
- Schema validation is possible
- Type checking with external tools
- Integrates with security scanning
Scalability
- Handles complex configurations
- Modularity through anchors and aliases
- Reusable configuration
2. YAML Basic Syntax
YAML Syntax Fundamentals
YAML uses indentation (spaces) to express structure. Block style needs no curly braces or brackets; these appear only in the optional inline flow style.
Basic Syntax Rules
# ========== COMMENTS ==========
# Comments start with a hash (#) and run to the end of the line
# This is a comment in YAML

# ========== KEY-VALUE PAIRS ==========
# Simple key-value pairs
naam: "Jan Jansen"
leeftijd: 30
actief: true
salaris: 55000.50

# ========== INDENTATION ==========
# Use SPACES (never tabs!) for indentation
# Convention: 2 spaces per level
persoon:
  naam: "Marie de Vries"
  adres:
    straat: "Hoofdstraat 123"
    postcode: "1234 AB"
    stad: "Amsterdam"

# ========== LISTS/ARRAYS ==========
# Lists use dash (-) syntax
boodschappen:
  - "melk"
  - "eieren"
  - "brood"
  - "kaas"
# List of objects
gebruikers:
  - naam: "user1"
    email: "user1@example.com"
  - naam: "user2"
    email: "user2@example.com"

# ========== MULTILINE STRINGS ==========
# Pipe (|) preserves newlines
beschrijving: |
  Dit is een multiline string
  Die nieuwe regels behoudt.
  Handig voor lange beschrijvingen.
# Greater-than (>) folds newlines into spaces
samenvatting: >
  Dit is een lange zin die
  wordt omgevouwen tot één
  enkele regel in output.

# ========== SPECIAL CHARACTERS ==========
# Quote values that contain special characters
speciale_string: "Waarde met: dubbele punt, en andere tekens"
boolean_string: "true"  # String, not a boolean!
getal_string: "123"     # String, not a number!

# ========== NULL VALUES ==========
lege_waarde: null  # Explicit null
ook_leeg:          # Implicit null (no value)
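The indentation rules above are easy to check mechanically. The following sketch, using only the Python standard library, reports lines whose indentation contains a tab or is not a multiple of the chosen step. The function name check_indentation and the two-space default are assumptions made for this example; it is a rough pre-check, not a YAML parser (it does not understand block scalars, for instance).

```python
def check_indentation(text: str, step: int = 2) -> list[str]:
    """Return human-readable problems found in the leading whitespace of each line."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no indentation information
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append(f"line {number}: tab in indentation")
        elif len(indent) % step != 0:
            problems.append(
                f"line {number}: indent of {len(indent)} is not a multiple of {step}"
            )
    return problems


sample = "persoon:\n  naam: Marie\n\tstad: Amsterdam\n   postcode: 1234 AB\n"
for problem in check_indentation(sample):
    print(problem)
```

A dedicated linter such as yamllint covers far more cases; the point here is only that "spaces, never tabs, consistent step" is a rule a machine can enforce.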
Complex Structures
# ========== NESTED STRUCTURES ==========
organisatie:
  naam: "DataPartner365"
  oprichtingsjaar: 2020
  medewerkers:
    - naam: "Lisa"
      rol: "Data Engineer"
      skills:
        - "Python"
        - "SQL"
        - "Apache Spark"
    - naam: "Mark"
      rol: "Data Scientist"
      skills:
        - "Machine Learning"
        - "R"
        - "Statistics"

# ========== INLINE SYNTAX (FLOW STYLE) ==========
# Use for simple structures
inline_list: ["item1", "item2", "item3"]
inline_map: {naam: "Jan", leeftijd: 30, actief: true}
# Mixed inline and block style
configuratie:
  database:
    host: "localhost"
    port: 5432
    tables: ["users", "products", "orders"]

# ========== MULTI-DOCUMENT FILES ==========
# Three dashes (---) separate documents
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: INFO
---
apiVersion: v1
kind: Secret
metadata:
  name: database-secret
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded

# ========== DIRECTIVES ==========
# Directives go before the --- that opens the document they apply to
%YAML 1.2                      # YAML version
%TAG ! tag:example.com,2025:   # Custom tag prefix
---
complex_value: !CustomType
  field1: "value1"
  field2: "value2"
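3. Data Types and Structures
YAML Data Types
YAML supports several data types that are either resolved automatically from the plain value or specified explicitly with a tag. Which plain values resolve to which type depends on the YAML version the parser implements (1.1 or 1.2).
Scalar Types (single values)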
# ========== STRINGS ==========
# Plain (unquoted) strings
naam: Jan Jansen
stad: Amsterdam
# Single-quoted strings (no escape sequences; a literal quote is written as '')
pad: 'C:\Users\Jan\Documents'
speciaal: 'Dit is een single ''quote'' binnen quotes'
# Double-quoted strings (escape sequences are interpreted)
beschrijving: "Regel 1\nRegel 2\tTab"
unicode: "Euro teken: \u20AC"

# ========== NUMBERS ==========
# Integers
aantal: 42
negatief: -100
octal: 0o755  # Octal in YAML 1.2 (493 decimal); YAML 1.1 writes 0755
hex: 0xFF     # Hex: 255 in decimal
# Floating point
prijs: 19.99
wetenschappelijk: 6.022e23  # 6.022 × 10²³
oneindig: .inf              # Infinity
not_a_number: .nan          # Not a Number

# ========== BOOLEANS ==========
actief: true
uitgeschakeld: false
# Alternative notations: booleans only for YAML 1.1 parsers.
# Under the YAML 1.2 core schema these are plain strings.
ja: yes   # YAML 1.1: true;  YAML 1.2: the string "yes"
nee: no   # YAML 1.1: false; YAML 1.2: the string "no"
aan: on   # YAML 1.1: true;  YAML 1.2: the string "on"
uit: off  # YAML 1.1: false; YAML 1.2: the string "off"

# ========== NULL ==========
leeg: null
ook_null: Null  # Capitalized form is also accepted
nog_null: NULL  # Upper-case form is also accepted
tilde: ~        # Alternate null notation

# ========== DATES AND TIMES ==========
# ISO 8601 format. Timestamps are a YAML 1.1 type; a strict YAML 1.2
# core-schema parser returns these as strings.
datum: 2025-12-20
datetime: 2025-12-20T15:30:00Z
datetime_local: 2025-12-20T15:30:00+01:00
tijd: 15:30:00  # Not a timestamp: a string in YAML 1.2, the base-60 integer 55800 in YAML 1.1

# ========== TYPE CONVERSION ==========
# Explicit type tags
string_getal: !!str 123           # Becomes "123"
getal_string: !!int "456"         # Becomes 456
float_explicit: !!float "7.89"    # Becomes 7.89
boolean_explicit: !!bool "true"   # Becomes true
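The boolean notations above are the most common source of surprises, because YAML 1.1 and the YAML 1.2 core schema disagree on them. The sketch below encodes the two word lists from the specifications as regular expressions so the difference can be tested directly. resolve_bool is a hypothetical helper written for this article; real parsers may implement only part of either list (PyYAML, for example, follows YAML 1.1 but skips the single-letter forms).

```python
import re

# Plain scalars that resolve to a boolean in each version of the specification.
YAML_1_1_BOOL = re.compile(
    r"^(y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
YAML_1_2_CORE_BOOL = re.compile(r"^(true|True|TRUE|false|False|FALSE)$")

TRUTHY = {"y", "yes", "true", "on"}


def resolve_bool(scalar: str, version: str = "1.2"):
    """Return True/False if the plain scalar is a boolean, else the string itself."""
    pattern = YAML_1_1_BOOL if version == "1.1" else YAML_1_2_CORE_BOOL
    if pattern.match(scalar):
        return scalar.lower() in TRUTHY
    return scalar


# The country code for Norway ("NO") is the classic example.
for value in ["true", "yes", "NO", "off"]:
    print(value, "->", repr(resolve_bool(value, "1.1")), "vs", repr(resolve_bool(value, "1.2")))
```

Quoting such values ("NO", "yes", "on") sidesteps the question entirely, whichever version the parser implements.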
Collection Types
# ========== SEQUENCES (LISTS/ARRAYS) ==========
# Block style sequence
programmeertalen:
  - Python
  - JavaScript
  - Java
  - Go
# Flow style sequence (inline)
databases: [PostgreSQL, MySQL, MongoDB, Redis]
# Nested sequences
matrix:
  - [1, 2, 3]
  - [4, 5, 6]
  - [7, 8, 9]

# ========== MAPPINGS (DICTIONARIES/OBJECTS) ==========
# Block style mapping
database_config:
  host: localhost
  port: 5432
  username: admin
  password: secret123
# Flow style mapping (inline)
user: {name: Jan, age: 30, active: true}
# Complex nested structure
application:
  name: MyApp
  version: "1.0.0"
  environments:
    - name: development
      database:
        host: dev-db.local
        port: 5432
      features:
        - debug_mode
        - hot_reload
    - name: production
      database:
        host: prod-db.company.com
        port: 5432
      features:
        - monitoring
        - backup

# ========== SETS ==========
# Unique values (!!set comes from the YAML 1.1 type repository; parser support varies)
unieke_items: !!set
  ? item1
  ? item2
  ? item3

# ========== OMAP (ORDERED MAP) ==========
# Preserves the order of keys
geordende_config: !!omap
  - stap1: init
  - stap2: validate
  - stap3: process
  - stap4: cleanup

# ========== PAIRS ==========
# Allows duplicate keys
duplicate_keys: !!pairs
  - taal: Nederlands
  - taal: English
  - taal: Français
4. Advanced YAML Features
Powerful YAML Features
Learn the advanced YAML features that make configuration reusable, plus the tool-specific extensions that make it dynamic.
Anchors and Aliases (Reuse)
# ========== ANCHORS (&) ==========
# Define a reusable template with an anchor
database_defaults: &database_defaults
  host: localhost
  port: 5432
  timeout: 30
  ssl: true

# ========== ALIASES (*) ==========
# Refer to an anchor with an alias
development_db:
  <<: *database_defaults
  name: dev_database
  username: dev_user
production_db:
  <<: *database_defaults
  name: prod_database
  host: prod.db.example.com
  username: prod_user
  timeout: 60  # Override default

# ========== MULTIPLE ANCHORS ==========
common_settings: &common
  environment: production
  region: eu-west-1
app_settings: &app
  version: "1.0.0"
  debug: false
# Combine multiple anchors
full_config:
  <<: [*common, *app]
  specific_setting: custom_value

# ========== NESTED ANCHORS ==========
base_service: &base_service
  image: &base_image nginx:latest
  ports:
    - &port_80 "80:80"  # Quoted so it is always read as a string
web_service:
  <<: *base_service
  environment:
    - NGINX_ENV=production
api_service:
  image: *base_image
  ports:
    - *port_80
    - "443:443"

# ========== MERGE KEY (<<) ==========
# The merge key comes from the YAML 1.1 type repository; it is not part of
# the YAML 1.2 specification, so not every parser supports it
defaults: &defaults
  adapter: postgresql
  encoding: unicode
development:
  <<: *defaults
  database: dev_db
# Nested merge example
# An alias can only name an anchor, not a path such as base.overrides,
# so the nested mapping gets its own anchor
base: &base
  name: base
  overrides: &base_overrides
    key1: value1
extended:
  <<: *base
  name: extended           # Override name
  overrides:
    <<: *base_overrides    # Merge the nested mapping
    key2: value2           # Add a new key
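The merge key is easier to reason about once its precedence rules are written out. The sketch below mimics them with plain Python dictionaries: keys written directly in the mapping win over merged ones, and when several mappings are merged, earlier ones win over later ones. merge is a hypothetical helper, not a parser API, and like the YAML merge key it is shallow (nested mappings are replaced, not combined).

```python
def merge(own_keys: dict, *merged: dict) -> dict:
    """Emulate `<<: [*a, *b]` followed by the mapping's own keys."""
    result = {}
    # Apply later sources first so that earlier sources overwrite them.
    for source in reversed(merged):
        result.update(source)
    # Keys written directly in the mapping always take precedence.
    result.update(own_keys)
    return result


database_defaults = {"host": "localhost", "port": 5432, "timeout": 30, "ssl": True}

production_db = merge(
    {"name": "prod_database", "host": "prod.db.example.com", "timeout": 60},
    database_defaults,
)
print(production_db)
```

Because the merge is shallow, a nested mapping that should be extended needs its own anchor and its own merge key.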
Custom Tags and Templates
# ========== CUSTOM TAGS ==========
%TAG ! tag:example.com,2025:
---
# Use the custom tag
my_object: !CustomType
  field1: value1
  field2: value2

# ========== MULTILINE STRINGS ==========
# Literal scalar (preserves newlines)
script: |
  #!/bin/bash
  echo "Hello World"
  date
  whoami
# Folded scalar (folds newlines into spaces)
paragraph: >
  Dit is een hele lange paragraaf die
  over meerdere regels geschreven is maar
  wordt omgevouwen tot één regel.
# Keep trailing newlines (|+)
with_trailing_newlines: |+
  Line 1
  Line 2

# Strip trailing newlines (|-)
strip_newlines: |-
  Line 1
  Line 2

# ========== COMMENTS IN MULTILINE ==========
commented_script: |
  #!/bin/bash
  # This is a comment inside the script (part of the string, not a YAML comment)
  echo "Start script"
  # Main logic
  for i in {1..10}; do
    echo "Iteration $i"
  done

# ========== TEMPLATE VARIABLES ==========
# Some tools add template syntax on top of YAML,
# for example Helm (Go templates, shown here) or Ansible (Jinja2)
config:
  app_name: "{{ .Values.app.name }}"
  replicas: "{{ .Values.replicaCount }}"
  # Outer single quotes keep the line valid YAML despite the inner double quotes
  environment: '{{ .Values.environment | default "production" }}'

# ========== CONDITIONAL YAML ==========
# Not native to YAML, but tool-specific
# For example Azure DevOps conditions
stages:
  - stage: Build
    condition: succeeded()
  - stage: Test
    condition: and(succeeded(), eq(variables['RunTests'], 'true'))

# ========== YAML DIRECTIVES ==========
# Directives precede the --- of the document they apply to. A document that is
# followed by new directives must be closed with the end marker (...)
%YAML 1.2
%TAG !e! tag:example.com,2025:
---
document1:
  key: value
...
%TAG !t! tag:test.com,2025:
---
document2:
  custom: !t!special value
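The block scalar indicators shown in the multiline examples differ only in how line breaks are treated. As a rough illustration, the following sketch applies the chomping rules (|, |-, |+) and a simplified version of folding (>) to a list of already de-indented lines. The helper names are invented for this example, and folding is simplified: it collapses runs of blank lines and ignores the special handling of more-indented lines that the specification defines.

```python
def chomp(lines: list[str], indicator: str = "") -> str:
    """Join literal block lines and apply chomping: '' clip, '-' strip, '+' keep."""
    text = "\n".join(lines) + "\n"
    if indicator == "+":
        return text                     # keep every trailing line break
    stripped = text.rstrip("\n")
    if indicator == "-":
        return stripped                 # strip all trailing line breaks
    return stripped + "\n"              # clip: exactly one trailing line break


def fold(lines: list[str]) -> str:
    """Simplified folding: single line breaks become spaces, blank lines become breaks."""
    paragraphs, current = [], []
    for line in lines:
        if line:
            current.append(line)
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n".join(paragraphs) + "\n"


block = ["Line 1", "Line 2", ""]        # content followed by one blank line
print(repr(chomp(block)))
print(repr(chomp(block, "-")))
print(repr(chomp(block, "+")))
print(repr(fold(["Dit is een lange zin die", "wordt omgevouwen tot één", "enkele regel in output."])))
```

For shell scripts embedded in pipelines the literal style is usually the safe choice, since folding would join commands onto one line.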
5. YAML in Kubernetes
Kubernetes Resource Definitions
Kubernetes uses YAML as the primary format for resource definitions. These are the essential patterns.
Basic Kubernetes Resources
# ========== POD DEFINITION ==========
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    environment: development
spec:
  containers:
    - name: nginx-container
      image: nginx:1.21
      ports:
        - containerPort: 80
      env:
        - name: ENVIRONMENT
          value: dev
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "128Mi"
          cpu: "500m"

# ========== DEPLOYMENT ==========
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21-alpine
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5

# ========== SERVICE ==========
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer

# ========== CONFIGMAP ==========
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  log_level: INFO
  database_url: postgresql://localhost:5432/mydb
  app_settings: |
    feature.enabled=true
    cache.timeout=300
    api.rate_limit=100

# ========== SECRET ==========
apiVersion: v1
kind: Secret
metadata:
  name: database-secret
type: Opaque
data:
  username: YWRtaW4=      # admin in base64
  password: c2VjcmV0MTIz  # secret123 in base64
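The values under data in a Secret are base64-encoded, which is an encoding and not encryption: anyone who can read the manifest can decode them. The short sketch below reproduces the two values from the example with Python's standard base64 module. The helper names are chosen for this example only.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Encode a string the way Kubernetes expects values under `data` in a Secret."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("admin"))          # YWRtaW4=
print(decode_secret_value("c2VjcmV0MTIz"))   # secret123
```

Kubernetes also accepts a stringData field with unencoded values, which is converted on write; either way, manifests holding real credentials do not belong in version control.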
Advanced Kubernetes Patterns
# ========== HELM TEMPLATES ==========
# Helm uses Go templates inside YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}-deployment
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "mychart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "mychart.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP

# ========== KUSTOMIZE OVERLAYS ==========
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
commonLabels:  # Recent Kustomize releases warn that this is deprecated in favor of labels
  app: myapp
  managed-by: kustomize

# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
namespace: production
replicas:
  - name: myapp-deployment
    count: 5
patchesStrategicMerge:  # Recent Kustomize releases warn that this is deprecated in favor of patches
  - increase-resources.yaml
configMapGenerator:
  - name: app-config
    behavior: merge
    literals:
      - ENVIRONMENT=production
      - LOG_LEVEL=WARN

# ========== CRDs (CUSTOM RESOURCE DEFINITIONS) ==========
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                databaseName:
                  type: string
                size:
                  type: string
                  enum: [small, medium, large]
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
      - db

# ========== MULTI-DOCUMENT KUBERNETES FILE ==========
# Multiple resources in a single YAML file
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-production
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: myapp-production
data:
  environment: production
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  namespace: myapp-production
spec:
  replicas: 3
  template:
    # ... pod template ...
6. YAML for CI/CD Pipelines
Pipeline Configuration
Modern CI/CD tools use YAML for pipeline definitions. Below are examples for popular platforms.
GitHub Actions Workflow
# .github/workflows/ci.yml
name: CI Pipeline
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
env:
  NODE_VERSION: '18'
  PYTHON_VERSION: '3.11'
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: ['16', '18', '20']
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Run linting
        run: npm run lint
  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to DockerHub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: |
            ${{ secrets.DOCKER_USERNAME }}/myapp:latest
            ${{ secrets.DOCKER_USERNAME }}/myapp:${{ github.sha }}
  deploy:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: eu-west-1
      - name: Deploy to EKS
        run: |
          aws eks update-kubeconfig --name my-cluster
          kubectl apply -f k8s/
          kubectl rollout status deployment/myapp-deployment
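A strategy.matrix is shorthand for a set of jobs, one per combination of values. The sketch below expands a matrix in the same spirit with itertools.product. The os dimension is an assumption added for illustration; the workflow above defines only node-version, which on its own yields three jobs.

```python
from itertools import product


def expand_matrix(matrix: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one dictionary per combination of matrix values."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


matrix = {
    "node-version": ["16", "18", "20"],
    "os": ["ubuntu-latest", "windows-latest"],  # hypothetical second dimension
}

jobs = expand_matrix(matrix)
print(len(jobs))
for job in jobs:
    print(job)
```

The job count is the product of the dimension sizes, which is worth keeping in mind before adding another axis to a matrix.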
Azure DevOps Pipeline
# azure-pipelines.yml
trigger:
  branches:
    include:
      - main
      - develop
  paths:
    exclude:
      - README.md
      - docs/*
variables:
  - name: buildConfiguration
    value: Release
  - name: dockerRegistry
    value: myregistry.azurecr.io
  - group: production-variables
stages:
  - stage: Build
    displayName: Build stage
    jobs:
      - job: BuildAndTest
        pool:
          vmImage: ubuntu-latest
        steps:
          - task: Docker@2
            displayName: Build Docker image
            inputs:
              containerRegistry: $(dockerRegistry)
              repository: myapp
              command: build
              Dockerfile: Dockerfile
              tags: |
                $(Build.BuildId)
                latest
          - task: DotNetCoreCLI@2
            displayName: Run unit tests
            inputs:
              command: test
              projects: '**/*Tests.csproj'
              arguments: '--configuration $(buildConfiguration)'
          - task: PublishTestResults@2
            displayName: Publish test results
            inputs:
              testResultsFormat: VSTest
              testResultsFiles: '**/*.trx'
  - stage: DeployToStaging
    displayName: Deploy to staging
    dependsOn: Build
    condition: succeeded()
    jobs:
      - job: Deploy
        pool:
          vmImage: ubuntu-latest
        steps:
          - task: KubernetesManifest@0
            displayName: Deploy to Kubernetes
            inputs:
              action: deploy
              namespace: staging
              manifests: |
                $(Build.SourcesDirectory)/manifests/deployment.yaml
                $(Build.SourcesDirectory)/manifests/service.yaml
              imagePullSecrets: registry-secret
  - stage: DeployToProduction
    displayName: Deploy to production
    dependsOn: DeployToStaging
    condition: and(succeeded(), eq(variables['DeployProd'], 'true'))
    jobs:
      - job: DeployProd
        pool:
          vmImage: ubuntu-latest
        steps:
          - task: AzureKeyVault@1
            displayName: Get production secrets
            inputs:
              azureSubscription: 'Production Subscription'
              KeyVaultName: 'prod-keyvault'
              SecretsFilter: '*'
          - task: Kubernetes@1
            displayName: Deploy to production
            inputs:
              connectionType: Azure Resource Manager
              azureSubscriptionEndpoint: 'Production Subscription'
              azureResourceGroup: 'prod-rg'
              kubernetesCluster: 'prod-aks'
              command: apply
              arguments: -f manifests/ -n production
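7. YAML in Data Engineering
Data Pipeline Configuration
Data engineering tools use YAML for project configuration, deployment, and pipeline settings. Airflow DAGs themselves are written in Python; YAML enters through deployment files such as Docker Compose, and through project files such as those of dbt.
Apache Airflow DAG Definition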
# dags/data_pipeline.py
# Note: Airflow DAGs are defined in Python, not YAML. YAML comes in through
# deployment files such as the Docker Compose file further down.
from datetime import datetime, timedelta

from airflow import DAG

# Default arguments shared by all tasks
default_args = {
    'owner': 'data_engineering',
    'depends_on_past': False,
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
    'start_date': datetime(2025, 1, 1),
}

dag = DAG(
    'data_processing_pipeline',
    default_args=default_args,
    description='ETL pipeline for customer data',
    schedule_interval='0 2 * * *',  # Daily at 02:00
    catchup=False,
    tags=['etl', 'data-warehouse', 'daily'],
)

# Airflow 2.0+ TaskFlow API
# Bound to its own name so the DAG object above is not overwritten
with DAG(
    'modern_data_pipeline',
    default_args=default_args,
    schedule_interval='@daily',
) as taskflow_dag:

    @taskflow_dag.task
    def extract():
        """Extract data from source systems."""
        return {'status': 'extracted', 'rows': 1000}

    @taskflow_dag.task
    def transform(data: dict):
        """Transform extracted data."""
        return {'status': 'transformed', 'rows': data['rows']}

    @taskflow_dag.task
    def load(data: dict):
        """Load transformed data to warehouse."""
        return {'status': 'loaded', 'rows': data['rows']}

    # Task dependencies
    extracted = extract()
    transformed = transform(extracted)
    load(transformed)
# ========== DOCKER COMPOSE FOR AIRFLOW ==========
# docker-compose.yaml
version: '3.8'
services:
  postgres:
    image: postgres:13
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
  airflow-webserver:
    image: apache/airflow:2.7.1
    environment:
      - AIRFLOW__CORE__EXECUTOR=LocalExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
      - AIRFLOW__CORE__LOAD_EXAMPLES=false
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    ports:
      - "8080:8080"
    command: webserver
    depends_on:
      - postgres
  airflow-scheduler:
    image: apache/airflow:2.7.1
    environment:
      - AIRFLOW__CORE__EXECUTOR=LocalExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    command: scheduler
    depends_on:
      - postgres
      - airflow-webserver
volumes:
  postgres-db-volume:
dbt (Data Build Tool) Project Configuration
# dbt_project.yml
name: my_dbt_project
version: '1.0.0'
config-version: 2
profile: my_dbt_project
model-paths: ["models"]
seed-paths: ["data"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"  # directory for compiled files
clean-targets:         # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"

# Models configuration
models:
  my_dbt_project:
    # Applies to all files under models/example/
    example:
      materialized: table
      schema: analytics
    # Subdirectory-specific config
    staging:
      materialized: view
      schema: staging
      tags: ["staging"]
      stg_customers:
        materialized: ephemeral
        tags: ["customers", "staging"]
    marts:
      materialized: table
      schema: marts
      tags: ["marts"]
      finance:
        materialized: incremental
        unique_key: id
        tags: ["finance", "marts"]

# Seeds configuration
seeds:
  my_dbt_project:
    schema: raw_data
    quote_columns: false
    country_codes:
      +column_types:
        country_code: varchar(2)
        country_name: varchar(100)

# Tests configuration
tests:
  my_dbt_project:
    data_quality:
      # Custom test configurations
      unique:
        severity: warn
      not_null:
        severity: error
      accepted_values:
        severity: warn

# Variables for Jinja templating
vars:
  start_date: "2025-01-01"
  currency: EUR
  timezone: Europe/Amsterdam

# Environment-specific overrides
# profiles.yml (separate file)
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: myaccount.europe-west1.gcp
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      role: transformer
      database: analytics_dev
      warehouse: transforming
      schema: dbt
      threads: 4
    prod:
      type: snowflake
      account: myaccount.europe-west1.gcp
      user: "{{ env_var('DBT_PROD_USER') }}"
      password: "{{ env_var('DBT_PROD_PASSWORD') }}"
      role: transformer
      database: analytics_prod
      warehouse: transforming
      schema: dbt
      threads: 8
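The models block in the dbt project file relies on a simple precedence rule: a setting on a more specific path overrides the value inherited from its parents. The sketch below walks a nested dictionary shaped like that block and resolves one setting for a given path. resolve_setting and the dictionary are illustrative only and do not call dbt; real dbt also accumulates some settings (tags, for example) instead of overriding them.

```python
from typing import Any, Optional

# Hypothetical excerpt of the models configuration, as plain Python data.
models_config = {
    "staging": {
        "materialized": "view",
        "schema": "staging",
        "stg_customers": {"materialized": "ephemeral"},
    },
    "marts": {
        "materialized": "table",
        "schema": "marts",
        "finance": {"materialized": "incremental", "unique_key": "id"},
    },
}


def resolve_setting(config: dict, path: list[str], key: str) -> Optional[Any]:
    """Walk down `path`, remembering the most specific value seen for `key`."""
    value = config.get(key)
    node = config
    for part in path:
        node = node.get(part)
        if not isinstance(node, dict):
            break  # the path goes deeper than the configuration does
        if key in node:
            value = node[key]
    return value


print(resolve_setting(models_config, ["staging", "stg_customers"], "materialized"))
print(resolve_setting(models_config, ["staging", "stg_customers"], "schema"))
print(resolve_setting(models_config, ["marts", "finance"], "materialized"))
```

Reading the YAML with this rule in mind explains why stg_customers ends up ephemeral while still landing in the staging schema.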
8. YAML Best Practices
Essential Guidelines
Follow these best practices for clean, maintainable, and secure YAML configuration.
Formatting
- Use 2 spaces for indentation
- Never use tabs
- Keep line length consistent (max 80-100 characters)
- Avoid trailing spaces
Structure
- Group related keys together
- Sort unrelated keys alphabetically
- Comment complex sections
- Use anchors for reuse
Security
- No secrets in plain text
- Use environment variables for sensitive data
- Validate against a schema
- Add local config files to .gitignore
Maintenance
- Keep all config files under version control
- Make changes through pull requests
- Run linting in the CI/CD pipeline
- Keep documentation up to date
Practical Tips
# TIP 1: Use anchors for common configurations
base_config: &base_config
  timeout: 30
  retries: 3
  log_level: INFO
service_a:
  <<: *base_config
  name: service-a
  port: 8080
service_b:
  <<: *base_config
  name: service-b
  port: 8081

# TIP 2: Keep YAML files focused
# ❌ BAD: Everything in one large file
# ✅ GOOD: Split by functionality
# config/
# ├── database.yml
# ├── api.yml
# ├── logging.yml
# └── monitoring.yml

# TIP 3: Use multi-document YAML for Kubernetes
# resource.yaml:
---
apiVersion: v1
kind: ConfigMap
# ... config map spec
---
apiVersion: apps/v1
kind: Deployment
# ... deployment spec
---
apiVersion: v1
kind: Service
# ... service spec

# TIP 4: Avoid complex nesting
# ❌ BAD: Nesting that is too deep
config:
  section1:
    subsection1:
      subsubsection1:
        value: moeilijk te lezen
# ✅ GOOD: Flat structure
section1_subsection1_subsubsection1_value: makkelijk te lezen

# TIP 5: Use proper string formatting
# Quote paths, URLs, and other complex strings
path: "C:\\Program Files\\MyApp"
url: "https://api.example.com/v1/resource"
# Single quotes here: \. is not a valid escape sequence inside double quotes
regex: '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'

# TIP 6: Document with comments
# Explain why, not what (the what is clear from the config)
retry_count: 3  # 3 retries because the external API is sometimes unstable
timeout: 5000   # 5-second timeout (in milliseconds) for mobile users
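One common way to follow the "no secrets in plain text" guideline is to keep placeholders in the YAML file and fill them in from environment variables at load time. The sketch below does this with string.Template from the standard library, before the text would be handed to a YAML parser. The ${DB_PASSWORD} placeholder style and the render_config helper are assumptions for this example; many tools ship their own substitution syntax.

```python
import os
from string import Template


def render_config(text: str, env: dict[str, str]) -> str:
    """Replace ${NAME} placeholders; raises KeyError if a variable is missing."""
    return Template(text).substitute(env)


config_text = """\
database:
  host: localhost
  username: admin
  password: "${DB_PASSWORD}"
"""

# In real use this would be os.environ alone; a fixed value keeps the example reproducible.
env = dict(os.environ, DB_PASSWORD="example-only")
print(render_config(config_text, env))
```

Failing loudly on a missing variable (substitute rather than safe_substitute) is a deliberate choice here: a half-rendered config is harder to debug than an immediate error.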
9. Tools and Validation
YAML Tooling Ecosystem
Essential tools for working with YAML: validation, linting, formatting, and more.
Validation Tools
- yamllint: Linting with custom rules
- yq: YAML processor (jq for YAML)
- kubeval: Kubernetes manifest validation (no longer actively maintained; its README points to kubeconform as the successor)
- kube-score: Kubernetes best practices
Editors & IDEs
- VS Code: YAML extension + schemas
- IntelliJ IDEA: Built-in YAML support
- Vim/Neovim: vim-yaml plugin
- Sublime Text: Package Control plugins
CLI Tools
- yq: Query and transform YAML
- yaml2json/json2yaml: Conversion tools
- helm: Kubernetes package manager
- kustomize: Kubernetes customization
Security Tools
- checkov: Infrastructure as Code scanning
- trivy: Vulnerability scanner
- gitleaks: Secret detection
- yaml-scan: Security linting
Practical Tool Commands
# ========== YAMLLINT ==========
# Installation
pip install yamllint

# Configuration: .yamllint.yaml
extends: default
rules:
  line-length:
    max: 100
    allow-non-breakable-words: true
    allow-non-breakable-inline-mappings: true
  trailing-spaces: enable
  document-start: disable  # Not needed for Kubernetes
  truthy:
    allowed-values: ['true', 'false']

# Lint a file or a directory
yamllint config.yaml
yamllint -c .yamllint.yaml k8s/

# ========== YQ ==========
# Installation
brew install yq  # macOS
snap install yq  # Linux

# Basic queries
yq eval '.metadata.name' deployment.yaml
yq eval '.spec.replicas = 5' -i deployment.yaml  # In-place edit
yq eval-all 'select(.kind == "Deployment")' *.yaml

# Complex queries
yq eval '.spec.template.spec.containers[].env[] | select(.name == "DB_HOST")' deploy.yaml
yq eval '.data | keys' configmap.yaml

# ========== KUBEVAL ==========
# Validation of Kubernetes manifests
kubeval deployment.yaml
kubeval --strict --schema-location https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/ manifests/

# ========== VS CODE SETTINGS ==========
# settings.json
{
  "yaml.schemas": {
    "kubernetes": "*.yaml",
    "https://json.schemastore.org/github-workflow.json": ".github/workflows/*",
    "https://json.schemastore.org/docker-compose.json": "docker-compose*.yaml"
  },
  "yaml.customTags": [
    "!Ref sequence",
    "!GetAtt sequence",
    "!Join sequence"
  ],
  "yaml.format.enable": true,
  "yaml.format.singleQuote": true,
  "yaml.format.bracketSpacing": true
}

# ========== PRE-COMMIT HOOKS ==========
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.32.0
    hooks:
      - id: yamllint
        args: [--strict, --config-file, .yamllint.yaml]
  - repo: https://github.com/digitalpulp/pre-commit-yaml
    rev: v1.0.0
    hooks:
      - id: yamlfmt
  - repo: https://github.com/instrumenta/kubeval
    rev: v0.16.1
    hooks:
      - id: kubeval
        files: ^k8s/.*\.yaml$
10. YAML vs Alternatives (JSON, TOML, HCL)
Configuration Format Comparison
When should you use YAML, JSON, TOML, or HCL? Each format has its strengths; XML is included in the table for reference.
| Format | Strengths | Weaknesses | Best For |
|---|---|---|---|
| YAML | Human readable, comments, anchors, complex types | Indentation errors, complex parsing, security concerns | Kubernetes, CI/CD, config files, complex structures |
| JSON | Universal support, simple parsing, typed | No comments, verbose, hard to read/write | APIs, data exchange, simple config |
| TOML | Simple, explicit, good for flat configs | Limited nesting, less tooling | App config (Rust/Cargo), simple key-value |
| HCL | Expressive, variables, functions, Terraform | Terraform-specific, learning curve | Infrastructure as Code (Terraform) |
| XML | Strict schema validation, namespaces | Verbose, complex, outdated | Legacy systems, SOAP APIs |
Comparative Examples
# ========== THE SAME CONFIG IN DIFFERENT FORMATS ==========
# YAML
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret123  # TODO: Move to env var

# JSON
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "credentials": {
      "username": "admin",
      "password": "secret123"
    }
  }
}

# TOML
[database]
host = "localhost"
port = 5432

[database.credentials]
username = "admin"
password = "secret123"

# HCL (Terraform)
database {
  host = "localhost"
  port = 5432
  credentials {
    username = "admin"
    password = var.database_password  # Variable reference
  }
}

# ========== WHEN TO USE WHICH FORMAT? ==========
# Use YAML when:
# • Human readability matters
# • You need comments
# • You need complex nesting
# • You work with Kubernetes or CI/CD
# Use JSON when:
# • Communication is machine-to-machine
# • Configurations are simple
# • You are in a JavaScript/web context
# Use TOML when:
# • The configuration structure is flat
# • You work on Rust/Cargo projects
# • You prefer simplicity over features
# Use HCL when:
# • You do Infrastructure as Code (Terraform)
# • You need variables and functions
# • You have complex infrastructure config
11. Practical Examples
Real-World YAML Examples
Complete YAML configurations for common use cases.
Complete Docker Compose Setup
# docker-compose.yml for a full-stack application
version: '3.8'
services:
  # Frontend
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
      args:
        NODE_ENV: development
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/app
      - /app/node_modules
    environment:
      - REACT_APP_API_URL=http://api:8000
      - REACT_APP_ENV=development
    depends_on:
      - api
    networks:
      - app-network
  # Backend API
  api:
    build:
      context: ./backend
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./backend:/app
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/mydb
      - REDIS_URL=redis://redis:6379
      - DEBUG=true
    depends_on:
      - db
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - app-network
  # Database
  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - app-network
  # Redis
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - app-network
  # Monitoring (Prometheus + Grafana)
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - app-network
  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - "3001:3000"
    depends_on:
      - prometheus
    networks:
      - app-network
volumes:
  postgres_data:
  redis_data:
  prometheus_data:
  grafana_data:
networks:
  app-network:
    driver: bridge
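The depends_on entries in the Compose file form a dependency graph that determines startup order. The sketch below copies those dependencies into a dictionary and orders them with graphlib.TopologicalSorter from the standard library (Python 3.9+). It models ordering only: in its short list form, depends_on waits for a container to start, not for its healthcheck to pass.

```python
from graphlib import TopologicalSorter

# service -> services it depends on, copied from the depends_on entries
depends_on = {
    "frontend": {"api"},
    "api": {"db", "redis"},
    "db": set(),
    "redis": set(),
    "prometheus": set(),
    "grafana": {"prometheus"},
}

# static_order() yields each service after all of its dependencies.
startup_order = list(TopologicalSorter(depends_on).static_order())
print(startup_order)
```

Several orders are valid for this graph (db and redis are interchangeable, for instance), so the example prints one of them without promising a specific sequence.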
Complete GitHub Actions Workflow
# .github/workflows/full-pipeline.yml
name: Full CI/CD Pipeline
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '0 2 * * *'  # Daily at 02:00 (UTC)
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  quality-checks:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements-dev.txt
      - name: Lint code
        run: |
          flake8 .
          black --check .
          isort --check-only .
      - name: Security scan
        run: |
          bandit -r .
          safety check
      - name: YAML validation
        run: |
          yamllint .
          kubeval k8s/ --strict
  test:
    needs: quality-checks
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.9', '3.10', '3.11']
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install -r requirements-test.txt
      - name: Run tests
        run: |
          pytest --cov=./ --cov-report=xml
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          fail_ci_if_error: true
  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Log in to Container Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=,suffix=-{{date 'YYYYMMDD'}}-{{sha}}
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
  deploy-staging:
    needs: build-and-push
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Deploy to staging
        run: |
          echo "Deploying to staging..."
          kubectl config use-context staging
          kubectl apply -f k8s/staging/
          kubectl rollout status deployment/app-deployment -n staging
  deploy-production:
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Manual approval
        uses: trstringer/manual-approval@v1
        with:
          secret: ${{ github.token }}
          approvers: team-leads
          minimum-approvals: 1
      - name: Deploy to production
        run: |
          echo "Deploying to production..."
          kubectl config use-context production
          kubectl apply -f k8s/production/
          kubectl rollout status deployment/app-deployment -n production
Conclusion
YAML is a powerful and essential format for modern software development, DevOps, and data engineering. Its readability, flexibility, and broad tooling support make it the default choice for configuration as code in cloud-native ecosystems.
Key Takeaways:
- YAML's strength lies in human readability and support for complex data structures
- Use anchors and aliases for reusable configuration
- Use tools such as yamllint and yq to safeguard quality
- Avoid secrets in YAML files; use environment variables or secret managers
- Choose the right format for the job: YAML for configuration, JSON for APIs, HCL for infrastructure
Our Advice:
Start with YAML if you work with Kubernetes, CI/CD pipelines, or data engineering tools, or if you need complex configurations. The learning curve is gentle and the benefits for team collaboration and maintainability are significant.
Keep learning and practicing: the real power of YAML shows in advanced features such as anchors, multi-document files, and the templating that tools add on top. Start with simple configurations and build up complexity gradually.