DataPartner365

Your partner for data-driven growth and insights

What is YAML? Complete Guide to Configuration as Code 2025

Updated: December 20, 2025
Reading time: 18 minutes
YAML, Kubernetes, DevOps, CI/CD, Docker, Configuration, Infrastructure as Code

Learn YAML from start to finish: syntax, best practices, and practical applications for Kubernetes, CI/CD pipelines, data engineering, and modern DevOps workflows.


1. What is YAML and why does it matter?

YAML Definition

YAML (YAML Ain't Markup Language, originally Yet Another Markup Language) is a human-readable data serialization format designed to be easy for people to write and for programs to parse. It is widely used for configuration files, data exchange, and Infrastructure as Code.

Human Readable

Easy to read for both people and machines

Simple Syntax

Minimal syntax: block style relies on indentation, so curly braces are optional

Rich Data Types

Supports lists, dictionaries, scalars, and complex structures

Cloud Native

The standard for Kubernetes, Docker Compose, and cloud configuration

Use Case | Why YAML? | Examples
Kubernetes | Standard configuration format for all resources | Deployments, Services, ConfigMaps, Secrets
CI/CD Pipelines | Declarative pipeline definitions | GitHub Actions, GitLab CI, Azure DevOps
Data Engineering | Configuration for data pipelines and workflows | Apache Airflow deployments, dbt project config
Infrastructure as Code | Resource definitions and templates | Ansible Playbooks, CloudFormation
Application Config | Environment-specific settings | Spring Boot, Django settings

Why YAML is essential for modern development

Team Collaboration

  • Readable for non-technical stakeholders
  • Straightforward code reviews
  • Version-control friendly

DevOps Integration

  • Standard in most major DevOps tools
  • Compatible with GitOps workflows
  • Automation ready

Safety & Validation

  • Schema validation is possible
  • Type checking with external tools
  • Integrates with security scanning

Scalability

  • Handles complex configurations
  • Modularity through anchors and aliases
  • Reusable configuration

2. YAML Basic Syntax

YAML Syntax Fundamentals

YAML uses indentation (spaces) for structure. Block style needs no curly braces or brackets; they appear only in the optional inline (flow) style.

Basic Syntax Rules

# ========== COMMENTS ==========
# Comments start with a hash (#) and run to the end of the line
# This is a comment in YAML

# ========== KEY-VALUE PAIRS ==========
# Simple key-value pairs
naam: "Jan Jansen"
leeftijd: 30
actief: true
salaris: 55000.50

# ========== INDENTATION ==========
# Use SPACES (never tabs!) for indentation
# Convention: 2 spaces per level
persoon:
  naam: "Marie de Vries"
  adres:
    straat: "Hoofdstraat 123"
    postcode: "1234 AB"
    stad: "Amsterdam"

# ========== LISTS/ARRAYS ==========
# Lists use dash (-) syntax
boodschappen:
  - "melk"
  - "eieren"
  - "brood"
  - "kaas"

# List of objects
gebruikers:
  - naam: "user1"
    email: "user1@example.com"
  - naam: "user2"
    email: "user2@example.com"

# ========== MULTILINE STRINGS ==========
# Pipe (|) preserves newlines
beschrijving: |
  This is a multiline string
  that keeps its line breaks.
  Handy for long descriptions.

# Greater-than (>) folds newlines into spaces
samenvatting: >
  This is a long sentence that
  is folded into one single
  line in the output.

# ========== SPECIAL CHARACTERS ==========
# Quote values that contain special characters
speciale_string: "Value with: a colon, and other characters"
boolean_string: "true"  # String, not a boolean!
getal_string: "123"     # String, not a number!

# ========== NULL VALUES ==========
lege_waarde: null  # Explicit null
ook_leeg:        # Implicit null (no value)
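
A parser turns a listing like the one above into ordinary mappings, lists, and scalars. The sketch below writes out by hand the Python structure a parser would be expected to return for a few of those keys (the dictionary is an illustration, not real parser output) and serializes it as JSON to show that both formats share one data model.

```python
import json

# Hand-written equivalent of part of the YAML listing
# (an illustration, not the output of a real parser).
document = {
    "naam": "Jan Jansen",
    "leeftijd": 30,            # plain scalar 30 resolves to an integer
    "actief": True,            # plain scalar true resolves to a boolean
    "boolean_string": "true",  # quoted, so it stays a string
    "persoon": {
        "naam": "Marie de Vries",
        "adres": {"straat": "Hoofdstraat 123", "stad": "Amsterdam"},
    },
    "boodschappen": ["melk", "eieren", "brood", "kaas"],
    "ook_leeg": None,          # a key without a value is null
}

# The same data as JSON; a round trip gives back an equal structure.
as_json = json.dumps(document, indent=2, ensure_ascii=False)
round_trip = json.loads(as_json)
print(as_json)
```

Indentation in YAML plays the role that braces and brackets play in the JSON output.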

Complex Structures

# ========== NESTED STRUCTURES ==========
organisatie:
  naam: "DataPartner365"
  oprichtingsjaar: 2020
  medewerkers:
    - naam: "Lisa"
      rol: "Data Engineer"
      skills:
        - "Python"
        - "SQL"
        - "Apache Spark"
    - naam: "Mark"
      rol: "Data Scientist"
      skills:
        - "Machine Learning"
        - "R"
        - "Statistics"

# ========== INLINE SYNTAX (FLOW STYLE) ==========
# Use for simple structures
inline_list: ["item1", "item2", "item3"]
inline_map: {naam: "Jan", leeftijd: 30, actief: true}

# Mixed inline and block style
configuratie:
  database:
    host: "localhost"
    port: 5432
    tables: ["users", "products", "orders"]

# ========== MULTI-DOCUMENT FILES ==========
# Three dashes (---) separate documents
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: INFO

---
apiVersion: v1
kind: Secret
metadata:
  name: database-secret
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded

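# ========== DIRECTIVES ==========
# Directives belong at the very start of a document. Inside a stream,
# close the previous document with "..." before the next directive.
...
%YAML 1.2  # YAML version
%TAG ! tag:example.com,2025:  # Custom tag prefix
---
complex_value: !CustomType
  field1: "value1"
  field2: "value2"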

3. Data Types and Structures

YAML Data Types

YAML supports several data types that are either recognized automatically or specified explicitly with tags.

Scalar Types (Single values)

# ========== STRINGS ==========
# Plain (unquoted) strings
naam: Jan Jansen
stad: Amsterdam

# Single-quoted strings (taken literally; '' is the only escape)
pad: 'C:\Users\Jan\Documents'
speciaal: 'This is a single ''quote'' inside quotes'

# Double-quoted strings (escape sequences)
beschrijving: "Line 1\nLine 2\tTab"
unicode: "Euro sign: \u20AC"

# ========== NUMBERS ==========
# Integers
aantal: 42
negatief: -100
octal: 0o755      # Octal: 493 in decimal (YAML 1.2 notation; YAML 1.1 parsers expect 0755)
hex: 0xFF        # Hex: 255 in decimal

# Floating point
prijs: 19.99
wetenschappelijk: 6.022e23  # 6.022 × 10²³
oneindig: .inf    # Infinity
not_a_number: .nan  # Not a Number

# ========== BOOLEANS ==========
actief: true
uitgeschakeld: false

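# Alternative boolean notations (YAML 1.1 only)
# YAML 1.1 parsers such as PyYAML read these as booleans;
# under the YAML 1.2 core schema they are plain strings.
ja: yes      # true in YAML 1.1, the string "yes" in YAML 1.2
nee: no       # false in YAML 1.1, the string "no" in YAML 1.2
aan: on       # true in YAML 1.1, the string "on" in YAML 1.2
uit: off      # false in YAML 1.1, the string "off" in YAML 1.2

# ========== NULL ==========
leeg: null
ook_null: Null    # Also accepted (only null, Null, and NULL)
nog_null: NULL   # Also accepted
tilde: ~        # Alternate null notation

# ========== DATES AND TIMES ==========
# ISO 8601 format. Timestamps are a YAML 1.1 type; a YAML 1.2
# core-schema parser returns these values as strings.
datum: 2025-12-20
datetime: 2025-12-20T15:30:00Z
datetime_local: 2025-12-20T15:30:00+01:00
tijd: "15:30:00"  # Quoted: YAML 1.1 parsers read 15:30:00 as the base-60 integer 55800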

# ========== TYPE CONVERSION ==========
# Explicit type tags
string_getal: !!str 123        # Becomes "123"
getal_string: !!int "456"      # Becomes 456
float_explicit: !!float "7.89" # Becomes 7.89
boolean_explicit: !!bool "true" # Becomes true
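
How an unquoted value is typed depends on the schema the parser applies. The sketch below is a hand-rolled approximation of the YAML 1.2 core schema resolution rules, written with regular expressions from the standard library. `resolve_plain_scalar` is a hypothetical helper, not the API of any YAML library, and it deliberately ignores the YAML 1.1 extras (yes/no booleans, timestamps, base-60 numbers).

```python
import math
import re

# Patterns modelled on the YAML 1.2 core schema (approximation for illustration).
NULL = re.compile(r"null|Null|NULL|~|")
BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
INT_DEC = re.compile(r"[-+]?[0-9]+")
INT_OCT = re.compile(r"0o[0-7]+")
INT_HEX = re.compile(r"0x[0-9a-fA-F]+")
FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")
INF = re.compile(r"[-+]?\.(inf|Inf|INF)")
NAN = re.compile(r"\.(nan|NaN|NAN)")


def resolve_plain_scalar(text):
    """Return the Python value an unquoted scalar would resolve to."""
    if NULL.fullmatch(text):
        return None
    if BOOL.fullmatch(text):
        return text.lower() == "true"
    if INT_DEC.fullmatch(text):
        return int(text)
    if INT_OCT.fullmatch(text):
        return int(text[2:], 8)
    if INT_HEX.fullmatch(text):
        return int(text[2:], 16)
    if FLOAT.fullmatch(text):
        return float(text)
    if INF.fullmatch(text):
        return -math.inf if text.startswith("-") else math.inf
    if NAN.fullmatch(text):
        return math.nan
    return text  # everything else stays a string


for sample in ["42", "0o755", "0xFF", "19.99", "true", "yes", "~", "2025-12-20"]:
    print(f"{sample!r:>14} -> {resolve_plain_scalar(sample)!r}")
```

Under these rules `yes` and `2025-12-20` come back as strings, which is why quoting or an explicit tag is the safer choice when a file may be read by both YAML 1.1 and YAML 1.2 parsers.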

Collection Types

# ========== SEQUENCES (LISTS/ARRAYS) ==========
# Block style sequence
programmeertalen:
  - Python
  - JavaScript
  - Java
  - Go

# Flow style sequence (inline)
databases: [PostgreSQL, MySQL, MongoDB, Redis]

# Nested sequences
matrix:
  - [1, 2, 3]
  - [4, 5, 6]
  - [7, 8, 9]

# ========== MAPPINGS (DICTIONARIES/OBJECTS) ==========
# Block style mapping
database_config:
  host: localhost
  port: 5432
  username: admin
  password: secret123

# Flow style mapping (inline)
user: {name: Jan, age: 30, active: true}

# Complex nested structure
application:
  name: MyApp
  version: "1.0.0"
  environments:
    - name: development
      database:
        host: dev-db.local
        port: 5432
      features:
        - debug_mode
        - hot_reload
    - name: production
      database:
        host: prod-db.company.com
        port: 5432
      features:
        - monitoring
        - backup

# ========== SETS ==========
# Unique values (a YAML 1.1 collection type; parser support varies)
unieke_items: !!set
  ? item1
  ? item2
  ? item3

# ========== OMAP (ORDERED MAP) ==========
# Preserves the order of keys
geordende_config: !!omap
  - stap1: init
  - stap2: validate
  - stap3: process
  - stap4: cleanup

# ========== PAIRS ==========
# Keeps duplicate keys
duplicate_keys: !!pairs
  - taal: Nederlands
  - taal: English
  - taal: Français

4. Advanced YAML Features

Powerful YAML Features

Advanced YAML features make configuration reusable and, combined with tooling, dynamic.

Anchors and Aliases (Reuse)

# ========== ANCHORS (&) ==========
# Define a reusable template with an anchor
database_defaults: &database_defaults
  host: localhost
  port: 5432
  timeout: 30
  ssl: true

# ========== ALIASES (*) ==========
# Refer to the anchor with an alias
development_db:
  <<: *database_defaults
  name: dev_database
  username: dev_user

production_db:
  <<: *database_defaults
  name: prod_database
  host: prod.db.example.com
  username: prod_user
  timeout: 60  # Override default

# ========== MULTIPLE ANCHORS ==========
common_settings: &common
  environment: production
  region: eu-west-1

app_settings: &app
  version: "1.0.0"
  debug: false

# Combine multiple anchors
full_config:
  <<: [*common, *app]
  specific_setting: custom_value

# ========== NESTED ANCHORS ==========
base_service: &base_service
  image: &base_image nginx:latest
  ports:
    - &port_80 80:80

web_service:
  <<: *base_service
  environment:
    - NGINX_ENV=production

api_service:
  image: *base_image
  ports:
    - *port_80
    - 443:443

# ========== MERGE KEY (<<) ==========
# The merge key comes from the YAML 1.1 type repository and is not part of
# the YAML 1.2 spec; whether it works depends on the parser
defaults: &defaults
  adapter: postgresql
  encoding: unicode

development:
  <<: *defaults
  database: dev_db

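# Complex merge example
# An alias always names an anchor; it cannot address a nested path such as
# *base.overrides, so the nested mapping gets its own anchor.
base: &base
  name: base
  overrides: &base_overrides
    key1: value1

extended:
  <<: *base
  name: extended  # Override name
  overrides:
    <<: *base_overrides  # Merge nested
    key2: value2         # Add new key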
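
The merge rules are easiest to see outside YAML. The sketch below emulates the `<<` key with plain dictionaries; `apply_merge_key` is a hypothetical helper and the data is copied by hand from the listing. It assumes the usual merge-key behavior: keys written in the mapping itself win, and in a list of merged mappings earlier entries win over later ones.

```python
def apply_merge_key(own_keys, *merged_mappings):
    """Emulate `<<`: merged mappings only fill in keys that are not set yet."""
    result = dict(own_keys)              # explicit keys always win
    for mapping in merged_mappings:      # earlier entries beat later ones
        for key, value in mapping.items():
            result.setdefault(key, value)
    return result


database_defaults = {"host": "localhost", "port": 5432, "timeout": 30, "ssl": True}

production_db = apply_merge_key(
    {"name": "prod_database", "host": "prod.db.example.com",
     "username": "prod_user", "timeout": 60},
    database_defaults,
)
print(production_db)

# The merge is shallow: a nested mapping is replaced as a whole, not combined.
base = {"name": "base", "overrides": {"key1": "value1"}}
extended = apply_merge_key({"name": "extended", "overrides": {"key2": "value2"}}, base)
print(extended["overrides"])
```

Because the merge is shallow, a nested mapping needs its own anchor and its own `<<` line if its keys should be combined rather than replaced.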

Custom Tags and Templates

# ========== CUSTOM TAGS ==========
%TAG ! tag:example.com,2025:
---
# Use the custom tag
my_object: !CustomType
  field1: value1
  field2: value2

# ========== MULTILINE STRINGS ==========
# Literal scalar (keeps newlines)
script: |
  #!/bin/bash
  echo "Hello World"
  date
  whoami

# Folded scalar (folds newlines)
paragraph: >
  This is a very long paragraph that is
  written across several lines but
  is folded into a single line.

# Keep trailing newlines (|+)
with_trailing_newlines: |+
  Line 1
  Line 2

# Strip trailing newlines (|-)
strip_newlines: |-
  Line 1
  Line 2

# ========== COMMENTS IN MULTILINE ==========
commented_script: |
  #!/bin/bash
  # This is a comment in the script (part of the string, not a YAML comment)
  echo "Start script"
  
  # Main logic
  for i in {1..10}; do
    echo "Iteration $i"
  done

# ========== TEMPLATE VARIABLES ==========
# Some tools layer a template syntax on top of YAML,
# for example Jinja2 in Ansible or Go templates in Helm (shown here).
# Helm renders the template first and parses the result as YAML.
config:
  app_name: "{{ .Values.app.name }}"
  replicas: "{{ .Values.replicaCount }}"
  environment: {{ .Values.environment | default "production" | quote }}

# ========== CONDITIONAL YAML ==========
# Not native to YAML, but tool-specific
# For example, Azure DevOps conditions
stages:
- stage: Build
  condition: succeeded()
  
- stage: Test
  condition: and(succeeded(), eq(variables['RunTests'], 'true'))

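# ========== YAML DIRECTIVES ==========
# Directives come before the "---" of the document they apply to;
# "..." ends the previous document so that new directives may follow.
%YAML 1.2
%TAG !e! tag:example.com,2025:
---
document1:
  key: value
...
%TAG !t! tag:test.com,2025:
---
document2:
  custom: !t!special value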
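
The block scalar variants in this listing (`|`, `>`, `|+`, `|-`) differ only in how line breaks are treated. The function below is a simplified model of those rules, not a parser: `block_scalar` is a hypothetical helper that takes the already de-indented content lines, and it assumes there are no more-indented lines and that blank lines occur only at the end.

```python
def block_scalar(lines, style="|", chomp=""):
    """Approximate the string value of a YAML block scalar.

    style: "|" keeps line breaks, ">" folds them into spaces.
    chomp: ""  (clip) keeps one final newline,
           "-" (strip) removes all trailing newlines,
           "+" (keep) preserves trailing blank lines as newlines.
    """
    body = list(lines)
    trailing_blank = 0
    while body and body[-1] == "":
        body.pop()
        trailing_blank += 1

    joiner = "\n" if style == "|" else " "
    text = joiner.join(body)

    if chomp == "-":
        return text
    if chomp == "+":
        return text + "\n" * (trailing_blank + 1)
    return text + "\n"


script = ["#!/bin/bash", 'echo "Hello World"', "date", "whoami"]
print(repr(block_scalar(script, style="|")))
print(repr(block_scalar(["Line 1", "Line 2"], style="|", chomp="-")))
print(repr(block_scalar(["Line 1", "Line 2", ""], style="|", chomp="+")))
print(repr(block_scalar(["folded", "into one", "line"], style=">")))
```

The strip variant is the one to reach for when a trailing newline would break the consumer, for example a token or a file path read from the value.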


5. YAML in Kubernetes

Kubernetes Resource Definitions

Kubernetes uses YAML as its primary format for resource definitions. These are the essential patterns.

Basic Kubernetes Resources

# ========== POD DEFINITION ==========
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    environment: development
spec:
  containers:
  - name: nginx-container
    image: nginx:1.21
    ports:
    - containerPort: 80
    env:
    - name: ENVIRONMENT
      value: dev
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

---
# ========== DEPLOYMENT ==========
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21-alpine
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5

---
# ========== SERVICE ==========
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

---
# ========== CONFIGMAP ==========
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  log_level: INFO
  database_url: postgresql://localhost:5432/mydb
  app_settings: |
    feature.enabled=true
    cache.timeout=300
    api.rate_limit=100

---
# ========== SECRET ==========
apiVersion: v1
kind: Secret
metadata:
  name: database-secret
type: Opaque
data:
  username: YWRtaW4=  # admin in base64
  password: c2VjcmV0MTIz  # secret123 in base64
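
The values under `data` in a Secret are base64-encoded, which is an encoding and not encryption. A minimal sketch with Python's standard library shows how the two values in the listing are produced and reversed; the helper names are hypothetical.

```python
import base64


def encode_secret_value(plain_text: str) -> str:
    """Encode a string the way Secret `data` values are stored."""
    return base64.b64encode(plain_text.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; anyone who can read the manifest can do this."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("admin"))         # YWRtaW4=
print(decode_secret_value("c2VjcmV0MTIz"))  # secret123
```

Since decoding takes one call, a Secret manifest with real credentials should not be committed to version control as-is.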

Advanced Kubernetes Patterns

# ========== HELM TEMPLATES ==========
# Helm uses Go templates inside YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}-deployment
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "mychart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "mychart.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP

# ========== KUSTOMIZE OVERLAYS ==========
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
commonLabels:
  app: myapp
  managed-by: kustomize

# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
namespace: production
replicas:
  - name: myapp-deployment
    count: 5
patches:  # replaces patchesStrategicMerge, which is deprecated since Kustomize v5
  - path: increase-resources.yaml
configMapGenerator:
  - name: app-config
    behavior: merge
    literals:
      - ENVIRONMENT=production
      - LOG_LEVEL=WARN

# ========== CRDs (CUSTOM RESOURCE DEFINITIONS) ==========
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                databaseName:
                  type: string
                size:
                  type: string
                  enum: [small, medium, large]
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
    - db

# ========== MULTI-DOCUMENT KUBERNETES FILE ==========
# Multiple resources in one YAML file
---
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-production

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: myapp-production
data:
  environment: production

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  namespace: myapp-production
spec:
  replicas: 3
  template:
    # ... pod template ...
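
The CRD in this listing restricts `spec.size` to three values and types `databaseName` as a string. The API server enforces that schema when a `Database` object is created. The sketch below only mimics the same check by hand to make the rule concrete; `validate_database_spec` and `ALLOWED_SIZES` are hypothetical names, not part of any Kubernetes client library.

```python
ALLOWED_SIZES = {"small", "medium", "large"}  # enum from the CRD schema


def validate_database_spec(spec):
    """Return a list of problems; an empty list means the spec passes."""
    if not isinstance(spec, dict):
        return ["spec must be an object"]

    errors = []
    name = spec.get("databaseName")
    if name is not None and not isinstance(name, str):
        errors.append("spec.databaseName must be a string")

    size = spec.get("size")
    if size is not None and size not in ALLOWED_SIZES:
        errors.append(f"spec.size must be one of {sorted(ALLOWED_SIZES)}")
    return errors


print(validate_database_spec({"databaseName": "orders", "size": "medium"}))
print(validate_database_spec({"databaseName": "orders", "size": "huge"}))
```

Both fields are treated as optional here because the schema in the listing declares no `required` list.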

6. YAML for CI/CD Pipelines

Pipeline Configuration

Modern CI/CD tools use YAML for pipeline definitions. Below are examples for popular platforms.

GitHub Actions Workflow

# .github/workflows/ci.yml
name: CI Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

env:
  NODE_VERSION: '18'
  PYTHON_VERSION: '3.11'

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: ['16', '18', '20']
    steps:
    - uses: actions/checkout@v3
    
    - name: Setup Node.js
      uses: actions/setup-node@v3
      with:
        node-version: ${{ matrix.node-version }}
        cache: 'npm'
    
    - name: Install dependencies
      run: npm ci
    
    - name: Run tests
      run: npm test
    
    - name: Run linting
      run: npm run lint

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v2
    
    - name: Login to DockerHub
      uses: docker/login-action@v2
      with:
        username: ${{ secrets.DOCKER_USERNAME }}
        password: ${{ secrets.DOCKER_PASSWORD }}
    
    - name: Build and push
      uses: docker/build-push-action@v4
      with:
        context: .
        push: true
        tags: |
          ${{ secrets.DOCKER_USERNAME }}/myapp:latest
          ${{ secrets.DOCKER_USERNAME }}/myapp:${{ github.sha }}

  deploy:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
    - uses: actions/checkout@v3
    
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v1
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: eu-west-1
    
    - name: Deploy to EKS
      run: |
        aws eks update-kubeconfig --name my-cluster
        kubectl apply -f k8s/
        kubectl rollout status deployment/myapp-deployment
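
The `strategy.matrix` block in the `test` job makes the runner start one job per combination of matrix values. The expansion is a plain Cartesian product, sketched here in Python. `expand_matrix` is a hypothetical helper and the `os` axis in the second call is an assumption added for illustration; the workflow itself has only `node-version`.

```python
from itertools import product


def expand_matrix(matrix):
    """Return one dict per job: every combination of the matrix axes."""
    axes = list(matrix)
    return [dict(zip(axes, combo)) for combo in product(*(matrix[a] for a in axes))]


# The matrix from the workflow: three jobs.
jobs = expand_matrix({"node-version": ["16", "18", "20"]})
print(len(jobs), jobs)

# With a second, hypothetical axis the job count multiplies.
jobs_two_axes = expand_matrix({
    "node-version": ["16", "18", "20"],
    "os": ["ubuntu-latest", "windows-latest"],
})
print(len(jobs_two_axes))
```

Every added axis multiplies the number of jobs, so a matrix is worth keeping small on metered runners.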

Azure DevOps Pipeline

# azure-pipelines.yml
trigger:
  branches:
    include:
    - main
    - develop
  paths:
    exclude:
    - README.md
    - docs/*

variables:
  - name: buildConfiguration
    value: Release
  - name: dockerRegistry
    value: myregistry.azurecr.io
  - group: production-variables

stages:
- stage: Build
  displayName: Build stage
  jobs:
  - job: BuildAndTest
    pool:
      vmImage: ubuntu-latest
    steps:
    - task: Docker@2
      displayName: Build Docker image
      inputs:
        containerRegistry: $(dockerRegistry)
        repository: myapp
        command: build
        Dockerfile: Dockerfile
        tags: |
          $(Build.BuildId)
          latest
    
    - task: DotNetCoreCLI@2
      displayName: Run unit tests
      inputs:
        command: test
        projects: '**/*Tests.csproj'
        arguments: '--configuration $(buildConfiguration)'
    
    - task: PublishTestResults@2
      displayName: Publish test results
      inputs:
        testResultsFormat: VSTest
        testResultsFiles: '**/*.trx'

- stage: DeployToStaging
  displayName: Deploy to staging
  dependsOn: Build
  condition: succeeded()
  jobs:
  - job: Deploy
    pool:
      vmImage: ubuntu-latest
    steps:
    - task: KubernetesManifest@0
      displayName: Deploy to Kubernetes
      inputs:
        action: deploy
        namespace: staging
        manifests: |
          $(Build.SourcesDirectory)/manifests/deployment.yaml
          $(Build.SourcesDirectory)/manifests/service.yaml
        imagePullSecrets: registry-secret

- stage: DeployToProduction
  displayName: Deploy to production
  dependsOn: DeployToStaging
  condition: and(succeeded(), eq(variables['DeployProd'], 'true'))
  jobs:
  - job: DeployProd
    pool:
      vmImage: ubuntu-latest
    steps:
    - task: AzureKeyVault@1
      displayName: Get production secrets
      inputs:
        azureSubscription: 'Production Subscription'
        KeyVaultName: 'prod-keyvault'
        SecretsFilter: '*'
    
    - task: Kubernetes@1
      displayName: Deploy to production
      inputs:
        connectionType: Azure Resource Manager
        azureSubscriptionEndpoint: 'Production Subscription'
        azureResourceGroup: 'prod-rg'
        kubernetesCluster: 'prod-aks'
        command: apply
        arguments: -f manifests/ -n production

7. YAML in Data Engineering

Data Pipeline Configuration

Data engineering tools use YAML for project configuration, deployment, and data transformations. Apache Airflow DAGs themselves are written in Python; YAML comes in when Airflow is deployed (for example with Docker Compose) and in tools such as dbt.

Apache Airflow DAG Definition

# dags/data_pipeline.py (Python, shown for comparison with declarative YAML)
from datetime import datetime, timedelta
from airflow import DAG

# Default arguments shared by every task in the DAGs below
default_args = {
    'owner': 'data_engineering',
    'depends_on_past': False,
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
    'start_date': datetime(2025, 1, 1),
}

dag = DAG(
    'data_processing_pipeline',
    default_args=default_args,
    description='ETL pipeline for customer data',
    schedule_interval='0 2 * * *',  # Daily at 02:00
    catchup=False,
    tags=['etl', 'data-warehouse', 'daily'],
)

# Airflow 2.0+ TaskFlow API
with DAG(
    'modern_data_pipeline',
    default_args=default_args,
    schedule_interval='@daily',
) as dag:
    
    @dag.task
    def extract():
        """Extract data from source systems."""
        return {'status': 'extracted', 'rows': 1000}
    
    @dag.task
    def transform(data: dict):
        """Transform extracted data."""
        return {'status': 'transformed', 'rows': data['rows']}
    
    @dag.task
    def load(data: dict):
        """Load transformed data to warehouse."""
        return {'status': 'loaded', 'rows': data['rows']}
    
    # Task dependencies
    extracted = extract()
    transformed = transform(extracted)
    load(transformed)

# ========== DOCKER COMPOSE FOR AIRFLOW ==========
# docker-compose.yaml
version: '3.8'
services:
  postgres:
    image: postgres:13
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data

  airflow-webserver:
    image: apache/airflow:2.7.1
    environment:
      - AIRFLOW__CORE__EXECUTOR=LocalExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
      - AIRFLOW__CORE__LOAD_EXAMPLES=false
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    ports:
      - "8080:8080"
    command: webserver
    depends_on:
      - postgres

  airflow-scheduler:
    image: apache/airflow:2.7.1
    environment:
      - AIRFLOW__CORE__EXECUTOR=LocalExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    command: scheduler
    depends_on:
      - postgres
      - airflow-webserver

volumes:
  postgres-db-volume:
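
The `environment` entries in the Compose file follow Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention for overriding settings, and the connection value is an ordinary URL. A small sketch using only the Python standard library; `airflow_env_var` is a hypothetical helper, not an Airflow function.

```python
from urllib.parse import urlsplit


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name for an Airflow config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


print(airflow_env_var("core", "executor"))
print(airflow_env_var("database", "sql_alchemy_conn"))

# Take the connection string from the Compose file apart.
conn = urlsplit("postgresql+psycopg2://airflow:airflow@postgres/airflow")
print(conn.scheme)            # dialect and driver
print(conn.username)          # database user
print(conn.hostname)          # Compose service name of the database
print(conn.path.lstrip("/"))  # database name
```

The host part is `postgres` because services in one Compose project reach each other by service name.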

dbt (Data Build Tool) Project Configuration

# dbt_project.yml
name: my_dbt_project
version: '1.0.0'
config-version: 2

profile: my_dbt_project

model-paths: ["models"]
seed-paths: ["data"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"  # directory for compiled files
clean-targets:         # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"

# Models configuration
models:
  my_dbt_project:
    # Applies to all files under models/example/
    example:
      materialized: table
      schema: analytics
      
      # Subdirectory-specific config
      staging:
        materialized: view
        schema: staging
        tags: ["staging"]
        
        stg_customers:
          materialized: ephemeral
          tags: ["customers", "staging"]
          
      marts:
        materialized: table
        schema: marts
        tags: ["marts"]
        
        finance:
          materialized: incremental
          unique_key: id
          tags: ["finance", "marts"]

# Seeds configuration
seeds:
  my_dbt_project:
    schema: raw_data
    quote_columns: false
    
    country_codes:
      +column_types:
        country_code: varchar(2)
        country_name: varchar(100)

# Tests configuration
tests:
  my_dbt_project:
    data_quality:
      # Custom test configurations
      unique:
        severity: warn
      not_null:
        severity: error
      accepted_values:
        severity: warn

# Variables for Jinja templating
vars:
  start_date: "2025-01-01"
  currency: EUR
  timezone: Europe/Amsterdam

# Environment-specific overrides
# profiles.yml (separate file)
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: myaccount.europe-west1.gcp
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      role: transformer
      database: analytics_dev
      warehouse: transforming
      schema: dbt
      threads: 4
    
    prod:
      type: snowflake
      account: myaccount.europe-west1.gcp
      user: "{{ env_var('DBT_PROD_USER') }}"
      password: "{{ env_var('DBT_PROD_PASSWORD') }}"
      role: transformer
      database: analytics_prod
      warehouse: transforming
      schema: dbt
      threads: 8
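
In the `models:` block of `dbt_project.yml`, a setting on a more specific path overrides the same setting higher up. The sketch below walks a hand-copied subset of that block to show the effect for `materialized` and `schema`. `resolve_model_config` is a hypothetical helper, and real dbt handles more cases (some settings, such as tags, accumulate instead of overriding).

```python
# Hand-copied subset of the models block from dbt_project.yml.
MODELS = {
    "example": {
        "materialized": "table",
        "schema": "analytics",
        "staging": {
            "materialized": "view",
            "schema": "staging",
            "stg_customers": {"materialized": "ephemeral"},
        },
        "marts": {
            "materialized": "table",
            "schema": "marts",
            "finance": {"materialized": "incremental"},
        },
    }
}

CONFIG_KEYS = ("materialized", "schema")


def resolve_model_config(tree, path):
    """Walk from general to specific; deeper settings override earlier ones."""
    resolved = {}
    node = tree
    for part in path:
        node = node.get(part)
        if not isinstance(node, dict):
            break  # no more specific configuration for this path
        for key in CONFIG_KEYS:
            if key in node:
                resolved[key] = node[key]
    return resolved


print(resolve_model_config(MODELS, ["example", "staging", "stg_customers"]))
print(resolve_model_config(MODELS, ["example", "marts", "finance"]))
print(resolve_model_config(MODELS, ["example", "some_other_model"]))
```

A model without its own entry simply inherits whatever the nearest parent directory defines.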

8. YAML Best Practices

Essential Guidelines

Follow these best practices for clean, maintainable, and secure YAML configuration.

Formatting

  • Use 2 spaces for indentation
  • Never use tabs
  • Keep line length consistent (80-100 characters at most)
  • Avoid trailing spaces

Structure

  • Group related keys together
  • Alphabetical order for unrelated keys
  • Comments for complex sections
  • Anchors for reuse

Security

  • No secrets in plain text
  • Environment variables for sensitive data
  • Validate against a schema
  • Git-ignore local configuration

Maintenance

  • Version control for all config files
  • Changes through pull requests
  • Linting in the CI/CD pipeline
  • Keep documentation up to date
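
Several of the formatting rules above can be checked mechanically. A real project would use a linter such as yamllint for this; the function below is only a rough stand-in to show what such a check does. `check_formatting` is a hypothetical name and the default limit of 100 characters is taken from the guideline above.

```python
def check_formatting(text, max_length=100):
    """Return (line_number, problem) tuples for basic formatting issues."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if len(line) > max_length:
            problems.append((number, "line too long"))
    return problems


sample = "service:\n\tname: api\nport: 8080  \n"
for number, problem in check_formatting(sample):
    print(f"line {number}: {problem}")
```

Run as a pre-commit or CI step, a check like this catches tab and whitespace mistakes before a parser ever sees them.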

Practical Tips

# TIP 1: Use anchors for common configurations
base_config: &base_config
  timeout: 30
  retries: 3
  log_level: INFO

service_a:
  <<: *base_config
  name: service-a
  port: 8080

service_b:
  <<: *base_config
  name: service-b
  port: 8081

# TIP 2: Keep YAML files focused
# ❌ BAD: Everything in one big file
# ✅ GOOD: Split by functionality
# config/
# ├── database.yml
# ├── api.yml
# ├── logging.yml
# └── monitoring.yml

# TIP 3: Use multi-document YAML for Kubernetes
# resource.yaml:
---
apiVersion: v1
kind: ConfigMap
# ... config map spec

---
apiVersion: apps/v1
kind: Deployment
# ... deployment spec

---
apiVersion: v1
kind: Service
# ... service spec

# TIP 4: Avoid complex nesting
# ❌ BAD: Nesting that is too deep
config:
  section1:
    subsection1:
      subsubsection1:
        value: hard to read

# ✅ GOOD: Flat structure
section1_subsection1_subsubsection1_value: easy to read

# TIP 5: Use proper string formatting
# For paths, URLs, and complex strings: use quotes
path: "C:\\Program Files\\MyApp"
url: "https://api.example.com/v1/resource"
# The regex is single-quoted: inside double quotes \. is an invalid escape sequence
regex: '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'

# TIP 6: Document with comments
# Explain why, not what (the what is clear from the code)
retry_count: 3  # 3 retries because the external API is sometimes unstable
timeout: 5000  # 5-second timeout (in milliseconds) for mobile users
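
The Security guidelines earlier in this section recommend environment variables over plain-text secrets. One way to apply that is to keep placeholders in the YAML file and fill them in at load time. The sketch below uses `string.Template` from the standard library; the `${DB_USER}` and `${DB_PASSWORD}` placeholders, the template text, and `render_config` are assumptions for illustration, and many tools ship their own substitution mechanism.

```python
import os
from string import Template

# Hypothetical config template; the secrets live in the environment, not the file.
CONFIG_TEMPLATE = """\
database:
  host: localhost
  username: ${DB_USER}
  password: ${DB_PASSWORD}
"""


def render_config(template_text, env=None):
    """Fill ${NAME} placeholders; a missing variable raises KeyError."""
    values = os.environ if env is None else env
    return Template(template_text).substitute(values)


rendered = render_config(
    CONFIG_TEMPLATE,
    env={"DB_USER": "admin", "DB_PASSWORD": "example-only"},
)
print(rendered)
```

Failing on a missing variable is deliberate: a config that silently renders an empty password is harder to debug than one that refuses to load. Values that contain YAML special characters still need quoting in the template.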

9. Tools and Validation

YAML Tooling Ecosystem

Essential tools for working with YAML: validation, linting, formatting, and more.

Validation Tools

  • yamllint: Linting with custom rules
  • yq: YAML processor (jq for YAML)
  • kubeval: Kubernetes manifest validation (no longer maintained; kubeconform is the suggested successor)
  • kube-score: Kubernetes best practices

Editors & IDEs

  • VS Code: YAML extension + schemas
  • IntelliJ IDEA: Built-in YAML support
  • Vim/Neovim: vim-yaml plugin
  • Sublime Text: Package Control plugins

CLI Tools

  • yq: Query and transform YAML
  • yaml2json/json2yaml: Conversion tools
  • helm: Kubernetes package manager
  • kustomize: Kubernetes customization

Security Tools

  • checkov: Infrastructure as Code scanning
  • trivy: Vulnerability scanner
  • gitleaks: Secret detection
  • yaml-scan: Security linting

Practical Tool Commands

# ========== YAMLLINT ==========
# Installation
pip install yamllint

# Configuration: .yamllint.yaml
extends: default
rules:
  line-length:
    max: 100
    allow-non-breakable-words: true
    allow-non-breakable-inline-mappings: true
  trailing-spaces: enable
  document-start: disable  # Not needed for Kubernetes
  truthy:
    allowed-values: ['true', 'false']

# Lint a file
yamllint config.yaml
yamllint -c .yamllint.yaml k8s/

# ========== YQ ==========
# Installation
brew install yq  # macOS
snap install yq  # Linux

# Basic queries
yq eval '.metadata.name' deployment.yaml
yq eval '.spec.replicas = 5' -i deployment.yaml  # In-place edit
yq eval-all 'select(.kind == "Deployment")' *.yaml

# Complex queries
yq eval '.spec.template.spec.containers[].env[] | select(.name == "DB_HOST")' deploy.yaml
yq eval '.data | keys' configmap.yaml

# ========== KUBEVAL ==========
# Validate Kubernetes manifests
kubeval deployment.yaml
kubeval --strict --schema-location https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/ manifests/

# ========== VS CODE SETTINGS ==========
# settings.json
{
  "yaml.schemas": {
    "kubernetes": "*.yaml",
    "https://json.schemastore.org/github-workflow.json": ".github/workflows/*",
    "https://json.schemastore.org/docker-compose.json": "docker-compose*.yaml"
  },
  "yaml.customTags": [
    "!Ref sequence",
    "!GetAtt sequence",
    "!Join sequence"
  ],
  "yaml.format.enable": true,
  "yaml.format.singleQuote": true,
  "yaml.format.bracketSpacing": true
}

# ========== PRECOMMIT HOOKS ==========
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.32.0
    hooks:
      - id: yamllint
        args: [--strict, --config-file, .yamllint.yaml]
  
  - repo: https://github.com/digitalpulp/pre-commit-yaml
    rev: v1.0.0
    hooks:
      - id: yamlfmt
  
  - repo: https://github.com/instrumenta/kubeval
    rev: v0.16.1
    hooks:
      - id: kubeval
        files: ^k8s/.*\.yaml$

10. YAML vs Alternatives (JSON, TOML, HCL)

Configuration Format Comparison

When should you use YAML, JSON, TOML, or HCL? Each format has its strengths.

Format | Strengths | Weaknesses | Best For
YAML | Human readable, comments, anchors, complex types | Indentation errors, complex parsing, security concerns | Kubernetes, CI/CD, config files, complex structures
JSON | Universal support, simple parsing, typed | No comments, verbose, hard to read/write | APIs, data exchange, simple config
TOML | Simple, explicit, good for flat configs | Limited nesting, less tooling | App config (Rust/Cargo), simple key-value
HCL | Expressive, variables, functions, Terraform | Terraform-specific, learning curve | Infrastructure as Code (Terraform)
XML | Strict schema validation, namespaces | Verbose, complex, outdated | Legacy systems, SOAP APIs

Comparative Examples

# ========== THE SAME CONFIG IN DIFFERENT FORMATS ==========

# YAML
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret123  # TODO: Move to env var

# JSON
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "credentials": {
      "username": "admin",
      "password": "secret123"
    }
  }
}

# TOML
[database]
host = "localhost"
port = 5432

[database.credentials]
username = "admin"
password = "secret123"

# HCL (Terraform)
database {
  host = "localhost"
  port = 5432
  
  credentials {
    username = "admin"
    password = var.database_password  # Variable reference
  }
}

# ========== WHICH FORMAT WHEN? ==========

# Use YAML when:
# • Human readability matters
# • You need comments
# • You need deep nesting
# • You work with Kubernetes or CI/CD

# Use JSON when:
# • Systems talk to each other (machine-to-machine)
# • The configuration is simple
# • You are in a JavaScript/web context

# Use TOML when:
# • The configuration structure is flat
# • You work on Rust/Cargo projects
# • You prefer simplicity over features

# Use HCL when:
# • You write Infrastructure as Code (Terraform)
# • You need variables and functions
# • The infrastructure config is complex
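
One reason "complex parsing" appears as a YAML weakness in the comparison is implicit typing: unquoted scalars can be resolved to booleans or numbers depending on the parser. The fragment below is a minimal sketch with hypothetical key names. The results in the comments assume a YAML 1.1 parser such as PyYAML's default loader; YAML 1.2 parsers treat several of these values differently.

```yaml
# Hypothetical keys, illustrating implicit typing under YAML 1.1 rules.
# Results can differ per parser, so quote values that must stay strings.

country_unquoted: NO        # YAML 1.1: boolean false, not the string "NO"
country_quoted: "NO"        # always the string "NO"

version_unquoted: 1.10      # float 1.1, the trailing zero is lost
version_quoted: "1.10"      # string "1.10"

feature_flag: yes           # YAML 1.1: boolean true
feature_label: "yes"        # string "yes"

zip_unquoted: 01234         # read as a number, not a string (octal in YAML 1.1)
zip_quoted: "01234"         # string "01234"
```

JSON and TOML avoid this class of surprise because strings are always quoted, which is part of the trade-off when choosing between the formats.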

11. Practical Examples

Real-World YAML Examples

Complete, working YAML configurations for common use cases.

Complete Docker Compose Setup

# docker-compose.yml for a full-stack application
version: '3.8'

services:
  # Frontend
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
      args:
        NODE_ENV: development
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/app
      - /app/node_modules
    environment:
      - REACT_APP_API_URL=http://api:8000
      - REACT_APP_ENV=development
    depends_on:
      - api
    networks:
      - app-network

  # Backend API
  api:
    build:
      context: ./backend
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./backend:/app
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/mydb
      - REDIS_URL=redis://redis:6379
      - DEBUG=true
    depends_on:
      - db
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - app-network

  # Database
  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - app-network

  # Redis
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - app-network

  # Monitoring (Prometheus + Grafana)
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - app-network

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - "3001:3000"
    depends_on:
      - prometheus
    networks:
      - app-network

volumes:
  postgres_data:
  redis_data:
  prometheus_data:
  grafana_data:

networks:
  app-network:
    driver: bridge
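
The Compose file above hard-codes credentials and repeats the same healthcheck timing in several services. The fragment below is a sketch of how anchors and environment variables could reduce both, assuming a Compose version that supports `x-` extension fields, YAML merge keys, and `${VAR}` interpolation. The names `x-healthcheck-defaults`, `DB_USER`, and `DB_PASSWORD` are hypothetical; the variables would be supplied by the shell or an untracked `.env` file.

```yaml
# Sketch only: the extension field name and variable names are assumptions.
x-healthcheck-defaults: &healthcheck-defaults
  interval: 10s
  timeout: 5s
  retries: 5

services:
  db:
    image: postgres:15-alpine
    environment:
      # Falls back to "user" when DB_USER is unset or empty
      POSTGRES_USER: ${DB_USER:-user}
      # Compose aborts with this message if DB_PASSWORD is unset or empty
      POSTGRES_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD must be set}
      POSTGRES_DB: mydb
    healthcheck:
      <<: *healthcheck-defaults   # merge the shared timing settings
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-user}"]

  redis:
    image: redis:7-alpine
    healthcheck:
      <<: *healthcheck-defaults
      test: ["CMD", "redis-cli", "ping"]
```

Running `docker compose config` prints the resolved file, which is a quick way to check that the interpolation and merge keys expand as intended.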

Complete GitHub Actions Workflow

# .github/workflows/full-pipeline.yml
name: Full CI/CD Pipeline

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '0 2 * * *'  # Daily at 02:00 UTC

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  quality-checks:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    
    - name: Setup Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.11'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements-dev.txt
    
    - name: Lint code
      run: |
        flake8 .
        black --check .
        isort --check-only .
    
    - name: Security scan
      run: |
        bandit -r .
        safety check
    
    - name: YAML validation
      run: |
        yamllint .
        kubeval --strict -d k8s/

  test:
    needs: quality-checks
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.9', '3.10', '3.11']
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    
    - name: Setup Python
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    
    - name: Install dependencies
      run: |
        pip install -r requirements.txt
        pip install -r requirements-test.txt
    
    - name: Run tests
      run: |
        pytest --cov=./ --cov-report=xml
    
    - name: Upload coverage
      uses: codecov/codecov-action@v3
      with:
        file: ./coverage.xml
        fail_ci_if_error: true

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v2
    
    - name: Log in to Container Registry
      uses: docker/login-action@v2
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    
    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v4
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
        tags: |
          type=sha,prefix=,suffix=-{{date 'YYYYMMDD'}}-{{sha}}
          type=ref,event=branch
          type=ref,event=pr
          type=semver,pattern={{version}}
          type=semver,pattern={{major}}.{{minor}}
    
    - name: Build and push
      uses: docker/build-push-action@v4
      with:
        context: .
        push: ${{ github.event_name != 'pull_request' }}
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max

  deploy-staging:
    needs: build-and-push
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    
    - name: Deploy to staging
      run: |
        echo "Deploying to staging..."
        kubectl config use-context staging
        kubectl apply -f k8s/staging/
        kubectl rollout status deployment/app-deployment -n staging

  deploy-production:
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    
    - name: Manual approval
      uses: trstringer/manual-approval@v1
      with:
        secret: ${{ github.token }}
        approvers: team-leads
        minimum-approvals: 1
    
    - name: Deploy to production
      run: |
        echo "Deploying to production..."
        kubectl config use-context production
        kubectl apply -f k8s/production/
        kubectl rollout status deployment/app-deployment -n production

Conclusion

YAML is a powerful, widely used format for modern software development, DevOps, and data engineering. Its readability, flexibility, and broad tooling support make it the default choice for configuration as code in cloud-native ecosystems.

Key Takeaways:

  • YAML's strength lies in human readability and its support for complex data structures
  • Use anchors and aliases for reusable configuration
  • Use tools such as yamllint and yq to safeguard quality
  • Keep secrets out of YAML files; use environment variables or a secret manager instead
  • Pick the right format for the job: YAML for configuration, JSON for APIs, HCL for infrastructure

Our Advice:

Start with YAML if you work with Kubernetes, CI/CD pipelines, or data engineering tools, or if you need complex configurations. The learning curve is gentle, and the benefits for team collaboration and maintainability are significant.

Keep learning and practicing: YAML's real power shows in advanced features such as templating, anchors, and multi-document files. Start with simple configurations and add complexity gradually.
