DataPartner365

Jouw partner voor datagedreven groei en inzichten

Data Modeling: Complete Gids voor Database Design

Laatst bijgewerkt: 20 december 2025
Leestijd: 35 minuten
Data Modeling, Database Design, ERD, Normalisatie, Data Architectuur

Leer data modeling van conceptueel naar fysiek model. Master ERD, normalisatie en database design best practices voor schaalbare data oplossingen.

Zoek je Data Modelers?

Vind ervaren Data Architects en Database Designers voor je data projecten

1. Wat is Data Modeling?

Data Modeling Definitie

Data Modeling is het proces van het creëren van een visuele representatie van een informatiesysteem om de data elementen en hun onderlinge relaties te communiceren. Het is de blauwdruk voor database design.

Visualisatie

Creëer duidelijke diagrammen van data structuren

Relaties

Definieer connecties tussen verschillende data entiteiten

Structuur

Ontwerp georganiseerde en efficiënte data opslag

Implementatie

Vertaal modellen naar werkende database schema's

Data Modeling Type Doel Doelgroep Output
Conceptueel Business requirements begrijpen Business stakeholders High-level entiteiten en relaties
Logisch Data structuren ontwerpen Data architects, analysts Gedetailleerd model met attributen
Fysiek Database implementatie Database administrators, developers SQL DDL scripts, indexes, constraints

Waarom Data Modeling belangrijk is

Data Quality

  • Voorkomt data inconsistencies
  • Zorgt voor data integriteit
  • Reduceert data redundancy

Performance

  • Optimaliseert query performance
  • Reduceert storage requirements
  • Verbetert scalability

Collaboration

  • Faciliteert communicatie
  • Documenteert data structuren
  • Ondersteunt knowledge transfer

Maintenance

  • Vereenvoudigt wijzigingen
  • Ondersteunt data governance
  • Faciliteert troubleshooting

2. Drie lagen van Data Modeling

Drie-tier Data Modeling Approach

Data modeling gebeurt typisch in drie fasen, elk met toenemend detailniveau:

  1. Conceptueel Model: Business requirements en high-level entiteiten
  2. Logisch Model: Gedetailleerde data structuren en relaties
  3. Fysiek Model: Database-specifieke implementatie details

Data Modeling Workflow

Data Modeling Process Flow

  1. Requirements Gathering: Verzamel business requirements
  2. Conceptual Design: Creëer high-level entiteit relatie diagram
  3. Logical Design: Definieer attributen, datatypes, constraints
  4. Normalization: Pas normalisatie regels toe (1NF-5NF)
  5. Physical Design: Vertaal naar database-specifiek schema
  6. Implementation: Creëer database met DDL scripts
  7. Validation: Test en valideer het model
  8. Maintenance: Onderhoud en optimaliseer het model

3. Conceptueel Data Model

Conceptueel Model Doel

Het Conceptueel Data Model is een high-level abstractie die de belangrijkste business concepten en hun relaties weergeeft, zonder technische details.

ERD (Entity Relationship Diagram) Basics

-- Conceptueel model voor e-commerce systeem
-- ENTITEITEN:
-- 1. Customer (Klant)
-- 2. Product (Product)
-- 3. Order (Bestelling)
-- 4. Category (Categorie)
-- 5. Supplier (Leverancier)

-- RELATIES:
-- 1. Customer PLACES Order (1:N)
-- 2. Order CONTAINS Product (M:N via OrderItem)
-- 3. Product BELONGS TO Category (N:1)
-- 4. Supplier SUPPLIES Product (1:N)
-- 5. Product HAS Review (1:N)
Conceptueel ERD Visualisatie
┌─────────────┐       ┌──────────┐       ┌──────────┐
│  Customer   │       │  Order   │       │  Product │
├─────────────┤       ├──────────┤       ├──────────┤
│ - ID        │1     N│ - ID     │N     M│ - ID     │
│ - Name      │◀─────▶│ - Date   │◀─────▶│ - Name   │
│ - Email     │       │ - Total  │       │ - Price  │
└─────────────┘       └──────────┘       └──────────┘
       │                      │                  │
       │                      │                  │
       ▼                      ▼                  ▼
┌─────────────┐       ┌──────────┐       ┌──────────┐
│  Address    │       │OrderItem │       │ Category │
├─────────────┤       ├──────────┤       ├──────────┤
│ - ID        │       │ - Qty    │       │ - ID     │
│ - Street    │       │ - Price  │       │ - Name   │
│ - City      │       └──────────┘       └──────────┘
└─────────────┘
            

Entity Definitions en Business Rules

-- ENTITEIT DEFINITIES

-- CUSTOMER (Klant)
-- Beschrijving: Personen of organisaties die producten kopen
-- Business Rules:
-- 1. Elke klant moet een uniek email adres hebben
-- 2. Klanten kunnen meerdere adressen hebben (leveradres, factuuradres)
-- 3. Klanten kunnen meerdere betalingsmethoden opslaan
-- 4. Klanten kunnen reviews schrijven voor producten

-- PRODUCT (Product)
-- Beschrijving: Items die verkocht worden
-- Business Rules:
-- 1. Elk product heeft een unieke SKU (Stock Keeping Unit)
-- 2. Producten behoren tot één primaire categorie
-- 3. Producten kunnen meerdere leveranciers hebben
-- 4. Productvoorraad kan niet negatief zijn

-- ORDER (Bestelling)
-- Beschrijving: Transactie van klant naar winkel
-- Business Rules:
-- 1. Elke order heeft minimaal één product
-- 2. Orders hebben een status lifecycle (Pending → Processing → Shipped → Delivered)
-- 3. Orders kunnen geannuleerd worden binnen 24 uur
-- 4. Orders boven €100 krijgen gratis verzending

-- CATEGORY (Categorie)
-- Beschrijving: Hiërarchische product classificatie
-- Business Rules:
-- 1. Categorieën kunnen subcategorieën hebben
-- 2. Producten kunnen tot meerdere categorieën behoren
-- 3. Categorieën hebben een hiërarchie (bijv. Electronics → Computers → Laptops)

-- SUPPLIER (Leverancier)
-- Beschrijving: Bedrijven die producten leveren
-- Business Rules:
-- 1. Leveranciers hebben een valutabeoordeling (A, B, C, D)
-- 2. Leveranciers kunnen exclusieve producten hebben
-- 3. Levertijden variëren per leverancier

Team nodig voor data architectuur?

Vind ervaren Data Modelers en Database Architects gespecialiseerd in data design

4. Logisch Data Model

Logisch Model Doel

Het Logisch Data Model voegt detail toe aan het conceptuele model door attributen, datatypes, keys en relatie cardinaliteiten te specificeren, maar blijft database-agnostisch.

Entity Definitions met Attributen

-- LOGISCH DATA MODEL: ENTITEITEN MET ATTRIBUTEN

-- ENTITY: Customer
-- Primary Key: customer_id
-- Description: Personen die producten kopen
Attributes:
- customer_id: INTEGER (PK, Auto-increment)
- first_name: VARCHAR(50) NOT NULL
- last_name: VARCHAR(50) NOT NULL
- email: VARCHAR(100) UNIQUE NOT NULL
- phone: VARCHAR(20)
- date_of_birth: DATE
- registration_date: DATETIME DEFAULT CURRENT_TIMESTAMP
- status: ENUM('active', 'inactive', 'suspended') DEFAULT 'active'
- loyalty_points: INTEGER DEFAULT 0
- preferred_language: VARCHAR(10) DEFAULT 'nl'
- tax_id: VARCHAR(20)  -- Voor B2B klanten
- company_name: VARCHAR(100)  -- Voor B2B klanten

-- ENTITY: Product
-- Primary Key: product_id
-- Alternate Key: sku (Stock Keeping Unit)
Attributes:
- product_id: INTEGER (PK, Auto-increment)
- sku: VARCHAR(50) UNIQUE NOT NULL
- product_name: VARCHAR(200) NOT NULL
- description: TEXT
- short_description: VARCHAR(500)
- category_id: INTEGER (FK to Category)
- supplier_id: INTEGER (FK to Supplier)
- unit_price: DECIMAL(10,2) NOT NULL
- cost_price: DECIMAL(10,2)
- weight_kg: DECIMAL(6,3)
- dimensions: VARCHAR(50)  -- Format: "LxBxH" in cm
- color: VARCHAR(30)
- size: VARCHAR(20)
- material: VARCHAR(50)
- warranty_months: INTEGER
- is_active: BOOLEAN DEFAULT TRUE
- is_digital: BOOLEAN DEFAULT FALSE  -- Voor digitale producten
- download_url: VARCHAR(500)  -- Voor digitale producten
- created_at: DATETIME DEFAULT CURRENT_TIMESTAMP
- updated_at: DATETIME ON UPDATE CURRENT_TIMESTAMP

-- ENTITY: Order
-- Primary Key: order_id
Attributes:
- order_id: INTEGER (PK, Auto-increment)
- order_number: VARCHAR(20) UNIQUE NOT NULL  -- Human-readable: ORD-2025-001
- customer_id: INTEGER (FK to Customer) NOT NULL
- order_date: DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL
- required_date: DATE
- shipped_date: DATETIME
- order_status: ENUM('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded') DEFAULT 'pending'
- shipping_address_id: INTEGER (FK to Address)
- billing_address_id: INTEGER (FK to Address)
- shipping_method: VARCHAR(50)
- shipping_cost: DECIMAL(10,2) DEFAULT 0
- tax_amount: DECIMAL(10,2) DEFAULT 0
- discount_amount: DECIMAL(10,2) DEFAULT 0
- total_amount: DECIMAL(10,2) NOT NULL
- payment_method: VARCHAR(50)
- payment_status: ENUM('pending', 'paid', 'failed', 'refunded') DEFAULT 'pending'
- payment_transaction_id: VARCHAR(100)
- notes: TEXT
- source: ENUM('website', 'mobile_app', 'phone', 'store') DEFAULT 'website'
- ip_address: VARCHAR(45)  -- Voor IPv6 support
- user_agent: VARCHAR(500)

Relatie Definitions en Cardinaliteiten

-- RELATIE DEFINITIES MET CARDINALITEITEN

-- 1. Customer - Order Relatie
-- Relatie: PLACES
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een klant kan meerdere orders plaatsen
-- • Een order behoort tot exact één klant
-- • Wanneer een klant wordt verwijderd, wat gebeurt er met orders?
--   Opties: CASCADE, SET NULL, RESTRICT
-- • Recommendation: SET NULL of archival strategy

-- 2. Order - Product Relatie
-- Relatie: CONTAINS
-- Cardinaliteit: Many-to-Many (M:N)
-- Resolutie: Via intersection entity "OrderItem"
-- Business Rules:
-- • Een order kan meerdere producten bevatten
-- • Een product kan in meerdere orders voorkomen
-- • OrderItem slaat specifieke details per product in order op
-- • Quantity moet >= 1 zijn
-- • Unit_price kan verschillen van product.unit_price (promoties)

-- 3. Product - Category Relatie
-- Relatie: BELONGS_TO
-- Cardinaliteit: Many-to-One (N:1)
-- Business Rules:
-- • Een product behoort tot één primaire categorie
-- • Een categorie kan meerdere producten bevatten
-- • Voor multi-category support: gebruik ProductCategory junction table

-- 4. Supplier - Product Relatie
-- Relatie: SUPPLIES
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een leverancier kan meerdere producten leveren
-- • Een product kan door meerdere leveranciers geleverd worden
-- • Voor multi-supplier: gebruik ProductSupplier junction table

-- 5. Customer - Address Relatie
-- Relatie: HAS
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een klant kan meerdere adressen hebben
-- • Een adres behoort tot één klant
-- • Address types: shipping, billing, home, work
-- • Primary flag: één adres kan als primair gemarkeerd worden

-- 6. Product - Review Relatie
-- Relatie: HAS
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een product kan meerdere reviews hebben
-- • Een review behoort tot één product
-- • Een review wordt geschreven door één klant
-- • Rating moet tussen 1 en 5 zijn

Junction/Intersection Entities

-- JUNCTION ENTITIES VOOR MANY-TO-MANY RELATIES

-- ENTITY: OrderItem (Order regel)
-- Composite Primary Key: (order_id, product_id)
-- Purpose: Resolveert M:N relatie tussen Order en Product
Attributes:
- order_item_id: INTEGER (PK, Auto-increment)  -- Surrogate key voor gemak
- order_id: INTEGER (FK to Order) NOT NULL
- product_id: INTEGER (FK to Product) NOT NULL
- quantity: INTEGER NOT NULL CHECK (quantity > 0)
- unit_price: DECIMAL(10,2) NOT NULL  -- Prijs op moment van bestelling
- discount_percentage: DECIMAL(5,2) DEFAULT 0
- tax_rate: DECIMAL(5,2) DEFAULT 21.0  -- BTW percentage
- notes: VARCHAR(200)  -- Bijv. "Gift wrap requested"
- created_at: DATETIME DEFAULT CURRENT_TIMESTAMP

-- Calculated columns (in query of application layer):
-- • line_total = quantity * unit_price * (1 - discount_percentage/100)
-- • line_tax = line_total * (tax_rate/100)

-- ENTITY: ProductCategory
-- Composite Primary Key: (product_id, category_id)
-- Purpose: Ondersteunt multi-category producten
Attributes:
- product_id: INTEGER (FK to Product) NOT NULL
- category_id: INTEGER (FK to Category) NOT NULL
- is_primary: BOOLEAN DEFAULT FALSE  -- Primaire categorie
- display_order: INTEGER DEFAULT 0  -- Sortering in category listings
- assigned_date: DATE DEFAULT CURRENT_DATE

-- ENTITY: ProductSupplier
-- Composite Primary Key: (product_id, supplier_id)
-- Purpose: Ondersteunt multi-supplier producten
Attributes:
- product_id: INTEGER (FK to Product) NOT NULL
- supplier_id: INTEGER (FK to Supplier) NOT NULL
- is_primary_supplier: BOOLEAN DEFAULT FALSE
- purchase_price: DECIMAL(10,2)  -- Inkoopprijs van deze leverancier
- lead_time_days: INTEGER  -- Levertijd in dagen
- minimum_order_quantity: INTEGER
- is_active: BOOLEAN DEFAULT TRUE
- last_order_date: DATE
- average_rating: DECIMAL(3,2)  -- Leverancier rating voor dit product

-- ENTITY: CustomerAddress
-- Purpose: Beheert meerdere adressen per klant
Attributes:
- address_id: INTEGER (PK, Auto-increment)
- customer_id: INTEGER (FK to Customer) NOT NULL
- address_type: ENUM('shipping', 'billing', 'home', 'work', 'other') NOT NULL
- is_primary: BOOLEAN DEFAULT FALSE
- contact_name: VARCHAR(100)  -- Naam voor dit adres
- address_line1: VARCHAR(200) NOT NULL
- address_line2: VARCHAR(200)
- city: VARCHAR(100) NOT NULL
- state_province: VARCHAR(100)
- postal_code: VARCHAR(20) NOT NULL
- country: VARCHAR(2) NOT NULL  -- ISO country code
- phone: VARCHAR(20)
- instructions: VARCHAR(500)  -- Leverinstructies
- is_active: BOOLEAN DEFAULT TRUE

5. Fysiek Data Model

Fysiek Model Doel

Het Fysiek Data Model is de database-specifieke implementatie van het logische model, inclusief datatypes, indexes, partitions, storage parameters en andere performance optimalisaties.

Complete DDL Script voor PostgreSQL

-- POSTGRESQL PHYSICAL DATA MODEL
-- E-commerce Database Schema

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm";  -- Voor text search

-- ========== CORE TABLES ==========

-- Table: customers
CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    customer_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    phone VARCHAR(20),
    date_of_birth DATE,
    registration_date TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    status VARCHAR(20) DEFAULT 'active' CHECK (status IN ('active', 'inactive', 'suspended')),
    loyalty_points INTEGER DEFAULT 0 CHECK (loyalty_points >= 0),
    preferred_language VARCHAR(10) DEFAULT 'nl',
    tax_id VARCHAR(20),
    company_name VARCHAR(100),
    metadata JSONB,  -- Voor flexibele extra velden
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    
    -- Composite constraint voor B2B klanten
    CONSTRAINT b2b_customer_check CHECK (
        (company_name IS NULL AND tax_id IS NULL) 
        OR (company_name IS NOT NULL AND tax_id IS NOT NULL)
    )
);

-- Table: products
CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
    sku VARCHAR(50) UNIQUE NOT NULL,
    product_name VARCHAR(200) NOT NULL,
    description TEXT,
    short_description VARCHAR(500),
    category_id INTEGER,  -- FK wordt later toegevoegd
    supplier_id INTEGER,  -- FK wordt later toegevoegd
    unit_price DECIMAL(10, 2) NOT NULL CHECK (unit_price >= 0),
    cost_price DECIMAL(10, 2) CHECK (cost_price >= 0),
    weight_kg DECIMAL(6, 3),
    dimensions VARCHAR(50),
    color VARCHAR(30),
    size VARCHAR(20),
    material VARCHAR(50),
    warranty_months INTEGER CHECK (warranty_months >= 0),
    is_active BOOLEAN DEFAULT TRUE,
    is_digital BOOLEAN DEFAULT FALSE,
    download_url VARCHAR(500),
    stock_quantity INTEGER DEFAULT 0 CHECK (stock_quantity >= 0),
    low_stock_threshold INTEGER DEFAULT 10,
    tags VARCHAR[],  -- Array van tags voor search
    specifications JSONB,  -- Flexibele product specificaties
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    
    -- Check voor digitale producten
    CONSTRAINT digital_product_check CHECK (
        NOT is_digital OR (is_digital AND download_url IS NOT NULL)
    )
);

-- Table: orders
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    order_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
    order_number VARCHAR(20) UNIQUE NOT NULL,
    customer_id INTEGER NOT NULL,
    order_date TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL,
    required_date DATE,
    shipped_date TIMESTAMPTZ,
    order_status VARCHAR(20) DEFAULT 'pending' 
        CHECK (order_status IN ('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded')),
    shipping_address_id INTEGER,
    billing_address_id INTEGER,
    shipping_method VARCHAR(50),
    shipping_cost DECIMAL(10, 2) DEFAULT 0 CHECK (shipping_cost >= 0),
    tax_amount DECIMAL(10, 2) DEFAULT 0 CHECK (tax_amount >= 0),
    discount_amount DECIMAL(10, 2) DEFAULT 0 CHECK (discount_amount >= 0),
    total_amount DECIMAL(10, 2) NOT NULL CHECK (total_amount >= 0),
    payment_method VARCHAR(50),
    payment_status VARCHAR(20) DEFAULT 'pending'
        CHECK (payment_status IN ('pending', 'paid', 'failed', 'refunded')),
    payment_transaction_id VARCHAR(100),
    notes TEXT,
    source VARCHAR(20) DEFAULT 'website'
        CHECK (source IN ('website', 'mobile_app', 'phone', 'store')),
    ip_address INET,  -- PostgreSQL specifiek type voor IP addresses
    user_agent VARCHAR(500),
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    
    -- Calculated column constraint
    CONSTRAINT order_amounts_check CHECK (
        total_amount >= shipping_cost + tax_amount - discount_amount
    )
);

Indexes en Performance Optimalisatie

-- ========== INDEXES ==========

-- Primary key indexes worden automatisch aangemaakt
-- Hier voegen we extra indexes toe voor performance

-- Customers table indexes
CREATE INDEX idx_customers_email ON customers(email);
CREATE INDEX idx_customers_name ON customers(last_name, first_name);
CREATE INDEX idx_customers_status ON customers(status) WHERE status = 'active';
CREATE INDEX idx_customers_registration_date ON customers(registration_date);
CREATE INDEX idx_customers_company ON customers(company_name) WHERE company_name IS NOT NULL;

-- Products table indexes
CREATE INDEX idx_products_sku ON products(sku);
CREATE INDEX idx_products_name ON products(product_name);
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_products_price ON products(unit_price);
CREATE INDEX idx_products_active ON products(is_active) WHERE is_active = TRUE;
CREATE INDEX idx_products_stock ON products(stock_quantity) WHERE stock_quantity < low_stock_threshold;
CREATE INDEX idx_products_created ON products(created_at);

-- Full text search index voor product search
CREATE INDEX idx_products_search ON products 
    USING gin(to_tsvector('dutch', product_name || ' ' || coalesce(description, '')));

-- Trigram index voor fuzzy search
CREATE INDEX idx_products_name_trgm ON products USING gin(product_name gin_trgm_ops);

-- Orders table indexes
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_orders_status ON orders(order_status);
CREATE INDEX idx_orders_status_date ON orders(order_status, order_date) 
    WHERE order_status IN ('pending', 'processing');
CREATE INDEX idx_orders_payment_status ON orders(payment_status);
CREATE INDEX idx_orders_total ON orders(total_amount);
CREATE INDEX idx_orders_shipped ON orders(shipped_date) WHERE shipped_date IS NOT NULL;

-- Composite index voor vaak gebruikte queries
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date DESC);

-- OrderItems table indexes
CREATE INDEX idx_orderitems_order ON order_items(order_id);
CREATE INDEX idx_orderitems_product ON order_items(product_id);
CREATE INDEX idx_orderitems_order_product ON order_items(order_id, product_id);

-- ========== PARTITIONING ==========
-- Voor grote tables, partitioneren op datum

-- Order history table partitioning (PostgreSQL 10+)
CREATE TABLE orders_partitioned (
    LIKE orders INCLUDING ALL
) PARTITION BY RANGE (order_date);

-- Create partitions per quarter
CREATE TABLE orders_q1_2025 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');

CREATE TABLE orders_q2_2025 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');

-- ========== FOREIGN KEYS ==========

-- Foreign key constraints
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_customer 
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    ON DELETE RESTRICT;

ALTER TABLE order_items
    ADD CONSTRAINT fk_orderitems_order
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
    ON DELETE CASCADE,
    ADD CONSTRAINT fk_orderitems_product
    FOREIGN KEY (product_id) REFERENCES products(product_id)
    ON DELETE RESTRICT;

ALTER TABLE products
    ADD CONSTRAINT fk_products_category
    FOREIGN KEY (category_id) REFERENCES categories(category_id)
    ON DELETE SET NULL,
    ADD CONSTRAINT fk_products_supplier
    FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id)
    ON DELETE SET NULL;

Views en Materialized Views

-- ========== VIEWS ==========

-- View voor actieve producten met category info
CREATE VIEW vw_active_products AS
SELECT 
    p.product_id,
    p.sku,
    p.product_name,
    p.unit_price,
    p.stock_quantity,
    c.category_name,
    s.supplier_name,
    CASE 
        WHEN p.stock_quantity = 0 THEN 'Out of Stock'
        WHEN p.stock_quantity <= p.low_stock_threshold THEN 'Low Stock'
        ELSE 'In Stock'
    END AS stock_status,
    (
        SELECT AVG(rating) 
        FROM reviews r 
        WHERE r.product_id = p.product_id 
        AND r.is_approved = TRUE
    ) AS average_rating,
    p.created_at
FROM products p
LEFT JOIN categories c ON p.category_id = c.category_id
LEFT JOIN suppliers s ON p.supplier_id = s.supplier_id
WHERE p.is_active = TRUE
ORDER BY p.product_name;

-- View voor customer orders summary
CREATE VIEW vw_customer_orders_summary AS
SELECT 
    c.customer_id,
    CONCAT(c.first_name, ' ', c.last_name) AS customer_name,
    c.email,
    c.registration_date,
    COUNT(o.order_id) AS total_orders,
    SUM(o.total_amount) AS total_spent,
    MAX(o.order_date) AS last_order_date,
    AVG(o.total_amount) AS avg_order_value,
    CASE 
        WHEN COUNT(o.order_id) >= 10 THEN 'VIP'
        WHEN COUNT(o.order_id) >= 5 THEN 'Regular'
        ELSE 'New'
    END AS customer_segment
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_status IN ('delivered', 'shipped', 'processing')
GROUP BY c.customer_id, c.first_name, c.last_name, c.email, c.registration_date;

-- Materialized View voor sales reporting (voor performance)
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT 
    DATE(o.order_date) AS sale_date,
    COUNT(DISTINCT o.order_id) AS order_count,
    COUNT(DISTINCT o.customer_id) AS customer_count,
    SUM(o.total_amount) AS total_revenue,
    SUM(oi.quantity) AS total_units_sold,
    AVG(o.total_amount) AS avg_order_value,
    SUM(o.shipping_cost) AS total_shipping,
    SUM(o.tax_amount) AS total_tax,
    SUM(o.discount_amount) AS total_discount
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '90 days'
    AND o.order_status = 'delivered'
GROUP BY DATE(o.order_date)
ORDER BY sale_date DESC;

-- Index voor de materialized view
CREATE UNIQUE INDEX idx_mv_daily_sales_date ON mv_daily_sales(sale_date);

-- Function om materialized view te refreshen
CREATE OR REPLACE FUNCTION refresh_daily_sales()
RETURNS VOID AS $$
BEGIN
    REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales;
END;
$$ LANGUAGE plpgsql;

Klaar voor data modeling projecten?

Vind de juiste experts of plaats je Data Modeling vacature

6. Database Normalisatie (1NF t/m 5NF)

Wat is Database Normalisatie?

Database Normalisatie is een proces om een database structuur te organiseren volgens een reeks regels die data redundancy verminderen en data integriteit verbeteren.

Eerste Normaalvorm (1NF) - Atomic Values

-- ❌ VOOR 1NF: Niet-genormaliseerde tabel
CREATE TABLE orders_non_normalized (
    order_id INT,
    customer_name VARCHAR(100),
    products VARCHAR(500),  -- Meerdere producten in één kolom: "Product1,Product2"
    quantities VARCHAR(100),  -- "2,1,3"
    prices VARCHAR(100)     -- "10.99,24.50,5.99"
);

-- Problemen:
-- 1. Multi-value kolommen
-- 2. Niet-atomische waarden
-- 3. Moeilijk te queryen
-- 4. Data integriteit issues

-- ✅ NA 1NF: Genormaliseerd naar atomic values
CREATE TABLE orders_1nf (
    order_id INT,
    customer_id INT,  -- Reference naar customers tabel
    order_date DATE
);

CREATE TABLE order_items_1nf (
    order_item_id INT,
    order_id INT,
    product_id INT,  -- Reference naar products tabel
    quantity INT,
    unit_price DECIMAL(10, 2)
);

-- 1NF Regels:
-- 1. Elke kolom bevat atomic (ondeelbare) waarden
-- 2. Elke kolom bevat waarden van hetzelfde type
-- 3. Elke rij is uniek (geen duplicate rows)
-- 4. De volgorde van rows en columns is niet significant

Tweede Normaalvorm (2NF) - Remove Partial Dependencies

-- ❌ VOOR 2NF: Tabel met partial dependencies
CREATE TABLE order_details_2nf_violation (
    order_id INT,
    product_id INT,
    product_name VARCHAR(100),  -- Dependent alleen op product_id
    category VARCHAR(50),         -- Dependent alleen op product_id
    quantity INT,
    unit_price DECIMAL(10, 2),
    order_date DATE,            -- Dependent alleen op order_id
    customer_id INT,            -- Dependent alleen op order_id
    PRIMARY KEY (order_id, product_id)
);

-- Problemen:
-- 1. Partial dependency: product_name hangt alleen van product_id af
-- 2. Partial dependency: order_date hangt alleen van order_id af
-- 3. Update anomalies: Als productnaam wijzigt, moeten alle orders geüpdatet worden
-- 4. Insert anomalies: Kan geen product toevoegen zonder order
-- 5. Delete anomalies: Als laatste order voor product verwijderd wordt, gaat productinfo verloren

-- ✅ NA 2NF: Verwijder partial dependencies
CREATE TABLE orders_2nf (
    order_id INT PRIMARY KEY,
    order_date DATE,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CREATE TABLE products_2nf (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100),
    category VARCHAR(50),
    standard_price DECIMAL(10, 2)
);

CREATE TABLE order_items_2nf (
    order_id INT,
    product_id INT,
    quantity INT,
    unit_price DECIMAL(10, 2),  -- Prijs op moment van bestelling
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES orders_2nf(order_id),
    FOREIGN KEY (product_id) REFERENCES products_2nf(product_id)
);

-- 2NF Regels:
-- 1. Voldoet aan 1NF
-- 2. Alle non-key attributen zijn fully functional dependent op de primary key
-- 3. Geen partial dependencies (alle attributen moeten van de hele PK afhangen)

Derde Normaalvorm (3NF) - Remove Transitive Dependencies

-- ❌ VOOR 3NF: Tabel met transitive dependencies
CREATE TABLE orders_3nf_violation (
    order_id INT PRIMARY KEY,
    customer_id INT,
    customer_name VARCHAR(100),  -- Transitive: hangt af van customer_id
    customer_city VARCHAR(50),     -- Transitive: hangt af van customer_id
    customer_country VARCHAR(50),  -- Transitive: hangt af van customer_city
    order_date DATE,
    total_amount DECIMAL(10, 2)
);

-- Transitive dependencies:
-- order_id → customer_id → customer_name
-- order_id → customer_id → customer_city → customer_country

-- Problemen:
-- 1. Update anomaly: Customer naam wijzigen in alle orders
-- 2. Insert anomaly: Kan niet city/country zonder customer
-- 3. Delete anomaly: Verwijder customer, verlies city/country info
-- 4. Data inconsistency: Zelfde city kan verschillende country hebben

-- ✅ NA 3NF: Verwijder transitive dependencies
CREATE TABLE customers_3nf (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    city_id INT,
    FOREIGN KEY (city_id) REFERENCES cities(city_id)
);

CREATE TABLE cities_3nf (
    city_id INT PRIMARY KEY,
    city_name VARCHAR(50),
    country_id INT,
    FOREIGN KEY (country_id) REFERENCES countries(country_id)
);

CREATE TABLE countries_3nf (
    country_id INT PRIMARY KEY,
    country_name VARCHAR(50),
    country_code VARCHAR(2)
);

CREATE TABLE orders_3nf (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount DECIMAL(10, 2),
    FOREIGN KEY (customer_id) REFERENCES customers_3nf(customer_id)
);

-- 3NF Regels:
-- 1. Voldoet aan 2NF
-- 2. Geen transitive dependencies
-- 3. Alle non-key attributen zijn direct dependent op primary key
-- "A relation is in 3NF if every non-prime attribute is non-transitively dependent on every candidate key"

Boyce-Codd Normaalvorm (BCNF)

-- ❌ BCNF VIOLATION: Overlapping candidate keys
CREATE TABLE course_enrollments (
    student_id INT,
    course_id INT,
    instructor_id INT,
    classroom VARCHAR(20),
    PRIMARY KEY (student_id, course_id)
);

-- Assumptions:
-- 1. Each course has exactly one instructor
-- 2. Each instructor teaches only one course
-- 3. Each course is in one classroom

-- Functional dependencies:
-- (student_id, course_id) → instructor_id, classroom
-- course_id → instructor_id, classroom
-- instructor_id → course_id

-- Problemen:
-- 1. course_id → instructor_id (non-prime → non-prime)
-- 2. But course_id is not a superkey
-- 3. Violates BCNF: For every FD X → Y, X must be superkey

-- ✅ BCNF NORMALIZATION
CREATE TABLE courses_bcnf (
    course_id INT PRIMARY KEY,
    instructor_id INT UNIQUE,  -- One instructor per course
    classroom VARCHAR(20)
);

CREATE TABLE enrollments_bcnf (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (course_id) REFERENCES courses_bcnf(course_id)
);

-- BCNF Regels:
-- 1. Voldoet aan 3NF
-- 2. Voor elke functionele dependency X → Y, moet X een superkey zijn
-- 3. Strenger dan 3NF: elimineert alle anomaly-causing functional dependencies
-- 4. In BCNF: elke determinant is een candidate key

Normalisatie Samenvatting

Normaalvorm Doel Regel Voordelen
1NF Atomic values Geen multi-value of composite attributen Eenvoudiger querying, data integriteit
2NF Remove partial dependencies Alle non-key attributen fully dependent op PK Vermindert update anomalies
3NF Remove transitive dependencies Non-key attributen alleen dependent op PK Vermindert data redundancy
BCNF Strenger dan 3NF Elke determinant is candidate key Elimineert alle anomaly-causing FDs
4NF Remove multi-valued dependencies Geen non-trivial MVDs Handelt multi-value attributen
5NF Remove join dependencies Elke join dependency implied by candidate keys Complete decompositie zonder data loss

7. Wanneer denormaliseren?

Denormalisatie Trade-offs

Denormalisatie is het bewust introduceren van redundancy in een genormaliseerd database design om query performance te verbeteren, vaak ten koste van data integriteit en update efficiency.

Wanneer Denormaliseren Toepassen

Read Performance

  • Read-heavy applicaties
  • Complexe joins vermijden
  • Frequent gelezen berekende waarden
  • Data warehouse scenario's

Aggregatie

  • Pre-calculated aggregaties
  • Summary tables
  • Reporting requirements
  • Real-time dashboards

Synchronisatie

  • Slow-changing dimensions
  • Historical data preserveren
  • Audit trails
  • Versioning requirements

Hardware Limits

  • Limited join capabilities
  • Distributed systems
  • NoSQL databases
  • Memory constraints

Denormalisatie Technieken

-- TECHNIQUE 1: Derived Columns (Berekenende kolommen)
-- Voeg een total_price kolom toe om berekeningen te vermijden

ALTER TABLE order_items
ADD COLUMN line_total DECIMAL(10, 2) GENERATED ALWAYS AS 
    (quantity * unit_price * (1 - discount_percentage/100)) STORED;

ALTER TABLE orders
ADD COLUMN item_count INTEGER GENERATED ALWAYS AS 
    (SELECT COUNT(*) FROM order_items WHERE order_id = orders.order_id) STORED;

-- TECHNIQUE 2: Flattened Hierarchies (Afgeplatte hiërarchieën)
-- Sla categorie pad op in product tabel

ALTER TABLE products
ADD COLUMN category_path VARCHAR(200);  -- Bijv: "Electronics/Computers/Laptops"

-- Update trigger om path bij te werken
CREATE OR REPLACE FUNCTION update_category_path()
RETURNS TRIGGER AS $$
BEGIN
    WITH RECURSIVE category_tree AS (
        SELECT category_id, category_name, parent_category_id
        FROM categories
        WHERE category_id = NEW.category_id
        UNION ALL
        SELECT c.category_id, c.category_name, c.parent_category_id
        FROM categories c
        INNER JOIN category_tree ct ON c.category_id = ct.parent_category_id
    )
    SELECT string_agg(category_name, '/' ORDER BY category_id)
    INTO NEW.category_path
    FROM category_tree;
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- TECHNIQUE 3: Summary Tables (Samenvattings tabellen)
CREATE TABLE daily_sales_summary (
    sale_date DATE PRIMARY KEY,
    order_count INTEGER,
    total_revenue DECIMAL(15, 2),
    customer_count INTEGER,
    avg_order_value DECIMAL(10, 2),
    top_product_id INTEGER,
    top_product_qty INTEGER,
    last_updated TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

-- Batch job om summary bij te werken
INSERT INTO daily_sales_summary
SELECT 
    DATE(o.order_date) AS sale_date,
    COUNT(DISTINCT o.order_id) AS order_count,
    SUM(o.total_amount) AS total_revenue,
    COUNT(DISTINCT o.customer_id) AS customer_count,
    AVG(o.total_amount) AS avg_order_value,
    (
        SELECT product_id 
        FROM order_items oi
        WHERE oi.order_id = o.order_id
        GROUP BY product_id
        ORDER BY SUM(quantity) DESC
        LIMIT 1
    ) AS top_product_id,
    CURRENT_TIMESTAMP
FROM orders o
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY DATE(o.order_date)
ON CONFLICT (sale_date) DO UPDATE SET
    order_count = EXCLUDED.order_count,
    total_revenue = EXCLUDED.total_revenue,
    customer_count = EXCLUDED.customer_count,
    avg_order_value = EXCLUDED.avg_order_value,
    last_updated = CURRENT_TIMESTAMP;

8. Data Warehouse Modeling

Dimensional Modeling Principles

Dimensional Modeling is een design techniek voor data warehouses die data structureert in feiten (meetbare gebeurtenissen) en dimensies (beschrijvende attributen) voor optimale query performance en gebruiksgemak.

Star Schema Design

-- STAR SCHEMA: Sales Data Warehouse

-- FACT TABLE: Sales Facts
CREATE TABLE fact_sales (
    sales_key BIGSERIAL PRIMARY KEY,
    date_key INTEGER NOT NULL,
    product_key INTEGER NOT NULL,
    customer_key INTEGER NOT NULL,
    store_key INTEGER NOT NULL,
    promotion_key INTEGER,
    
    -- Measures (Feiten)
    quantity_sold INTEGER NOT NULL,
    unit_price DECIMAL(10, 2) NOT NULL,
    sales_amount DECIMAL(15, 2) NOT NULL,
    cost_amount DECIMAL(15, 2) NOT NULL,
    discount_amount DECIMAL(10, 2) DEFAULT 0,
    tax_amount DECIMAL(10, 2) NOT NULL,
    
    -- Calculated measures
    gross_profit DECIMAL(15, 2) GENERATED ALWAYS AS 
        (sales_amount - cost_amount - discount_amount) STORED,
    profit_margin DECIMAL(5, 2) GENERATED ALWAYS AS 
        CASE WHEN sales_amount > 0 
            THEN ((sales_amount - cost_amount - discount_amount) / sales_amount) * 100
            ELSE 0
        END STORED,
    
    -- Foreign keys
    FOREIGN KEY (date_key) REFERENCES dim_date(date_key),
    FOREIGN KEY (product_key) REFERENCES dim_product(product_key),
    FOREIGN KEY (customer_key) REFERENCES dim_customer(customer_key),
    FOREIGN KEY (store_key) REFERENCES dim_store(store_key),
    FOREIGN KEY (promotion_key) REFERENCES dim_promotion(promotion_key)
);

-- DIMENSION TABLE: Date Dimension
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,  -- YYYYMMDD format
    full_date DATE NOT NULL UNIQUE,
    
    -- Date attributes
    day_of_week INTEGER NOT NULL,  -- 1=Monday, 7=Sunday
    day_name VARCHAR(10) NOT NULL,      -- Monday, Tuesday...
    day_of_month INTEGER NOT NULL,
    day_of_year INTEGER NOT NULL,
    week_of_year INTEGER NOT NULL,
    month_number INTEGER NOT NULL,
    month_name VARCHAR(10) NOT NULL,
    quarter_number INTEGER NOT NULL,
    quarter_name CHAR(2) NOT NULL,      -- Q1, Q2, Q3, Q4
    year_number INTEGER NOT NULL,
    
    -- Flags
    is_weekend BOOLEAN NOT NULL,
    is_holiday BOOLEAN NOT NULL,
    is_business_day BOOLEAN NOT NULL,
    
    -- Fiscal calendar (indien afwijkend)
    fiscal_year INTEGER,
    fiscal_quarter INTEGER,
    fiscal_month INTEGER,
    
    -- Holiday info
    holiday_name VARCHAR(50),
    holiday_type VARCHAR(20)
);

-- DIMENSION TABLE: Product Dimension
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,  -- Business key
    sku VARCHAR(50) NOT NULL,
    product_name VARCHAR(200) NOT NULL,
    description TEXT,
    
    -- Product hierarchy
    category_key INTEGER,
    category_name VARCHAR(100) NOT NULL,
    subcategory_key INTEGER,
    subcategory_name VARCHAR(100),
    department_key INTEGER,
    department_name VARCHAR(100),
    
    -- Product attributes
    brand VARCHAR(50),
    color VARCHAR(30),
    size VARCHAR(20),
    weight_kg DECIMAL(6, 3),
    unit_price DECIMAL(10, 2),
    cost_price DECIMAL(10, 2),
    
    -- SCD Type 2 columns
    effective_date DATE NOT NULL,
    expiration_date DATE,
    is_current BOOLEAN NOT NULL DEFAULT TRUE,
    
    -- Audit columns
    created_timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
    updated_timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

-- DIMENSION TABLE: Customer Dimension
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,  -- Business key
    customer_type VARCHAR(20) NOT NULL,  -- B2C, B2B
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    full_name VARCHAR(101) GENERATED ALWAYS AS 
        (first_name || ' ' || last_name) STORED,
    email VARCHAR(100),
    
    -- Demographics
    birth_date DATE,
    age INTEGER,
    gender VARCHAR(10),
    
    -- Geography
    city VARCHAR(100),
    state_province VARCHAR(100),
    country VARCHAR(50),
    postal_code VARCHAR(20),
    region VARCHAR(50),
    
    -- Customer segmentation
    customer_segment VARCHAR(50),  -- Gold, Silver, Bronze
    lifetime_value DECIMAL(15, 2),
    acquisition_channel VARCHAR(50),
    acquisition_date DATE,
    
    -- B2B specific
    company_name VARCHAR(100),
    industry VARCHAR(50),
    employee_count_range VARCHAR(20),
    
    -- SCD Type 2
    effective_date DATE NOT NULL,
    expiration_date DATE,
    is_current BOOLEAN NOT NULL DEFAULT TRUE
);

9. NoSQL Data Modeling

NoSQL Modeling Principles

NoSQL Data Modeling volgt verschillende principes dan relationele modellen, met focus op query patterns, denormalisatie, en applicatie-specifieke structuren.

Document Database Modeling (MongoDB)

// MONGODB DOCUMENT MODELING
// Denormalized approach voor read performance

// Collection: customers
{
    "_id": ObjectId("507f1f77bcf86cd799439011"),
    "customer_id": "CUST-1001",
    "name": {
        "first": "John",
        "last": "Doe"
    },
    "email": "john.doe@example.com",
    "phone": "+31612345678",
    "addresses": [
        {
            "type": "shipping",
            "street": "Main Street 123",
            "city": "Amsterdam",
            "postal_code": "1011AB",
            "country": "Netherlands",
            "is_primary": true
        },
        {
            "type": "billing",
            "street": "Business Ave 456",
            "city": "Rotterdam",
            "postal_code": "3011AA",
            "country": "Netherlands",
            "is_primary": true
        }
    ],
    "preferences": {
        "language": "nl",
        "currency": "EUR",
        "timezone": "Europe/Amsterdam",
        "email_marketing": true
    },
    "metadata": {
        "created_at": ISODate("2024-01-15T10:30:00Z"),
        "updated_at": ISODate("2024-03-20T14:45:00Z"),
        "source": "website",
        "ip_address": "192.168.1.1"
    },
    // Embedded orders voor snelle toegang
    "recent_orders": [
        {
            "order_id": "ORD-2024-001",
            "order_date": ISODate("2024-03-15T09:30:00Z"),
            "status": "delivered",
            "total_amount": 149.99,
            "items_count": 3
        },
        {
            "order_id": "ORD-2024-002",
            "order_date": ISODate("2024-02-28T14:20:00Z"),
            "status": "delivered",
            "total_amount": 89.50,
            "items_count": 1
        }
    ],
    // Calculated fields
    "stats": {
        "total_orders": 15,
        "total_spent": 2450.75,
        "average_order_value": 163.38,
        "last_order_date": ISODate("2024-03-15T09:30:00Z")
    }
}

// Collection: orders (embedded approach)
{
    "_id": ObjectId("507f1f77bcf86cd799439012"),
    "order_id": "ORD-2024-001",
    "order_number": "20240001",
    "customer": {
        "customer_id": "CUST-1001",
        "name": "John Doe",
        "email": "john.doe@example.com"
    },
    "order_date": ISODate("2024-03-15T09:30:00Z"),
    "status": "delivered",
    "status_history": [
        {
            "status": "pending",
            "timestamp": ISODate("2024-03-15T09:30:00Z"),
            "notes": "Order placed"
        },
        {
            "status": "confirmed",
            "timestamp": ISODate("2024-03-15T09:35:00Z"),
            "notes": "Payment received"
        },
        {
            "status": "shipped",
            "timestamp": ISODate("2024-03-16T14:20:00Z"),
            "notes": "Shipped via DHL"
        },
        {
            "status": "delivered",
            "timestamp": ISODate("2024-03-18T10:15:00Z"),
            "notes": "Delivered to customer"
        }
    ],
    // Embedded items (denormalized product data)
    "items": [
        {
            "product_id": "PROD-001",
            "sku": "LAPTOP-X1",
            "name": "High Performance Laptop",
            "quantity": 1,
            "unit_price": 1299.99,
            "discount_percentage": 10,
            "category": "Electronics/Laptops",
            "brand": "TechBrand",
            "weight_kg": 2.5
        },
        {
            "product_id": "PROD-002",
            "sku": "MOUSE-G5",
            "name": "Gaming Mouse",
            "quantity": 2,
            "unit_price": 49.99,
            "category": "Electronics/Peripherals",
            "brand": "GameGear"
        }
    ],
    "shipping": {
        "address": {
            "street": "Main Street 123",
            "city": "Amsterdam",
            "postal_code": "1011AB",
            "country": "Netherlands"
        },
        "method": "express",
        "cost": 9.99,
        "tracking_number": "DHL123456789",
        "estimated_delivery": ISODate("2024-03-18T23:59:59Z")
    },
    "payment": {
        "method": "credit_card",
        "status": "paid",
        "transaction_id": "TXN-789012",
        "amount": 149.99,
        "currency": "EUR"
    },
    "totals": {
        "items_total": 1399.97,
        "shipping": 9.99,
        "discount": 129.99,
        "tax": 268.19,
        "grand_total": 1547.16
    },
    "metadata": {
        "created_at": ISODate("2024-03-15T09:30:00Z"),
        "updated_at": ISODate("2024-03-18T10:15:00Z"),
        "source": "website",
        "ip_address": "192.168.1.100",
        "user_agent": "Mozilla/5.0..."
    }
}

// Indexes voor optimalisatie
db.customers.createIndex({ "email": 1 }, { unique: true });
db.customers.createIndex({ "customer_id": 1 }, { unique: true });
db.customers.createIndex({ "name.last": 1, "name.first": 1 });
db.customers.createIndex({ "addresses.city": 1 });
db.customers.createIndex({ "stats.total_spent": -1 });

db.orders.createIndex({ "order_id": 1 }, { unique: true });
db.orders.createIndex({ "customer.customer_id": 1 });
db.orders.createIndex({ "order_date": -1 });
db.orders.createIndex({ "status": 1, "order_date": -1 });
db.orders.createIndex({ "items.product_id": 1 });
db.orders.createIndex({ "totals.grand_total": -1 });

10. Data Modeling Best Practices

Essentiële Richtlijnen

Volg deze best practices voor effectief en schaalbaar data modeling.

Business Focus

  • Start met business requirements
  • Betrek stakeholders vroeg
  • Model vanuit business processen
  • Documenteer business rules

Schaalbaarheid

  • Design voor toekomstige groei
  • Overweeg partitioning vroeg
  • Houd performance in gedachten
  • Plan voor data volume groei

Data Integriteit

  • Implementeer constraints
  • Gebruik foreign keys
  • Valideer data types
  • Handel null waarden consistent

Documentatie

  • Documenteer assumpties
  • Behoud model versies
  • Documenteer wijzigingen
  • Maak ERD diagrammen

Praktische Tips

-- TIP 1: Gebruik consistente naming conventions
-- Tabellen: meervoud, snake_case
-- Kolommen: snake_case
-- Primary keys: tablename_id
-- Foreign keys: referencedtable_id

-- TIP 2: Implementeer audit columns
ALTER TABLE customers
ADD COLUMN created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN created_by VARCHAR(50),
ADD COLUMN updated_by VARCHAR(50);

-- TIP 3: Gebruik soft delete i.p.v. hard delete
ALTER TABLE products
ADD COLUMN is_deleted BOOLEAN DEFAULT FALSE,
ADD COLUMN deleted_at TIMESTAMPTZ,
ADD COLUMN deleted_by VARCHAR(50);

-- TIP 4: Normaliseer eerst, denormaliseer later
-- 1. Begin met genormaliseerd model (3NF)
-- 2. Meet performance
-- 3. Denormaliseer alleen waar nodig
-- 4. Documenteer denormalisatie beslissingen

-- TIP 5: Gebruik database features
-- • Generated columns voor berekende waarden
-- • JSON columns voor flexibele data
-- • Array types voor lists
-- • Full-text search voor text queries

11. Tools en Frameworks

Populaire Data Modeling Tools

Selecteer de juiste tools voor jouw data modeling behoeften.

Enterprise Tools

  • ER/Studio: Comprehensive data architecture
  • IBM InfoSphere Data Architect: Enterprise modeling
  • Oracle SQL Developer Data Modeler: Oracle ecosystem
  • SAP PowerDesigner: Business process modeling

Open Source Tools

  • pgModeler: PostgreSQL specific modeling
  • MySQL Workbench: MySQL visual design
  • DBeaver: Universal database tool
  • DbSchema: Visual database designer

Cloud Tools

  • Lucidchart: Cloud-based diagramming
  • Draw.io: Free diagram tool
  • dbdiagram.io: Database diagram as code
  • QuickDBD: Text to diagram tool

Code-First Tools

  • SQLAlchemy: Python ORM
  • Entity Framework: .NET ORM
  • Prisma: Node.js ORM
  • Django ORM: Django framework

Conclusie

Data modeling is een kritische discipline die de basis legt voor succesvolle data-driven applicaties. Door de principes van conceptual, logical, en physical modeling te volgen, samen met gepaste normalisatie en denormalisatie technieken, kunnen ontwikkelaars schaalbare, performante en onderhoudbare databases ontwerpen.

Key Takeaways:

  • Start met business requirements en werk van conceptueel naar fysiek model
  • Normaliseer tot 3NF voor transactionele systemen, denormaliseer voor analytics
  • Kies het juiste model (relational, dimensional, document) voor je use case
  • Documenteer je modellen en behoud versie controle
  • Test je modellen met realistische data volumes en query patterns

Klaar om je data modeling skills toe te passen?

Vind data modeling vacatures of plaats je eigen vacature voor €25