Data Modeling: Complete Gids voor Database Design
Leer data modeling van conceptueel naar fysiek model. Master ERD, normalisatie en database design best practices voor schaalbare data oplossingen.
Zoek je Data Modelers?
Vind ervaren Data Architects en Database Designers voor je data projecten
Inhoudsopgave
1. Wat is Data Modeling?
Data Modeling Definitie
Data Modeling is het proces van het creëren van een visuele representatie van een informatiesysteem om de data elementen en hun onderlinge relaties te communiceren. Het is de blauwdruk voor database design.
Visualisatie
Creëer duidelijke diagrammen van data structuren
Relaties
Definieer connecties tussen verschillende data entiteiten
Structuur
Ontwerp georganiseerde en efficiënte data opslag
Implementatie
Vertaal modellen naar werkende database schema's
| Data Modeling Type | Doel | Doelgroep | Output |
|---|---|---|---|
| Conceptueel | Business requirements begrijpen | Business stakeholders | High-level entiteiten en relaties |
| Logisch | Data structuren ontwerpen | Data architects, analysts | Gedetailleerd model met attributen |
| Fysiek | Database implementatie | Database administrators, developers | SQL DDL scripts, indexes, constraints |
Waarom Data Modeling belangrijk is
Data Quality
- Voorkomt data inconsistencies
- Zorgt voor data integriteit
- Reduceert data redundancy
Performance
- Optimaliseert query performance
- Reduceert storage requirements
- Verbetert scalability
Collaboration
- Faciliteert communicatie
- Documenteert data structuren
- Ondersteunt knowledge transfer
Maintenance
- Vereenvoudigt wijzigingen
- Ondersteunt data governance
- Faciliteert troubleshooting
2. Drie lagen van Data Modeling
Drie-tier Data Modeling Approach
Data modeling gebeurt typisch in drie fasen, elk met toenemend detailniveau:
- Conceptueel Model: Business requirements en high-level entiteiten
- Logisch Model: Gedetailleerde data structuren en relaties
- Fysiek Model: Database-specifieke implementatie details
Data Modeling Workflow
Data Modeling Process Flow
- Requirements Gathering: Verzamel business requirements
- Conceptual Design: Creëer high-level entiteit relatie diagram
- Logical Design: Definieer attributen, datatypes, constraints
- Normalization: Pas normalisatie regels toe (1NF-5NF)
- Physical Design: Vertaal naar database-specifiek schema
- Implementation: Creëer database met DDL scripts
- Validation: Test en valideer het model
- Maintenance: Onderhoud en optimaliseer het model
3. Conceptueel Data Model
Conceptueel Model Doel
Het Conceptueel Data Model is een high-level abstractie die de belangrijkste business concepten en hun relaties weergeeft, zonder technische details.
ERD (Entity Relationship Diagram) Basics
-- Conceptueel model voor e-commerce systeem
-- ENTITEITEN:
-- 1. Customer (Klant)
-- 2. Product (Product)
-- 3. Order (Bestelling)
-- 4. Category (Categorie)
-- 5. Supplier (Leverancier)
-- RELATIES:
-- 1. Customer PLACES Order (1:N)
-- 2. Order CONTAINS Product (M:N via OrderItem)
-- 3. Product BELONGS TO Category (N:1)
-- 4. Supplier SUPPLIES Product (1:N)
-- 5. Product HAS Review (1:N)
Conceptueel ERD Visualisatie
┌─────────────┐ ┌──────────┐ ┌──────────┐
│ Customer │ │ Order │ │ Product │
├─────────────┤ ├──────────┤ ├──────────┤
│ - ID │1 N│ - ID │N M│ - ID │
│ - Name │◀─────▶│ - Date │◀─────▶│ - Name │
│ - Email │ │ - Total │ │ - Price │
└─────────────┘ └──────────┘ └──────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌──────────┐ ┌──────────┐
│ Address │ │OrderItem │ │ Category │
├─────────────┤ ├──────────┤ ├──────────┤
│ - ID │ │ - Qty │ │ - ID │
│ - Street │ │ - Price │ │ - Name │
│ - City │ └──────────┘ └──────────┘
└─────────────┘
Entity Definitions en Business Rules
-- ENTITEIT DEFINITIES
-- CUSTOMER (Klant)
-- Beschrijving: Personen of organisaties die producten kopen
-- Business Rules:
-- 1. Elke klant moet een uniek email adres hebben
-- 2. Klanten kunnen meerdere adressen hebben (leveradres, factuuradres)
-- 3. Klanten kunnen meerdere betalingsmethoden opslaan
-- 4. Klanten kunnen reviews schrijven voor producten
-- PRODUCT (Product)
-- Beschrijving: Items die verkocht worden
-- Business Rules:
-- 1. Elk product heeft een unieke SKU (Stock Keeping Unit)
-- 2. Producten behoren tot één primaire categorie
-- 3. Producten kunnen meerdere leveranciers hebben
-- 4. Productvoorraad kan niet negatief zijn
-- ORDER (Bestelling)
-- Beschrijving: Transactie van klant naar winkel
-- Business Rules:
-- 1. Elke order heeft minimaal één product
-- 2. Orders hebben een status lifecycle (Pending → Processing → Shipped → Delivered)
-- 3. Orders kunnen geannuleerd worden binnen 24 uur
-- 4. Orders boven €100 krijgen gratis verzending
-- CATEGORY (Categorie)
-- Beschrijving: Hiërarchische product classificatie
-- Business Rules:
-- 1. Categorieën kunnen subcategorieën hebben
-- 2. Producten kunnen tot meerdere categorieën behoren
-- 3. Categorieën hebben een hiërarchie (bijv. Electronics → Computers → Laptops)
-- SUPPLIER (Leverancier)
-- Beschrijving: Bedrijven die producten leveren
-- Business Rules:
-- 1. Leveranciers hebben een valutabeoordeling (A, B, C, D)
-- 2. Leveranciers kunnen exclusieve producten hebben
-- 3. Levertijden variëren per leverancier
Team nodig voor data architectuur?
Vind ervaren Data Modelers en Database Architects gespecialiseerd in data design
4. Logisch Data Model
Logisch Model Doel
Het Logisch Data Model voegt detail toe aan het conceptuele model door attributen, datatypes, keys en relatie cardinaliteiten te specificeren, maar blijft database-agnostisch.
Entity Definitions met Attributen
-- LOGISCH DATA MODEL: ENTITEITEN MET ATTRIBUTEN
-- ENTITY: Customer
-- Primary Key: customer_id
-- Description: Personen die producten kopen
Attributes:
- customer_id: INTEGER (PK, Auto-increment)
- first_name: VARCHAR(50) NOT NULL
- last_name: VARCHAR(50) NOT NULL
- email: VARCHAR(100) UNIQUE NOT NULL
- phone: VARCHAR(20)
- date_of_birth: DATE
- registration_date: DATETIME DEFAULT CURRENT_TIMESTAMP
- status: ENUM('active', 'inactive', 'suspended') DEFAULT 'active'
- loyalty_points: INTEGER DEFAULT 0
- preferred_language: VARCHAR(10) DEFAULT 'nl'
- tax_id: VARCHAR(20) -- Voor B2B klanten
- company_name: VARCHAR(100) -- Voor B2B klanten
-- ENTITY: Product
-- Primary Key: product_id
-- Alternate Key: sku (Stock Keeping Unit)
Attributes:
- product_id: INTEGER (PK, Auto-increment)
- sku: VARCHAR(50) UNIQUE NOT NULL
- product_name: VARCHAR(200) NOT NULL
- description: TEXT
- short_description: VARCHAR(500)
- category_id: INTEGER (FK to Category)
- supplier_id: INTEGER (FK to Supplier)
- unit_price: DECIMAL(10,2) NOT NULL
- cost_price: DECIMAL(10,2)
- weight_kg: DECIMAL(6,3)
- dimensions: VARCHAR(50) -- Format: "LxBxH" in cm
- color: VARCHAR(30)
- size: VARCHAR(20)
- material: VARCHAR(50)
- warranty_months: INTEGER
- is_active: BOOLEAN DEFAULT TRUE
- is_digital: BOOLEAN DEFAULT FALSE -- Voor digitale producten
- download_url: VARCHAR(500) -- Voor digitale producten
- created_at: DATETIME DEFAULT CURRENT_TIMESTAMP
- updated_at: DATETIME ON UPDATE CURRENT_TIMESTAMP
-- ENTITY: Order
-- Primary Key: order_id
Attributes:
- order_id: INTEGER (PK, Auto-increment)
- order_number: VARCHAR(20) UNIQUE NOT NULL -- Human-readable: ORD-2025-001
- customer_id: INTEGER (FK to Customer) NOT NULL
- order_date: DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL
- required_date: DATE
- shipped_date: DATETIME
- order_status: ENUM('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded') DEFAULT 'pending'
- shipping_address_id: INTEGER (FK to Address)
- billing_address_id: INTEGER (FK to Address)
- shipping_method: VARCHAR(50)
- shipping_cost: DECIMAL(10,2) DEFAULT 0
- tax_amount: DECIMAL(10,2) DEFAULT 0
- discount_amount: DECIMAL(10,2) DEFAULT 0
- total_amount: DECIMAL(10,2) NOT NULL
- payment_method: VARCHAR(50)
- payment_status: ENUM('pending', 'paid', 'failed', 'refunded') DEFAULT 'pending'
- payment_transaction_id: VARCHAR(100)
- notes: TEXT
- source: ENUM('website', 'mobile_app', 'phone', 'store') DEFAULT 'website'
- ip_address: VARCHAR(45) -- Voor IPv6 support
- user_agent: VARCHAR(500)
Relatie Definitions en Cardinaliteiten
-- RELATIE DEFINITIES MET CARDINALITEITEN
-- 1. Customer - Order Relatie
-- Relatie: PLACES
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een klant kan meerdere orders plaatsen
-- • Een order behoort tot exact één klant
-- • Wanneer een klant wordt verwijderd, wat gebeurt er met orders?
-- Opties: CASCADE, SET NULL, RESTRICT
-- • Recommendation: SET NULL of archival strategy
-- 2. Order - Product Relatie
-- Relatie: CONTAINS
-- Cardinaliteit: Many-to-Many (M:N)
-- Resolutie: Via intersection entity "OrderItem"
-- Business Rules:
-- • Een order kan meerdere producten bevatten
-- • Een product kan in meerdere orders voorkomen
-- • OrderItem slaat specifieke details per product in order op
-- • Quantity moet >= 1 zijn
-- • Unit_price kan verschillen van product.unit_price (promoties)
-- 3. Product - Category Relatie
-- Relatie: BELONGS_TO
-- Cardinaliteit: Many-to-One (N:1)
-- Business Rules:
-- • Een product behoort tot één primaire categorie
-- • Een categorie kan meerdere producten bevatten
-- • Voor multi-category support: gebruik ProductCategory junction table
-- 4. Supplier - Product Relatie
-- Relatie: SUPPLIES
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een leverancier kan meerdere producten leveren
-- • Een product kan door meerdere leveranciers geleverd worden
-- • Voor multi-supplier: gebruik ProductSupplier junction table
-- 5. Customer - Address Relatie
-- Relatie: HAS
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een klant kan meerdere adressen hebben
-- • Een adres behoort tot één klant
-- • Address types: shipping, billing, home, work
-- • Primary flag: één adres kan als primair gemarkeerd worden
-- 6. Product - Review Relatie
-- Relatie: HAS
-- Cardinaliteit: One-to-Many (1:N)
-- Business Rules:
-- • Een product kan meerdere reviews hebben
-- • Een review behoort tot één product
-- • Een review wordt geschreven door één klant
-- • Rating moet tussen 1 en 5 zijn
Junction/Intersection Entities
-- JUNCTION ENTITIES VOOR MANY-TO-MANY RELATIES
-- ENTITY: OrderItem (Order regel)
-- Composite Primary Key: (order_id, product_id)
-- Purpose: Resolveert M:N relatie tussen Order en Product
Attributes:
- order_item_id: INTEGER (PK, Auto-increment) -- Surrogate key voor gemak
- order_id: INTEGER (FK to Order) NOT NULL
- product_id: INTEGER (FK to Product) NOT NULL
- quantity: INTEGER NOT NULL CHECK (quantity > 0)
- unit_price: DECIMAL(10,2) NOT NULL -- Prijs op moment van bestelling
- discount_percentage: DECIMAL(5,2) DEFAULT 0
- tax_rate: DECIMAL(5,2) DEFAULT 21.0 -- BTW percentage
- notes: VARCHAR(200) -- Bijv. "Gift wrap requested"
- created_at: DATETIME DEFAULT CURRENT_TIMESTAMP
-- Calculated columns (in query of application layer):
-- • line_total = quantity * unit_price * (1 - discount_percentage/100)
-- • line_tax = line_total * (tax_rate/100)
-- ENTITY: ProductCategory
-- Composite Primary Key: (product_id, category_id)
-- Purpose: Ondersteunt multi-category producten
Attributes:
- product_id: INTEGER (FK to Product) NOT NULL
- category_id: INTEGER (FK to Category) NOT NULL
- is_primary: BOOLEAN DEFAULT FALSE -- Primaire categorie
- display_order: INTEGER DEFAULT 0 -- Sortering in category listings
- assigned_date: DATE DEFAULT CURRENT_DATE
-- ENTITY: ProductSupplier
-- Composite Primary Key: (product_id, supplier_id)
-- Purpose: Ondersteunt multi-supplier producten
Attributes:
- product_id: INTEGER (FK to Product) NOT NULL
- supplier_id: INTEGER (FK to Supplier) NOT NULL
- is_primary_supplier: BOOLEAN DEFAULT FALSE
- purchase_price: DECIMAL(10,2) -- Inkoopprijs van deze leverancier
- lead_time_days: INTEGER -- Levertijd in dagen
- minimum_order_quantity: INTEGER
- is_active: BOOLEAN DEFAULT TRUE
- last_order_date: DATE
- average_rating: DECIMAL(3,2) -- Leverancier rating voor dit product
-- ENTITY: CustomerAddress
-- Purpose: Beheert meerdere adressen per klant
Attributes:
- address_id: INTEGER (PK, Auto-increment)
- customer_id: INTEGER (FK to Customer) NOT NULL
- address_type: ENUM('shipping', 'billing', 'home', 'work', 'other') NOT NULL
- is_primary: BOOLEAN DEFAULT FALSE
- contact_name: VARCHAR(100) -- Naam voor dit adres
- address_line1: VARCHAR(200) NOT NULL
- address_line2: VARCHAR(200)
- city: VARCHAR(100) NOT NULL
- state_province: VARCHAR(100)
- postal_code: VARCHAR(20) NOT NULL
- country: VARCHAR(2) NOT NULL -- ISO country code
- phone: VARCHAR(20)
- instructions: VARCHAR(500) -- Leverinstructies
- is_active: BOOLEAN DEFAULT TRUE
5. Fysiek Data Model
Fysiek Model Doel
Het Fysiek Data Model is de database-specifieke implementatie van het logische model, inclusief datatypes, indexes, partitions, storage parameters en andere performance optimalisaties.
Complete DDL Script voor PostgreSQL
-- POSTGRESQL PHYSICAL DATA MODEL
-- E-commerce Database Schema
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- Voor text search
-- ========== CORE TABLES ==========
-- Table: customers
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
customer_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
phone VARCHAR(20),
date_of_birth DATE,
registration_date TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
status VARCHAR(20) DEFAULT 'active' CHECK (status IN ('active', 'inactive', 'suspended')),
loyalty_points INTEGER DEFAULT 0 CHECK (loyalty_points >= 0),
preferred_language VARCHAR(10) DEFAULT 'nl',
tax_id VARCHAR(20),
company_name VARCHAR(100),
metadata JSONB, -- Voor flexibele extra velden
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
-- Composite constraint voor B2B klanten
CONSTRAINT b2b_customer_check CHECK (
(company_name IS NULL AND tax_id IS NULL)
OR (company_name IS NOT NULL AND tax_id IS NOT NULL)
)
);
-- Table: products
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
sku VARCHAR(50) UNIQUE NOT NULL,
product_name VARCHAR(200) NOT NULL,
description TEXT,
short_description VARCHAR(500),
category_id INTEGER, -- FK wordt later toegevoegd
supplier_id INTEGER, -- FK wordt later toegevoegd
unit_price DECIMAL(10, 2) NOT NULL CHECK (unit_price >= 0),
cost_price DECIMAL(10, 2) CHECK (cost_price >= 0),
weight_kg DECIMAL(6, 3),
dimensions VARCHAR(50),
color VARCHAR(30),
size VARCHAR(20),
material VARCHAR(50),
warranty_months INTEGER CHECK (warranty_months >= 0),
is_active BOOLEAN DEFAULT TRUE,
is_digital BOOLEAN DEFAULT FALSE,
download_url VARCHAR(500),
stock_quantity INTEGER DEFAULT 0 CHECK (stock_quantity >= 0),
low_stock_threshold INTEGER DEFAULT 10,
tags VARCHAR[], -- Array van tags voor search
specifications JSONB, -- Flexibele product specificaties
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
-- Check voor digitale producten
CONSTRAINT digital_product_check CHECK (
NOT is_digital OR (is_digital AND download_url IS NOT NULL)
)
);
-- Table: orders
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
order_uuid UUID UNIQUE DEFAULT uuid_generate_v4(),
order_number VARCHAR(20) UNIQUE NOT NULL,
customer_id INTEGER NOT NULL,
order_date TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL,
required_date DATE,
shipped_date TIMESTAMPTZ,
order_status VARCHAR(20) DEFAULT 'pending'
CHECK (order_status IN ('pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded')),
shipping_address_id INTEGER,
billing_address_id INTEGER,
shipping_method VARCHAR(50),
shipping_cost DECIMAL(10, 2) DEFAULT 0 CHECK (shipping_cost >= 0),
tax_amount DECIMAL(10, 2) DEFAULT 0 CHECK (tax_amount >= 0),
discount_amount DECIMAL(10, 2) DEFAULT 0 CHECK (discount_amount >= 0),
total_amount DECIMAL(10, 2) NOT NULL CHECK (total_amount >= 0),
payment_method VARCHAR(50),
payment_status VARCHAR(20) DEFAULT 'pending'
CHECK (payment_status IN ('pending', 'paid', 'failed', 'refunded')),
payment_transaction_id VARCHAR(100),
notes TEXT,
source VARCHAR(20) DEFAULT 'website'
CHECK (source IN ('website', 'mobile_app', 'phone', 'store')),
ip_address INET, -- PostgreSQL specifiek type voor IP addresses
user_agent VARCHAR(500),
metadata JSONB,
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
-- Calculated column constraint
CONSTRAINT order_amounts_check CHECK (
total_amount >= shipping_cost + tax_amount - discount_amount
)
);
Indexes en Performance Optimalisatie
-- ========== INDEXES ==========
-- Primary key indexes worden automatisch aangemaakt
-- Hier voegen we extra indexes toe voor performance
-- Customers table indexes
CREATE INDEX idx_customers_email ON customers(email);
CREATE INDEX idx_customers_name ON customers(last_name, first_name);
CREATE INDEX idx_customers_status ON customers(status) WHERE status = 'active';
CREATE INDEX idx_customers_registration_date ON customers(registration_date);
CREATE INDEX idx_customers_company ON customers(company_name) WHERE company_name IS NOT NULL;
-- Products table indexes
CREATE INDEX idx_products_sku ON products(sku);
CREATE INDEX idx_products_name ON products(product_name);
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_products_price ON products(unit_price);
CREATE INDEX idx_products_active ON products(is_active) WHERE is_active = TRUE;
CREATE INDEX idx_products_stock ON products(stock_quantity) WHERE stock_quantity < low_stock_threshold;
CREATE INDEX idx_products_created ON products(created_at);
-- Full text search index voor product search
CREATE INDEX idx_products_search ON products
USING gin(to_tsvector('dutch', product_name || ' ' || coalesce(description, '')));
-- Trigram index voor fuzzy search
CREATE INDEX idx_products_name_trgm ON products USING gin(product_name gin_trgm_ops);
-- Orders table indexes
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_orders_status ON orders(order_status);
CREATE INDEX idx_orders_status_date ON orders(order_status, order_date)
WHERE order_status IN ('pending', 'processing');
CREATE INDEX idx_orders_payment_status ON orders(payment_status);
CREATE INDEX idx_orders_total ON orders(total_amount);
CREATE INDEX idx_orders_shipped ON orders(shipped_date) WHERE shipped_date IS NOT NULL;
-- Composite index voor vaak gebruikte queries
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date DESC);
-- OrderItems table indexes
CREATE INDEX idx_orderitems_order ON order_items(order_id);
CREATE INDEX idx_orderitems_product ON order_items(product_id);
CREATE INDEX idx_orderitems_order_product ON order_items(order_id, product_id);
-- ========== PARTITIONING ==========
-- Voor grote tables, partitioneren op datum
-- Order history table partitioning (PostgreSQL 10+)
CREATE TABLE orders_partitioned (
LIKE orders INCLUDING ALL
) PARTITION BY RANGE (order_date);
-- Create partitions per quarter
CREATE TABLE orders_q1_2025 PARTITION OF orders_partitioned
FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');
CREATE TABLE orders_q2_2025 PARTITION OF orders_partitioned
FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');
-- ========== FOREIGN KEYS ==========
-- Foreign key constraints
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
ON DELETE RESTRICT;
ALTER TABLE order_items
ADD CONSTRAINT fk_orderitems_order
FOREIGN KEY (order_id) REFERENCES orders(order_id)
ON DELETE CASCADE,
ADD CONSTRAINT fk_orderitems_product
FOREIGN KEY (product_id) REFERENCES products(product_id)
ON DELETE RESTRICT;
ALTER TABLE products
ADD CONSTRAINT fk_products_category
FOREIGN KEY (category_id) REFERENCES categories(category_id)
ON DELETE SET NULL,
ADD CONSTRAINT fk_products_supplier
FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id)
ON DELETE SET NULL;
Views en Materialized Views
-- ========== VIEWS ==========
-- View voor actieve producten met category info
CREATE VIEW vw_active_products AS
SELECT
p.product_id,
p.sku,
p.product_name,
p.unit_price,
p.stock_quantity,
c.category_name,
s.supplier_name,
CASE
WHEN p.stock_quantity = 0 THEN 'Out of Stock'
WHEN p.stock_quantity <= p.low_stock_threshold THEN 'Low Stock'
ELSE 'In Stock'
END AS stock_status,
(
SELECT AVG(rating)
FROM reviews r
WHERE r.product_id = p.product_id
AND r.is_approved = TRUE
) AS average_rating,
p.created_at
FROM products p
LEFT JOIN categories c ON p.category_id = c.category_id
LEFT JOIN suppliers s ON p.supplier_id = s.supplier_id
WHERE p.is_active = TRUE
ORDER BY p.product_name;
-- View voor customer orders summary
CREATE VIEW vw_customer_orders_summary AS
SELECT
c.customer_id,
CONCAT(c.first_name, ' ', c.last_name) AS customer_name,
c.email,
c.registration_date,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent,
MAX(o.order_date) AS last_order_date,
AVG(o.total_amount) AS avg_order_value,
CASE
WHEN COUNT(o.order_id) >= 10 THEN 'VIP'
WHEN COUNT(o.order_id) >= 5 THEN 'Regular'
ELSE 'New'
END AS customer_segment
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_status IN ('delivered', 'shipped', 'processing')
GROUP BY c.customer_id, c.first_name, c.last_name, c.email, c.registration_date;
-- Materialized View voor sales reporting (voor performance)
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
DATE(o.order_date) AS sale_date,
COUNT(DISTINCT o.order_id) AS order_count,
COUNT(DISTINCT o.customer_id) AS customer_count,
SUM(o.total_amount) AS total_revenue,
SUM(oi.quantity) AS total_units_sold,
AVG(o.total_amount) AS avg_order_value,
SUM(o.shipping_cost) AS total_shipping,
SUM(o.tax_amount) AS total_tax,
SUM(o.discount_amount) AS total_discount
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '90 days'
AND o.order_status = 'delivered'
GROUP BY DATE(o.order_date)
ORDER BY sale_date DESC;
-- Index voor de materialized view
CREATE UNIQUE INDEX idx_mv_daily_sales_date ON mv_daily_sales(sale_date);
-- Function om materialized view te refreshen
CREATE OR REPLACE FUNCTION refresh_daily_sales()
RETURNS VOID AS $$
BEGIN
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales;
END;
$$ LANGUAGE plpgsql;
Klaar voor data modeling projecten?
Vind de juiste experts of plaats je Data Modeling vacature
6. Database Normalisatie (1NF t/m 5NF)
Wat is Database Normalisatie?
Database Normalisatie is een proces om een database structuur te organiseren volgens een reeks regels die data redundancy verminderen en data integriteit verbeteren.
Eerste Normaalvorm (1NF) - Atomic Values
-- ❌ VOOR 1NF: Niet-genormaliseerde tabel
CREATE TABLE orders_non_normalized (
order_id INT,
customer_name VARCHAR(100),
products VARCHAR(500), -- Meerdere producten in één kolom: "Product1,Product2"
quantities VARCHAR(100), -- "2,1,3"
prices VARCHAR(100) -- "10.99,24.50,5.99"
);
-- Problemen:
-- 1. Multi-value kolommen
-- 2. Niet-atomische waarden
-- 3. Moeilijk te queryen
-- 4. Data integriteit issues
-- ✅ NA 1NF: Genormaliseerd naar atomic values
CREATE TABLE orders_1nf (
order_id INT,
customer_id INT, -- Reference naar customers tabel
order_date DATE
);
CREATE TABLE order_items_1nf (
order_item_id INT,
order_id INT,
product_id INT, -- Reference naar products tabel
quantity INT,
unit_price DECIMAL(10, 2)
);
-- 1NF Regels:
-- 1. Elke kolom bevat atomic (ondeelbare) waarden
-- 2. Elke kolom bevat waarden van hetzelfde type
-- 3. Elke rij is uniek (geen duplicate rows)
-- 4. De volgorde van rows en columns is niet significant
Tweede Normaalvorm (2NF) - Remove Partial Dependencies
-- ❌ VOOR 2NF: Tabel met partial dependencies
CREATE TABLE order_details_2nf_violation (
order_id INT,
product_id INT,
product_name VARCHAR(100), -- Dependent alleen op product_id
category VARCHAR(50), -- Dependent alleen op product_id
quantity INT,
unit_price DECIMAL(10, 2),
order_date DATE, -- Dependent alleen op order_id
customer_id INT, -- Dependent alleen op order_id
PRIMARY KEY (order_id, product_id)
);
-- Problemen:
-- 1. Partial dependency: product_name hangt alleen van product_id af
-- 2. Partial dependency: order_date hangt alleen van order_id af
-- 3. Update anomalies: Als productnaam wijzigt, moeten alle orders geüpdatet worden
-- 4. Insert anomalies: Kan geen product toevoegen zonder order
-- 5. Delete anomalies: Als laatste order voor product verwijderd wordt, gaat productinfo verloren
-- ✅ NA 2NF: Verwijder partial dependencies
CREATE TABLE orders_2nf (
order_id INT PRIMARY KEY,
order_date DATE,
customer_id INT,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
CREATE TABLE products_2nf (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
standard_price DECIMAL(10, 2)
);
CREATE TABLE order_items_2nf (
order_id INT,
product_id INT,
quantity INT,
unit_price DECIMAL(10, 2), -- Prijs op moment van bestelling
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders_2nf(order_id),
FOREIGN KEY (product_id) REFERENCES products_2nf(product_id)
);
-- 2NF Regels:
-- 1. Voldoet aan 1NF
-- 2. Alle non-key attributen zijn fully functional dependent op de primary key
-- 3. Geen partial dependencies (alle attributen moeten van de hele PK afhangen)
Derde Normaalvorm (3NF) - Remove Transitive Dependencies
-- ❌ VOOR 3NF: Tabel met transitive dependencies
CREATE TABLE orders_3nf_violation (
order_id INT PRIMARY KEY,
customer_id INT,
customer_name VARCHAR(100), -- Transitive: hangt af van customer_id
customer_city VARCHAR(50), -- Transitive: hangt af van customer_id
customer_country VARCHAR(50), -- Transitive: hangt af van customer_city
order_date DATE,
total_amount DECIMAL(10, 2)
);
-- Transitive dependencies:
-- order_id → customer_id → customer_name
-- order_id → customer_id → customer_city → customer_country
-- Problemen:
-- 1. Update anomaly: Customer naam wijzigen in alle orders
-- 2. Insert anomaly: Kan niet city/country zonder customer
-- 3. Delete anomaly: Verwijder customer, verlies city/country info
-- 4. Data inconsistency: Zelfde city kan verschillende country hebben
-- ✅ NA 3NF: Verwijder transitive dependencies
CREATE TABLE customers_3nf (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city_id INT,
FOREIGN KEY (city_id) REFERENCES cities(city_id)
);
CREATE TABLE cities_3nf (
city_id INT PRIMARY KEY,
city_name VARCHAR(50),
country_id INT,
FOREIGN KEY (country_id) REFERENCES countries(country_id)
);
CREATE TABLE countries_3nf (
country_id INT PRIMARY KEY,
country_name VARCHAR(50),
country_code VARCHAR(2)
);
CREATE TABLE orders_3nf (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2),
FOREIGN KEY (customer_id) REFERENCES customers_3nf(customer_id)
);
-- 3NF Regels:
-- 1. Voldoet aan 2NF
-- 2. Geen transitive dependencies
-- 3. Alle non-key attributen zijn direct dependent op primary key
-- "A relation is in 3NF if every non-prime attribute is non-transitively dependent on every candidate key"
Boyce-Codd Normaalvorm (BCNF)
-- ❌ BCNF VIOLATION: Overlapping candidate keys
CREATE TABLE course_enrollments (
student_id INT,
course_id INT,
instructor_id INT,
classroom VARCHAR(20),
PRIMARY KEY (student_id, course_id)
);
-- Assumptions:
-- 1. Each course has exactly one instructor
-- 2. Each instructor teaches only one course
-- 3. Each course is in one classroom
-- Functional dependencies:
-- (student_id, course_id) → instructor_id, classroom
-- course_id → instructor_id, classroom
-- instructor_id → course_id
-- Problemen:
-- 1. course_id → instructor_id (non-prime → non-prime)
-- 2. But course_id is not a superkey
-- 3. Violates BCNF: For every FD X → Y, X must be superkey
-- ✅ BCNF NORMALIZATION
CREATE TABLE courses_bcnf (
course_id INT PRIMARY KEY,
instructor_id INT UNIQUE, -- One instructor per course
classroom VARCHAR(20)
);
CREATE TABLE enrollments_bcnf (
student_id INT,
course_id INT,
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (course_id) REFERENCES courses_bcnf(course_id)
);
-- BCNF Regels:
-- 1. Voldoet aan 3NF
-- 2. Voor elke functionele dependency X → Y, moet X een superkey zijn
-- 3. Strenger dan 3NF: elimineert alle anomaly-causing functional dependencies
-- 4. In BCNF: elke determinant is een candidate key
Normalisatie Samenvatting
| Normaalvorm | Doel | Regel | Voordelen |
|---|---|---|---|
| 1NF | Atomic values | Geen multi-value of composite attributen | Eenvoudiger querying, data integriteit |
| 2NF | Remove partial dependencies | Alle non-key attributen fully dependent op PK | Vermindert update anomalies |
| 3NF | Remove transitive dependencies | Non-key attributen alleen dependent op PK | Vermindert data redundancy |
| BCNF | Strenger dan 3NF | Elke determinant is candidate key | Elimineert alle anomaly-causing FDs |
| 4NF | Remove multi-valued dependencies | Geen non-trivial MVDs | Handelt multi-value attributen |
| 5NF | Remove join dependencies | Elke join dependency implied by candidate keys | Complete decompositie zonder data loss |
7. Wanneer denormaliseren?
Denormalisatie Trade-offs
Denormalisatie is het bewust introduceren van redundancy in een genormaliseerd database design om query performance te verbeteren, vaak ten koste van data integriteit en update efficiency.
Wanneer Denormaliseren Toepassen
Read Performance
- Read-heavy applicaties
- Complexe joins vermijden
- Frequent gelezen berekende waarden
- Data warehouse scenario's
Aggregatie
- Pre-calculated aggregaties
- Summary tables
- Reporting requirements
- Real-time dashboards
Synchronisatie
- Slow-changing dimensions
- Historical data preserveren
- Audit trails
- Versioning requirements
Hardware Limits
- Limited join capabilities
- Distributed systems
- NoSQL databases
- Memory constraints
Denormalisatie Technieken
-- TECHNIQUE 1: Derived Columns (Berekenende kolommen)
-- Voeg een total_price kolom toe om berekeningen te vermijden
ALTER TABLE order_items
ADD COLUMN line_total DECIMAL(10, 2) GENERATED ALWAYS AS
(quantity * unit_price * (1 - discount_percentage/100)) STORED;
ALTER TABLE orders
ADD COLUMN item_count INTEGER GENERATED ALWAYS AS
(SELECT COUNT(*) FROM order_items WHERE order_id = orders.order_id) STORED;
-- TECHNIQUE 2: Flattened Hierarchies (Afgeplatte hiërarchieën)
-- Sla categorie pad op in product tabel
ALTER TABLE products
ADD COLUMN category_path VARCHAR(200); -- Bijv: "Electronics/Computers/Laptops"
-- Update trigger om path bij te werken
CREATE OR REPLACE FUNCTION update_category_path()
RETURNS TRIGGER AS $$
BEGIN
WITH RECURSIVE category_tree AS (
SELECT category_id, category_name, parent_category_id
FROM categories
WHERE category_id = NEW.category_id
UNION ALL
SELECT c.category_id, c.category_name, c.parent_category_id
FROM categories c
INNER JOIN category_tree ct ON c.category_id = ct.parent_category_id
)
SELECT string_agg(category_name, '/' ORDER BY category_id)
INTO NEW.category_path
FROM category_tree;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- TECHNIQUE 3: Summary Tables (Samenvattings tabellen)
CREATE TABLE daily_sales_summary (
sale_date DATE PRIMARY KEY,
order_count INTEGER,
total_revenue DECIMAL(15, 2),
customer_count INTEGER,
avg_order_value DECIMAL(10, 2),
top_product_id INTEGER,
top_product_qty INTEGER,
last_updated TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
-- Batch job om summary bij te werken
INSERT INTO daily_sales_summary
SELECT
DATE(o.order_date) AS sale_date,
COUNT(DISTINCT o.order_id) AS order_count,
SUM(o.total_amount) AS total_revenue,
COUNT(DISTINCT o.customer_id) AS customer_count,
AVG(o.total_amount) AS avg_order_value,
(
SELECT product_id
FROM order_items oi
WHERE oi.order_id = o.order_id
GROUP BY product_id
ORDER BY SUM(quantity) DESC
LIMIT 1
) AS top_product_id,
CURRENT_TIMESTAMP
FROM orders o
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY DATE(o.order_date)
ON CONFLICT (sale_date) DO UPDATE SET
order_count = EXCLUDED.order_count,
total_revenue = EXCLUDED.total_revenue,
customer_count = EXCLUDED.customer_count,
avg_order_value = EXCLUDED.avg_order_value,
last_updated = CURRENT_TIMESTAMP;
8. Data Warehouse Modeling
Dimensional Modeling Principles
Dimensional Modeling is een design techniek voor data warehouses die data structureert in feiten (meetbare gebeurtenissen) en dimensies (beschrijvende attributen) voor optimale query performance en gebruiksgemak.
Star Schema Design
-- STAR SCHEMA: Sales Data Warehouse
-- FACT TABLE: Sales Facts
CREATE TABLE fact_sales (
sales_key BIGSERIAL PRIMARY KEY,
date_key INTEGER NOT NULL,
product_key INTEGER NOT NULL,
customer_key INTEGER NOT NULL,
store_key INTEGER NOT NULL,
promotion_key INTEGER,
-- Measures (Feiten)
quantity_sold INTEGER NOT NULL,
unit_price DECIMAL(10, 2) NOT NULL,
sales_amount DECIMAL(15, 2) NOT NULL,
cost_amount DECIMAL(15, 2) NOT NULL,
discount_amount DECIMAL(10, 2) DEFAULT 0,
tax_amount DECIMAL(10, 2) NOT NULL,
-- Calculated measures
gross_profit DECIMAL(15, 2) GENERATED ALWAYS AS
(sales_amount - cost_amount - discount_amount) STORED,
profit_margin DECIMAL(5, 2) GENERATED ALWAYS AS
CASE WHEN sales_amount > 0
THEN ((sales_amount - cost_amount - discount_amount) / sales_amount) * 100
ELSE 0
END STORED,
-- Foreign keys
FOREIGN KEY (date_key) REFERENCES dim_date(date_key),
FOREIGN KEY (product_key) REFERENCES dim_product(product_key),
FOREIGN KEY (customer_key) REFERENCES dim_customer(customer_key),
FOREIGN KEY (store_key) REFERENCES dim_store(store_key),
FOREIGN KEY (promotion_key) REFERENCES dim_promotion(promotion_key)
);
-- DIMENSION TABLE: Date Dimension
CREATE TABLE dim_date (
date_key INTEGER PRIMARY KEY, -- YYYYMMDD format
full_date DATE NOT NULL UNIQUE,
-- Date attributes
day_of_week INTEGER NOT NULL, -- 1=Monday, 7=Sunday
day_name VARCHAR(10) NOT NULL, -- Monday, Tuesday...
day_of_month INTEGER NOT NULL,
day_of_year INTEGER NOT NULL,
week_of_year INTEGER NOT NULL,
month_number INTEGER NOT NULL,
month_name VARCHAR(10) NOT NULL,
quarter_number INTEGER NOT NULL,
quarter_name CHAR(2) NOT NULL, -- Q1, Q2, Q3, Q4
year_number INTEGER NOT NULL,
-- Flags
is_weekend BOOLEAN NOT NULL,
is_holiday BOOLEAN NOT NULL,
is_business_day BOOLEAN NOT NULL,
-- Fiscal calendar (indien afwijkend)
fiscal_year INTEGER,
fiscal_quarter INTEGER,
fiscal_month INTEGER,
-- Holiday info
holiday_name VARCHAR(50),
holiday_type VARCHAR(20)
);
-- DIMENSION TABLE: Product Dimension
CREATE TABLE dim_product (
product_key INTEGER PRIMARY KEY,
product_id INTEGER NOT NULL, -- Business key
sku VARCHAR(50) NOT NULL,
product_name VARCHAR(200) NOT NULL,
description TEXT,
-- Product hierarchy
category_key INTEGER,
category_name VARCHAR(100) NOT NULL,
subcategory_key INTEGER,
subcategory_name VARCHAR(100),
department_key INTEGER,
department_name VARCHAR(100),
-- Product attributes
brand VARCHAR(50),
color VARCHAR(30),
size VARCHAR(20),
weight_kg DECIMAL(6, 3),
unit_price DECIMAL(10, 2),
cost_price DECIMAL(10, 2),
-- SCD Type 2 columns
effective_date DATE NOT NULL,
expiration_date DATE,
is_current BOOLEAN NOT NULL DEFAULT TRUE,
-- Audit columns
created_timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
-- DIMENSION TABLE: Customer Dimension
CREATE TABLE dim_customer (
customer_key INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL, -- Business key
customer_type VARCHAR(20) NOT NULL, -- B2C, B2B
first_name VARCHAR(50),
last_name VARCHAR(50),
full_name VARCHAR(101) GENERATED ALWAYS AS
(first_name || ' ' || last_name) STORED,
email VARCHAR(100),
-- Demographics
birth_date DATE,
age INTEGER,
gender VARCHAR(10),
-- Geography
city VARCHAR(100),
state_province VARCHAR(100),
country VARCHAR(50),
postal_code VARCHAR(20),
region VARCHAR(50),
-- Customer segmentation
customer_segment VARCHAR(50), -- Gold, Silver, Bronze
lifetime_value DECIMAL(15, 2),
acquisition_channel VARCHAR(50),
acquisition_date DATE,
-- B2B specific
company_name VARCHAR(100),
industry VARCHAR(50),
employee_count_range VARCHAR(20),
-- SCD Type 2
effective_date DATE NOT NULL,
expiration_date DATE,
is_current BOOLEAN NOT NULL DEFAULT TRUE
);
9. NoSQL Data Modeling
NoSQL Modeling Principles
NoSQL Data Modeling volgt verschillende principes dan relationele modellen, met focus op query patterns, denormalisatie, en applicatie-specifieke structuren.
Document Database Modeling (MongoDB)
// MONGODB DOCUMENT MODELING
// Denormalized approach voor read performance
// Collection: customers
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"customer_id": "CUST-1001",
"name": {
"first": "John",
"last": "Doe"
},
"email": "john.doe@example.com",
"phone": "+31612345678",
"addresses": [
{
"type": "shipping",
"street": "Main Street 123",
"city": "Amsterdam",
"postal_code": "1011AB",
"country": "Netherlands",
"is_primary": true
},
{
"type": "billing",
"street": "Business Ave 456",
"city": "Rotterdam",
"postal_code": "3011AA",
"country": "Netherlands",
"is_primary": true
}
],
"preferences": {
"language": "nl",
"currency": "EUR",
"timezone": "Europe/Amsterdam",
"email_marketing": true
},
"metadata": {
"created_at": ISODate("2024-01-15T10:30:00Z"),
"updated_at": ISODate("2024-03-20T14:45:00Z"),
"source": "website",
"ip_address": "192.168.1.1"
},
// Embedded orders voor snelle toegang
"recent_orders": [
{
"order_id": "ORD-2024-001",
"order_date": ISODate("2024-03-15T09:30:00Z"),
"status": "delivered",
"total_amount": 149.99,
"items_count": 3
},
{
"order_id": "ORD-2024-002",
"order_date": ISODate("2024-02-28T14:20:00Z"),
"status": "delivered",
"total_amount": 89.50,
"items_count": 1
}
],
// Calculated fields
"stats": {
"total_orders": 15,
"total_spent": 2450.75,
"average_order_value": 163.38,
"last_order_date": ISODate("2024-03-15T09:30:00Z")
}
}
// Collection: orders (embedded approach)
{
"_id": ObjectId("507f1f77bcf86cd799439012"),
"order_id": "ORD-2024-001",
"order_number": "20240001",
"customer": {
"customer_id": "CUST-1001",
"name": "John Doe",
"email": "john.doe@example.com"
},
"order_date": ISODate("2024-03-15T09:30:00Z"),
"status": "delivered",
"status_history": [
{
"status": "pending",
"timestamp": ISODate("2024-03-15T09:30:00Z"),
"notes": "Order placed"
},
{
"status": "confirmed",
"timestamp": ISODate("2024-03-15T09:35:00Z"),
"notes": "Payment received"
},
{
"status": "shipped",
"timestamp": ISODate("2024-03-16T14:20:00Z"),
"notes": "Shipped via DHL"
},
{
"status": "delivered",
"timestamp": ISODate("2024-03-18T10:15:00Z"),
"notes": "Delivered to customer"
}
],
// Embedded items (denormalized product data)
"items": [
{
"product_id": "PROD-001",
"sku": "LAPTOP-X1",
"name": "High Performance Laptop",
"quantity": 1,
"unit_price": 1299.99,
"discount_percentage": 10,
"category": "Electronics/Laptops",
"brand": "TechBrand",
"weight_kg": 2.5
},
{
"product_id": "PROD-002",
"sku": "MOUSE-G5",
"name": "Gaming Mouse",
"quantity": 2,
"unit_price": 49.99,
"category": "Electronics/Peripherals",
"brand": "GameGear"
}
],
"shipping": {
"address": {
"street": "Main Street 123",
"city": "Amsterdam",
"postal_code": "1011AB",
"country": "Netherlands"
},
"method": "express",
"cost": 9.99,
"tracking_number": "DHL123456789",
"estimated_delivery": ISODate("2024-03-18T23:59:59Z")
},
"payment": {
"method": "credit_card",
"status": "paid",
"transaction_id": "TXN-789012",
"amount": 149.99,
"currency": "EUR"
},
"totals": {
"items_total": 1399.97,
"shipping": 9.99,
"discount": 129.99,
"tax": 268.19,
"grand_total": 1547.16
},
"metadata": {
"created_at": ISODate("2024-03-15T09:30:00Z"),
"updated_at": ISODate("2024-03-18T10:15:00Z"),
"source": "website",
"ip_address": "192.168.1.100",
"user_agent": "Mozilla/5.0..."
}
}
// Indexes voor optimalisatie
db.customers.createIndex({ "email": 1 }, { unique: true });
db.customers.createIndex({ "customer_id": 1 }, { unique: true });
db.customers.createIndex({ "name.last": 1, "name.first": 1 });
db.customers.createIndex({ "addresses.city": 1 });
db.customers.createIndex({ "stats.total_spent": -1 });
db.orders.createIndex({ "order_id": 1 }, { unique: true });
db.orders.createIndex({ "customer.customer_id": 1 });
db.orders.createIndex({ "order_date": -1 });
db.orders.createIndex({ "status": 1, "order_date": -1 });
db.orders.createIndex({ "items.product_id": 1 });
db.orders.createIndex({ "totals.grand_total": -1 });
10. Data Modeling Best Practices
Essentiële Richtlijnen
Volg deze best practices voor effectief en schaalbaar data modeling.
Business Focus
- Start met business requirements
- Betrek stakeholders vroeg
- Model vanuit business processen
- Documenteer business rules
Schaalbaarheid
- Design voor toekomstige groei
- Overweeg partitioning vroeg
- Houd performance in gedachten
- Plan voor data volume groei
Data Integriteit
- Implementeer constraints
- Gebruik foreign keys
- Valideer data types
- Handel null waarden consistent
Documentatie
- Documenteer assumpties
- Behoud model versies
- Documenteer wijzigingen
- Maak ERD diagrammen
Praktische Tips
-- TIP 1: Gebruik consistente naming conventions
-- Tabellen: meervoud, snake_case
-- Kolommen: snake_case
-- Primary keys: tablename_id
-- Foreign keys: referencedtable_id
-- TIP 2: Implementeer audit columns
ALTER TABLE customers
ADD COLUMN created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN created_by VARCHAR(50),
ADD COLUMN updated_by VARCHAR(50);
-- TIP 3: Gebruik soft delete i.p.v. hard delete
ALTER TABLE products
ADD COLUMN is_deleted BOOLEAN DEFAULT FALSE,
ADD COLUMN deleted_at TIMESTAMPTZ,
ADD COLUMN deleted_by VARCHAR(50);
-- TIP 4: Normaliseer eerst, denormaliseer later
-- 1. Begin met genormaliseerd model (3NF)
-- 2. Meet performance
-- 3. Denormaliseer alleen waar nodig
-- 4. Documenteer denormalisatie beslissingen
-- TIP 5: Gebruik database features
-- • Generated columns voor berekende waarden
-- • JSON columns voor flexibele data
-- • Array types voor lists
-- • Full-text search voor text queries
11. Tools en Frameworks
Populaire Data Modeling Tools
Selecteer de juiste tools voor jouw data modeling behoeften.
Enterprise Tools
- ER/Studio: Comprehensive data architecture
- IBM InfoSphere Data Architect: Enterprise modeling
- Oracle SQL Developer Data Modeler: Oracle ecosystem
- SAP PowerDesigner: Business process modeling
Open Source Tools
- pgModeler: PostgreSQL specific modeling
- MySQL Workbench: MySQL visual design
- DBeaver: Universal database tool
- DbSchema: Visual database designer
Cloud Tools
- Lucidchart: Cloud-based diagramming
- Draw.io: Free diagram tool
- dbdiagram.io: Database diagram as code
- QuickDBD: Text to diagram tool
Code-First Tools
- SQLAlchemy: Python ORM
- Entity Framework: .NET ORM
- Prisma: Node.js ORM
- Django ORM: Django framework
Conclusie
Data modeling is een kritische discipline die de basis legt voor succesvolle data-driven applicaties. Door de principes van conceptual, logical, en physical modeling te volgen, samen met gepaste normalisatie en denormalisatie technieken, kunnen ontwikkelaars schaalbare, performante en onderhoudbare databases ontwerpen.
Key Takeaways:
- Start met business requirements en werk van conceptueel naar fysiek model
- Normaliseer tot 3NF voor transactionele systemen, denormaliseer voor analytics
- Kies het juiste model (relational, dimensional, document) voor je use case
- Documenteer je modellen en behoud versie controle
- Test je modellen met realistische data volumes en query patterns
Klaar om je data modeling skills toe te passen?
Vind data modeling vacatures of plaats je eigen vacature voor €25