Database Design: Complete Gids voor Effectieve Data Structuur
Leer de fundamenten en geavanceerde technieken van professioneel database design. Van normalisatie principes tot performance optimalisatie en moderne database architecturen.
Zoek je Database Experts?
Vind ervaren Database Architects en SQL Developers voor je database projecten
Inhoudsopgave
- Inleiding tot Database Design
- Normalisatie: 1NF tot 5NF
- Entity-Relationship Modeling
- Data Types en Constraints
- Indexering Strategieën
- Database Performance Optimalisatie
- Schaalbaarheid en High Availability
- NoSQL Database Design
- Database Security Best Practices
- Database Migration Strategieën
- Database Design Tools
- Praktijk Case Studies
1. Inleiding tot Database Design
Wat is Database Design?
Database design is het proces van het ontwerpen van de structuur, organisatie en relaties van data in een database systeem. Een goed database design is essentieel voor performance, schaalbaarheid, onderhoudbaarheid en data integriteit.
Data Structuur
Definieer tabellen, kolommen, data types en relaties tussen entiteiten
Performance
Optimaliseer voor snelle query execution en efficiënt data retrieval
Data Integriteit
Zorg voor accurate, consistente en betrouwbare data via constraints
Schaalbearheid
Ontwerp voor groei in data volume en gebruikerstoegang
| Database Type | Use Case | Voordelen | Nadelen | Voorbeelden |
|---|---|---|---|---|
| Relationeel (RDBMS) | Structured data, complexe queries, ACID compliance | Data integriteit, mature ecosystem, SQL standaard | Verticale schaalbaarheid, schema rigiditeit | PostgreSQL, MySQL, SQL Server |
| Document | Flexibele schema's, semi-structured data | Flexibiliteit, horizontale schaalbaarheid, JSON support | Geen joins, beperkte ACID | MongoDB, CouchDB |
| Key-Value | Caching, sessie opslag, real-time data | Extreme performance, eenvoudig model | Beperkte query mogelijkheden | Redis, DynamoDB |
| Column-Family | Big data, time-series, IoT data | Horizontale schaalbaarheid, snelle writes | Complexe data modeling | Cassandra, HBase |
| Graph | Social networks, recommendation engines | Snelle relationship queries, flexibel schema | Niet optimaal voor tabulaire data | Neo4j, Amazon Neptune |
2. Normalisatie: 1NF tot 5NF
Database Normalisatie Principes
Normalisatie is het proces van het organiseren van data in een database om redundantie te verminderen en data integriteit te verbeteren. Het volgt een reeks normaalvormen van 1NF tot 5NF.
Praktijkvoorbeeld: Denormalized naar 3NF
-- VOORBEELD: E-commerce systeem - Denormalized tabel
-- Probleem: Data redundantie, update anomalies
CREATE TABLE denormalized_orders (
order_id INT PRIMARY KEY,
customer_id INT,
customer_name VARCHAR(100),
customer_email VARCHAR(255),
customer_phone VARCHAR(50),
customer_address VARCHAR(500),
product_id INT,
product_name VARCHAR(200),
product_category VARCHAR(100),
product_price DECIMAL(10, 2),
quantity INT,
order_date DATE,
order_status VARCHAR(50),
shipping_method VARCHAR(100),
shipping_cost DECIMAL(8, 2),
tax_amount DECIMAL(10, 2),
total_amount DECIMAL(12, 2)
);
-- 1NF (Eerste Normaalvorm): Atomische waarden
-- Oplossing: Split multivalue attributes, verwijder repeating groups
CREATE TABLE orders_1nf (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
order_status VARCHAR(50),
shipping_method VARCHAR(100),
shipping_cost DECIMAL(8, 2),
tax_amount DECIMAL(10, 2),
total_amount DECIMAL(12, 2)
);
CREATE TABLE order_items_1nf (
order_item_id INT PRIMARY KEY,
order_id INT,
product_id INT,
quantity INT,
unit_price DECIMAL(10, 2),
FOREIGN KEY (order_id) REFERENCES orders_1nf(order_id)
);
-- 2NF (Tweede Normaalvorm): Geen gedeeltelijke afhankelijkheden
-- Oplossing: Verwijder afhankelijkheden van deel van primary key
CREATE TABLE products_2nf (
product_id INT PRIMARY KEY,
product_name VARCHAR(200),
product_category VARCHAR(100),
standard_price DECIMAL(10, 2)
);
CREATE TABLE order_items_2nf (
order_item_id INT PRIMARY KEY,
order_id INT,
product_id INT,
quantity INT,
sold_price DECIMAL(10, 2), -- Prijs op moment van verkoop
FOREIGN KEY (order_id) REFERENCES orders_1nf(order_id),
FOREIGN KEY (product_id) REFERENCES products_2nf(product_id)
);
-- 3NF (Derde Normaalvorm): Geen transitieve afhankelijkheden
-- Oplossing: Verwijder niet-sleutel kolommen die van andere niet-sleutel kolommen afhangen
CREATE TABLE customers_3nf (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
customer_email VARCHAR(255),
customer_phone VARCHAR(50)
);
CREATE TABLE customer_addresses_3nf (
address_id INT PRIMARY KEY,
customer_id INT,
address_type VARCHAR(20), -- 'billing', 'shipping'
street_address VARCHAR(255),
city VARCHAR(100),
postal_code VARCHAR(20),
country VARCHAR(100),
FOREIGN KEY (customer_id) REFERENCES customers_3nf(customer_id)
);
CREATE TABLE orders_3nf (
order_id INT PRIMARY KEY,
customer_id INT,
billing_address_id INT,
shipping_address_id INT,
order_date DATE,
order_status VARCHAR(50),
shipping_method VARCHAR(100),
shipping_cost DECIMAL(8, 2),
tax_amount DECIMAL(10, 2),
total_amount DECIMAL(12, 2),
FOREIGN KEY (customer_id) REFERENCES customers_3nf(customer_id),
FOREIGN KEY (billing_address_id) REFERENCES customer_addresses_3nf(address_id),
FOREIGN KEY (shipping_address_id) REFERENCES customer_addresses_3nf(address_id)
);
-- BCNF (Boyce-Codd Normaalvorm): Sterkere variant van 3NF
CREATE TABLE product_categories_bcnf (
category_id INT PRIMARY KEY,
category_name VARCHAR(100) UNIQUE,
parent_category_id INT,
FOREIGN KEY (parent_category_id) REFERENCES product_categories_bcnf(category_id)
);
CREATE TABLE products_bcnf (
product_id INT PRIMARY KEY,
product_name VARCHAR(200) UNIQUE,
category_id INT,
standard_price DECIMAL(10, 2),
FOREIGN KEY (category_id) REFERENCES product_categories_bcnf(category_id)
);
Advanced Normalisatie: 4NF en 5NF
-- 4NF (Vierde Normaalvorm): Geen multivalued dependencies
-- Voorbeeld: Studenten, cursussen en boeken
-- Probleem: Multivalued dependency in één tabel
CREATE TABLE student_course_book (
student_id INT,
course_id INT,
book_isbn VARCHAR(20),
PRIMARY KEY (student_id, course_id, book_isbn)
);
-- In deze tabel hebben we twee onafhankelijke multivalued dependencies:
-- 1. student_id →→ course_id
-- 2. student_id →→ book_isbn
-- Oplossing: Split in twee tabellen
CREATE TABLE student_courses_4nf (
student_id INT,
course_id INT,
PRIMARY KEY (student_id, course_id)
);
CREATE TABLE student_books_4nf (
student_id INT,
book_isbn VARCHAR(20),
PRIMARY KEY (student_id, book_isbn)
);
CREATE TABLE course_books_4nf (
course_id INT,
book_isbn VARCHAR(20),
PRIMARY KEY (course_id, book_isbn)
);
-- 5NF (Vijfde Normaalvorm): Projectie-join normaalvorm
-- Voorbeeld: Leveranciers, onderdelen en projecten
CREATE TABLE supplier_part_project_5nf (
supplier_id INT,
part_id INT,
project_id INT,
quantity INT,
PRIMARY KEY (supplier_id, part_id, project_id)
);
-- Deze tabel kan worden gereconstrueerd door joins van drie binaire tabellen
CREATE TABLE supplier_parts_5nf (
supplier_id INT,
part_id INT,
PRIMARY KEY (supplier_id, part_id)
);
CREATE TABLE supplier_projects_5nf (
supplier_id INT,
project_id INT,
PRIMARY KEY (supplier_id, project_id)
);
CREATE TABLE part_projects_5nf (
part_id INT,
project_id INT,
PRIMARY KEY (part_id, project_id)
);
-- De originele tabel kan worden gereconstrueerd met:
SELECT sp.supplier_id, sp.part_id, spr.project_id
FROM supplier_parts_5nf sp
JOIN supplier_projects_5nf spr ON sp.supplier_id = spr.supplier_id
JOIN part_projects_5nf pp ON sp.part_id = pp.part_id
AND spr.project_id = pp.project_id;
3. Entity-Relationship Modeling
ERD Design Principes
Entity-Relationship Modeling (ERD) is een visuele techniek voor het ontwerpen van database structuren die entiteiten, hun attributen en de relaties daartussen weergeeft.
Basic ERD Concepts en Implementatie
-- ENTITIES: Kern objecten in het systeem
-- Customer Entity
CREATE TABLE customer (
customer_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
phone VARCHAR(20),
date_of_birth DATE,
registration_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- Check constraints voor data integriteit
CONSTRAINT chk_email_format
CHECK (email LIKE '%@%.%'),
CONSTRAINT chk_age_adult
CHECK (
date_of_birth IS NULL
OR EXTRACT(YEAR FROM AGE(date_of_birth)) >= 18
)
);
-- Product Entity
CREATE TABLE product (
product_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
product_code VARCHAR(50) UNIQUE NOT NULL,
product_name VARCHAR(200) NOT NULL,
description TEXT,
category_id INT NOT NULL,
unit_price DECIMAL(10, 2) NOT NULL,
cost_price DECIMAL(10, 2) NOT NULL,
stock_quantity INT DEFAULT 0,
reorder_level INT DEFAULT 10,
is_active BOOLEAN DEFAULT TRUE,
created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
modified_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT chk_positive_price
CHECK (unit_price > 0 AND cost_price > 0),
CONSTRAINT chk_stock_nonnegative
CHECK (stock_quantity >= 0)
);
-- RELATIONSHIPS: Verbindingen tussen entiteiten
-- One-to-Many: Customer → Orders
CREATE TABLE "order" ( -- order is een gereserveerd woord, dus quotes
order_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id INT NOT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
required_date DATE,
shipped_date DATE,
order_status VARCHAR(20) DEFAULT 'pending',
comments TEXT,
CONSTRAINT fk_order_customer
FOREIGN KEY (customer_id)
REFERENCES customer(customer_id)
ON DELETE RESTRICT,
CONSTRAINT chk_order_dates
CHECK (
shipped_date IS NULL
OR shipped_date >= order_date::DATE
),
CONSTRAINT chk_valid_status
CHECK (order_status IN (
'pending', 'processing', 'shipped',
'delivered', 'cancelled'
))
);
-- Many-to-Many: Orders ↔ Products via Order_Items
CREATE TABLE order_item (
order_item_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
unit_price DECIMAL(10, 2) NOT NULL,
discount DECIMAL(4, 2) DEFAULT 0,
CONSTRAINT fk_order_item_order
FOREIGN KEY (order_id)
REFERENCES "order"(order_id)
ON DELETE CASCADE,
CONSTRAINT fk_order_item_product
FOREIGN KEY (product_id)
REFERENCES product(product_id)
ON DELETE RESTRICT,
CONSTRAINT chk_positive_quantity
CHECK (quantity > 0),
CONSTRAINT chk_valid_discount
CHECK (discount >= 0 AND discount <= 1),
CONSTRAINT uq_order_product
UNIQUE (order_id, product_id)
);
-- One-to-One: Customer → Customer_Profile (extensie tabel)
CREATE TABLE customer_profile (
customer_id INT PRIMARY KEY,
preferences JSONB,
newsletter_opt_in BOOLEAN DEFAULT FALSE,
last_login TIMESTAMP,
login_count INT DEFAULT 0,
profile_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_customer_profile_customer
FOREIGN KEY (customer_id)
REFERENCES customer(customer_id)
ON DELETE CASCADE
);
-- WEAK ENTITIES: Bestaan alleen in relatie tot een sterke entiteit
CREATE TABLE address (
address_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id INT NOT NULL,
address_type VARCHAR(20) NOT NULL,
street VARCHAR(255) NOT NULL,
city VARCHAR(100) NOT NULL,
state VARCHAR(100),
postal_code VARCHAR(20) NOT NULL,
country VARCHAR(100) NOT NULL,
is_primary BOOLEAN DEFAULT FALSE,
CONSTRAINT fk_address_customer
FOREIGN KEY (customer_id)
REFERENCES customer(customer_id)
ON DELETE CASCADE,
CONSTRAINT chk_valid_address_type
CHECK (address_type IN (
'billing', 'shipping', 'home', 'work'
))
);
Database Experts Nodig?
Vind ervaren Database Architects en SQL Developers voor je database projecten
4. Data Types en Constraints
Database Constraints en Validatie
Data types en constraints zorgen voor data integriteit, performance optimalisatie en consistentie in database systemen.
Comprehensive Data Types en Constraints
-- GEAVANCEERDE DATA TYPES EN CONSTRAINTS VOOR POSTGRESQL
CREATE TABLE advanced_data_types (
-- Numerieke types met precisie
id BIGSERIAL PRIMARY KEY, -- Auto-increment 64-bit
small_number SMALLINT, -- -32,768 to 32,767
integer_number INT, -- -2,147,483,648 to 2,147,483,647
big_number BIGINT, -- -9.22e18 to 9.22e18
-- Decimal types voor financiële data
price DECIMAL(10, 2) NOT NULL DEFAULT 0.00,
tax_rate NUMERIC(5, 4) DEFAULT 0.21, -- 21% BTW
-- String types
short_text VARCHAR(50),
long_text TEXT,
fixed_length CHAR(10), -- Altijd 10 karakters
email VARCHAR(255),
-- Date/Time types
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
event_date DATE,
event_time TIME,
event_timestamp TIMESTAMP,
-- Speciale types
is_active BOOLEAN DEFAULT TRUE,
status VARCHAR(20),
tags VARCHAR(100)[], -- Array van strings
metadata JSONB, -- Binary JSON (geïndexeerd)
preferences JSON, -- Plain JSON
ip_address INET, -- IP address type
mac_address MACADDR, -- MAC address type
-- UUID voor distributed systems
uuid_col UUID DEFAULT gen_random_uuid(),
-- Enumerated type
priority PRIORITY_LEVEL, -- Zie CREATE TYPE hieronder
-- Spatial data (PostGIS extension)
location GEOGRAPHY(POINT, 4326),
-- Range types
date_range DATERANGE,
numeric_range NUMRANGE
);
-- Create custom enumerated type
CREATE TYPE priority_level AS ENUM (
'low',
'medium',
'high',
'critical'
);
-- CREATE TABLE met uitgebreide constraints
CREATE TABLE employee (
employee_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
employee_code VARCHAR(20) UNIQUE NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
phone VARCHAR(20),
date_of_birth DATE NOT NULL,
hire_date DATE NOT NULL,
salary DECIMAL(10, 2) NOT NULL,
department_id INT,
manager_id INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- CHECK constraints voor complexe validatie
CONSTRAINT chk_valid_email
CHECK (
email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z]{2,}$'
),
CONSTRAINT chk_valid_phone
CHECK (
phone IS NULL
OR phone ~ '^\+?[1-9]\d{1,14}$' -- E.164 format
),
CONSTRAINT chk_age_requirement
CHECK (
EXTRACT(YEAR FROM AGE(date_of_birth)) >= 18
),
CONSTRAINT chk_hire_date
CHECK (hire_date >= date_of_birth),
CONSTRAINT chk_salary_range
CHECK (
salary BETWEEN 25000 AND 250000
),
CONSTRAINT chk_manager_hierarchy
CHECK (manager_id != employee_id), -- Kan niet je eigen manager zijn
-- FOREIGN KEY constraints
CONSTRAINT fk_employee_department
FOREIGN KEY (department_id)
REFERENCES department(department_id)
ON DELETE SET NULL,
CONSTRAINT fk_employee_manager
FOREIGN KEY (manager_id)
REFERENCES employee(employee_id)
ON DELETE SET NULL,
-- COMPOSITE UNIQUE constraint
CONSTRAINT uq_employee_name_dob
UNIQUE (first_name, last_name, date_of_birth)
);
-- DOMAIN creation voor herbruikbare validatie
CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (
VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z]{2,}$'
);
CREATE DOMAIN phone_number AS VARCHAR(20)
CHECK (
VALUE ~ '^\+?[1-9]\d{1,14}$'
);
CREATE DOMAIN positive_amount AS DECIMAL(10, 2)
CHECK (VALUE > 0);
CREATE DOMAIN percentage AS DECIMAL(5, 4)
CHECK (VALUE BETWEEN 0 AND 1);
-- TABLE using domains
CREATE TABLE customer_contact (
contact_id SERIAL PRIMARY KEY,
email email_address NOT NULL,
phone phone_number,
preferred_contact_method VARCHAR(10) DEFAULT 'email'
);
-- TABLE met EXCLUDE constraints voor complexe regels
CREATE TABLE room_booking (
booking_id SERIAL PRIMARY KEY,
room_id INT NOT NULL,
booking_date DATE NOT NULL,
time_slot TSRANGE NOT NULL,
-- Voorkom overlappende boekingen voor dezelfde kamer
EXCLUDE USING gist (
room_id WITH =,
time_slot WITH &&
),
CONSTRAINT chk_valid_time_slot
CHECK (
EXTRACT(HOUR FROM lower(time_slot)) BETWEEN 8 AND 18
AND EXTRACT(HOUR FROM upper(time_slot)) BETWEEN 9 AND 19
)
);
5. Indexering Strategieën
Database Index Optimalisatie
Indexering is een kritieke techniek voor het verbeteren van query performance door snelle data retrieval mogelijk te maken.
Comprehensive Indexing Strategies
-- COMPREHENSIVE INDEXING STRATEGIES VOOR POSTGRESQL
-- B-tree Index (standaard voor meeste use cases)
CREATE INDEX idx_customer_email
ON customer(email);
-- Unique index (automatisch gemaakt voor PRIMARY KEY en UNIQUE constraints)
CREATE UNIQUE INDEX idx_unique_customer_email
ON customer(LOWER(email)); -- Case-insensitive
-- Composite index voor multi-column queries
CREATE INDEX idx_customer_name_location
ON customer(last_name, first_name, city);
-- Covering index (INCLUDE columns voor index-only scans)
CREATE INDEX idx_order_customer_covering
ON "order" (customer_id, order_date)
INCLUDE (total_amount, order_status);
-- Partial index voor gefilterde subsets
CREATE INDEX idx_active_customers
ON customer(email)
WHERE is_active = TRUE
AND last_login >= CURRENT_DATE - INTERVAL '90 days';
-- Functional index op berekende waarden
CREATE INDEX idx_customer_name_lower
ON customer(LOWER(first_name), LOWER(last_name));
CREATE INDEX idx_order_year_month
ON "order" (EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date));
-- GiST (Generalized Search Tree) index voor complexe data types
CREATE INDEX idx_customer_location
ON customer USING GIST (location); -- Voor PostGIS geography/geometry
CREATE INDEX idx_product_tags
ON product USING GIN (tags); -- Voor array columns
CREATE INDEX idx_customer_metadata
ON customer USING GIN (metadata); -- Voor JSONB columns
-- BRIN (Block Range INdex) voor grote, chronologische tabellen
CREATE INDEX idx_order_date_brin
ON "order" USING BRIN (order_date);
-- Hash index voor exact match queries (PostgreSQL 10+)
CREATE INDEX idx_customer_id_hash
ON customer USING HASH (customer_id);
-- MULTI-COLUMN STRATEGY VOOR COMPLEX QUERY PATTERNS
-- Voorbeeld: E-commerce order systeem
CREATE TABLE order_index_demo (
order_id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL,
order_date TIMESTAMPTZ NOT NULL,
status VARCHAR(20) NOT NULL,
total_amount DECIMAL(12, 2) NOT NULL,
shipping_country VARCHAR(100),
payment_method VARCHAR(50)
);
-- Strategie 1: Query patterns analyseren
-- Veelgebruikte query: Orders per customer met status filter
CREATE INDEX idx_order_customer_status
ON order_index_demo (customer_id, status, order_date DESC);
-- Veelgebruikte query: Dagelijkse sales rapportage
CREATE INDEX idx_order_date_status_amount
ON order_index_demo (date_trunc('day', order_date), status)
INCLUDE (total_amount, shipping_country);
-- Strategie 2: Partial indexes voor hot data
CREATE INDEX idx_recent_pending_orders
ON order_index_demo (customer_id, order_date)
WHERE status = 'pending'
AND order_date >= CURRENT_DATE - INTERVAL '7 days';
CREATE INDEX idx_high_value_orders
ON order_index_demo (customer_id, order_date DESC)
WHERE total_amount > 1000;
-- Strategie 3: Expression indexes voor veelgebruikte filters
CREATE INDEX idx_order_year_week
ON order_index_demo (EXTRACT(YEAR FROM order_date), EXTRACT(WEEK FROM order_date));
CREATE INDEX idx_lower_shipping_country
ON order_index_demo (LOWER(shipping_country));
-- Strategie 4: Concurrent index creation voor production databases
CREATE INDEX CONCURRENTLY idx_order_payment_method
ON order_index_demo (payment_method);
-- INDEX MAINTENANCE EN MONITORING
-- Check index usage statistieken
SELECT
schemaname,
tablename,
indexname,
idx_scan AS index_scans,
idx_tup_read AS tuples_read,
idx_tup_fetch AS tuples_fetched,
pg_size_pretty(pg_relation_size(indexname::regclass)) AS index_size
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0 -- Ongebruikte indexes
AND NOT indisunique -- Skip unique indexes
AND NOT indisprimary -- Skip primary key indexes
ORDER BY pg_relation_size(indexname::regclass) DESC;
-- Find duplicate indexes
SELECT
pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS total_size,
(array_agg(idx))[1] AS idx1,
(array_agg(idx))[2] AS idx2,
(array_agg(idx))[3] AS idx3
FROM (
SELECT
indexrelid::regclass AS idx,
indrelid::regclass AS table_name,
array_agg(attname ORDER BY attnum) AS index_attrs
FROM pg_index
JOIN pg_attribute
ON indrelid = attrelid
AND attnum = ANY(indkey)
WHERE indrelid::regclass::TEXT = 'order_index_demo'
GROUP BY indexrelid, indrelid
) sub
GROUP BY table_name, index_attrs
HAVING COUNT(*) > 1;
-- Index bloat detection
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexname::regclass)) AS index_size,
idx_scan AS index_scans,
idx_tup_read AS tuples_read,
idx_tup_fetch AS tuples_fetched
FROM pg_stat_user_indexes
WHERE idx_scan < 50 -- Weinig gebruikte indexes
AND pg_relation_size(indexname::regclass) > 100000000 -- > 100MB
ORDER BY pg_relation_size(indexname::regclass) DESC;
-- REINDEX voor index maintenance
REINDEX INDEX CONCURRENTLY idx_order_customer_status;
6. Database Performance Optimalisatie
Performance Tuning Technieken
Database performance optimalisatie omvat query tuning, index strategieën, hardware optimalisatie en database configuratie.
Comprehensive Performance Optimization
-- DATABASE PERFORMANCE OPTIMALISATIE TECHNIQUEN
-- 1. QUERY OPTIMALISATIE
-- Gebruik EXPLAIN ANALYZE voor query analysis
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
c.customer_id,
c.first_name,
c.last_name,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent,
AVG(o.total_amount) AS avg_order_value
FROM customer c
JOIN "order" o ON c.customer_id = o.customer_id
WHERE c.registration_date >= '2024-01-01'
AND o.order_status = 'completed'
AND o.total_amount > 100
GROUP BY c.customer_id, c.first_name, c.last_name
HAVING COUNT(o.order_id) >= 5
ORDER BY total_spent DESC
LIMIT 100;
-- 2. QUERY REWRITING VOOR PERFORMANCE
-- Originele inefficiente query
SELECT DISTINCT c.*
FROM customer c
WHERE c.customer_id IN (
SELECT o.customer_id
FROM "order" o
WHERE o.total_amount > 1000
AND o.order_date >= CURRENT_DATE - INTERVAL '30 days'
);
-- Geoptimaliseerde versie met EXISTS
SELECT c.*
FROM customer c
WHERE EXISTS (
SELECT 1
FROM "order" o
WHERE o.customer_id = c.customer_id
AND o.total_amount > 1000
AND o.order_date >= CURRENT_DATE - INTERVAL '30 days'
);
-- 3. MATERIALIZED VIEWS VOOR COMPLEXE AGGREGATIES
CREATE MATERIALIZED VIEW mv_daily_sales_summary
AS
SELECT
DATE_TRUNC('day', o.order_date) AS sales_day,
c.country,
p.category_id,
COUNT(DISTINCT o.order_id) AS order_count,
COUNT(oi.order_item_id) AS item_count,
SUM(oi.quantity) AS total_quantity,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
AVG(oi.quantity * oi.unit_price) AS avg_order_value
FROM "order" o
JOIN customer c ON o.customer_id = c.customer_id
JOIN order_item oi ON o.order_id = oi.order_id
JOIN product p ON oi.product_id = p.product_id
WHERE o.order_status = 'completed'
GROUP BY DATE_TRUNC('day', o.order_date), c.country, p.category_id
WITH DATA;
-- Index op materialized view
CREATE UNIQUE INDEX idx_mv_daily_sales
ON mv_daily_sales_summary (sales_day, country, category_id);
-- Refresh strategie
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales_summary;
-- 4. PARTITIONERING VOOR GROTE TABELLEN
-- Range partitionering op order_date
CREATE TABLE order_partitioned (
LIKE "order" INCLUDING ALL
) PARTITION BY RANGE (order_date);
-- Partities per kwartaal
CREATE TABLE orders_2024_q1
PARTITION OF order_partitioned
FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE orders_2024_q2
PARTITION OF order_partitioned
FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
-- Index per partition
CREATE INDEX idx_orders_2024_q1_customer
ON orders_2024_q1 (customer_id, order_date);
-- 5. CONNECTION POOLING CONFIGURATIE
-- pgBouncer configuratie voorbeeld
/*
[databases]
mydb = host=localhost dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
*/
-- 6. VACUUM EN ANALYZE OPTIMALISATIE
VACUUM (VERBOSE, ANALYZE) "order";
-- Auto-vacuum configuratie
ALTER TABLE "order" SET (
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_scale_factor = 0.02,
autovacuum_vacuum_cost_delay = 10
);
-- 7. QUERY CACHING STRATEGIEËN
-- Prepared statements voor herhaalde queries
PREPARE get_customer_orders (INT, DATE) AS
SELECT *
FROM "order"
WHERE customer_id = $1
AND order_date >= $2
AND order_status = 'completed';
-- Gebruik prepared statement
EXECUTE get_customer_orders(12345, '2024-01-01');
-- 8. DATABASE CONFIGURATIE OPTIMALISATIE
-- PostgreSQL postgresql.conf optimalisaties
/*
shared_buffers = 4GB # 25% van RAM voor dedicated server
effective_cache_size = 12GB # 75% van RAM
work_mem = 64MB # Per operation memory
maintenance_work_mem = 1GB # Voor VACUUM, CREATE INDEX, etc.
max_parallel_workers_per_gather = 4 # Parallel query execution
max_worker_processes = 8 # Aantal background processes
wal_buffers = 16MB # Write Ahead Log buffers
checkpoint_timeout = 15min # Checkpoint interval
max_wal_size = 4GB # Maximale WAL size
min_wal_size = 1GB # Minimale WAL size
*/
-- 9. MONITORING EN ALERTING
-- Slow query logging
ALTER SYSTEM SET log_min_duration_statement = '1000'; -- Log queries > 1s
-- Query monitoring view
CREATE VIEW active_queries AS
SELECT
pid,
usename,
application_name,
client_addr,
state,
query_start,
now() - query_start AS duration,
query
FROM pg_stat_activity
WHERE state != 'idle'
AND query IS NOT NULL
ORDER BY duration DESC;
Klaar voor Database Optimalisatie?
Vind de juiste database experts of plaats je database vacature
7. Schaalbaarheid en High Availability
Scalable Database Architecture
Schaalbaarheid en high availability zijn essentieel voor moderne database systemen die groeiende workloads en zero-downtime requirements moeten ondersteunen.
Horizontale Schaling
Sharding, partitioning en distributed databases voor lineaire schaalbaarheid
- Sharding: Data distributie over meerdere servers
- Partitionering: Data splitsing binnen één server
- Read Replicas: Scalable read performance
Verticale Schaling
Hardware upgrades voor individuele server performance
- CPU/Memory: Meer cores en RAM
- Storage: SSD/ NVMe voor IOPS
- Network: Hogere bandbreedte
High Availability
Zero-downtime architecturen voor kritieke systemen
- Replication: Synchrone/asynchrone data kopieën
- Failover: Automatische recovery
- Backup: Point-in-time recovery
Load Balancing
Workload distributie voor optimale performance
- Connection Pooling: Efficient resource gebruik
- Query Routing: Intelligente request distributie
- Caching: Reduced database load
8. NoSQL Database Design
NoSQL Data Modeling
NoSQL database design vereist verschillende patronen en technieken dan relationele databases, met focus op schaalbaarheid en flexibiliteit.
MongoDB Document Database Design
// MONGODB DOCUMENT DATABASE DESIGN PATTERNS
// 1. EMBEDDED DOCUMENT PATTERN - Voor één-op-veel relaties met gedeelde lifecycle
// Product met embedded reviews
{
"_id": ObjectId("5f8d0d55b54764421b7156c5"),
"product_code": "PROD-001",
"name": "Wireless Headphones",
"price": 129.99,
"category": "Electronics",
"specifications": {
"brand": "AudioTech",
"battery_life": "30 hours",
"connectivity": ["Bluetooth 5.0", "3.5mm jack"],
"noise_cancellation": true
},
"reviews": [ // Embedded array van reviews
{
"review_id": "REV-001",
"customer_id": "CUST-123",
"rating": 5,
"comment": "Excellent sound quality!",
"date": ISODate("2024-01-15"),
"verified_purchase": true
},
{
"review_id": "REV-002",
"customer_id": "CUST-456",
"rating": 4,
"comment": "Good battery life, comfortable fit",
"date": ISODate("2024-01-20")
}
],
"inventory": {
"stock": 45,
"warehouse": "AMS-01",
"last_restocked": ISODate("2024-01-10")
},
"metadata": {
"created_at": ISODate("2024-01-01"),
"updated_at": ISODate("2024-01-25"),
"version": 3
}
}
// 2. REFERENCE PATTERN - Voor veel-op-veel of grote datasets
// Order document met references
{
"_id": ObjectId("5f8d0d55b54764421b7156d1"),
"order_number": "ORD-2024-001234",
"customer_id": ObjectId("5f8d0d55b54764421b7156a1"), // Reference naar customer
"order_date": ISODate("2024-01-25T10:30:00Z"),
"status": "processing",
"shipping_address": {
"street": "Kalverstraat 123",
"city": "Amsterdam",
"postal_code": "1012 AB",
"country": "Netherlands"
},
"items": [ // Array van references met embedded quantity/price
{
"product_id": ObjectId("5f8d0d55b54764421b7156c5"),
"quantity": 1,
"unit_price": 129.99,
"discount": 0.1
},
{
"product_id": ObjectId("5f8d0d55b54764421b7156c6"),
"quantity": 2,
"unit_price": 24.99
}
],
"payment": {
"method": "credit_card",
"transaction_id": "TX-789012",
"amount": 179.97,
"currency": "EUR"
}
}
// 3. EXTENDED REFERENCE PATTERN - Met embedded meestgebruikte data
// Order document met extended reference naar product
{
"_id": ObjectId("5f8d0d55b54764421b7156d2"),
"order_number": "ORD-2024-001235",
"customer": { // Embedded customer snapshot
"customer_id": ObjectId("5f8d0d55b54764421b7156a2"),
"name": "Jan de Vries",
"email": "jan@example.com"
},
"items": [
{
"product_id": ObjectId("5f8d0d55b54764421b7156c5"),
"product_name": "Wireless Headphones", // Embedded voor display
"product_category": "Electronics", // Embedded voor filtering
"quantity": 1,
"unit_price": 129.99
}
]
}
// 4. BUCKET PATTERN - Voor time-series of IoT data
// Sensor data bucketed per uur
{
"_id": ObjectId("5f8d0d55b54764421b7156e1"),
"sensor_id": "TEMP-001",
"location": "Server Room A",
"bucket_start": ISODate("2024-01-25T00:00:00Z"),
"bucket_end": ISODate("2024-01-25T01:00:00Z"),
"measurements": [ // 60 metingen (1 per minuut)
{
"timestamp": ISODate("2024-01-25T00:00:00Z"),
"temperature": 22.5,
"humidity": 45.2
},
{
"timestamp": ISODate("2024-01-25T00:01:00Z"),
"temperature": 22.6,
"humidity": 45.1
}
// ... 58 more measurements
],
"summary": {
"avg_temperature": 22.7,
"max_temperature": 23.1,
"min_temperature": 22.3,
"total_measurements": 60
}
}
// 5. SCHEMA VERSIONING PATTERN - Voor evoluerende schema's
{
"_id": ObjectId("5f8d0d55b54764421b7156f1"),
"entity_type": "customer",
"schema_version": 2,
"data": {
"customer_id": "CUST-001",
"name": "Maria Sanchez",
"contact": { // Nieuwe structuur in v2
"email": "maria@example.com",
"phone": "+31612345678",
"preferences": {
"newsletter": true,
"language": "nl"
}
}
},
"metadata": {
"created_at": ISODate("2024-01-01"),
"updated_at": ISODate("2024-01-25"),
"migrated_from_version": 1
}
}
// 6. COMPOUND KEY PATTERN - Voor natuurlijke primary keys
{
"_id": {
"tenant_id": "TENANT-A",
"user_id": "USER-001"
},
"email": "user001@tenant-a.com",
"role": "admin",
"settings": {
"theme": "dark",
"notifications": true
}
}
// MONGODB INDEXING STRATEGIEËN
// Single field index
db.products.createIndex({ "product_code": 1 });
// Compound index
db.orders.createIndex({
"customer_id": 1,
"order_date": -1
});
// Multikey index voor arrays
db.products.createIndex({ "specifications.connectivity": 1 });
// Text index voor full-text search
db.products.createIndex({
"name": "text",
"description": "text"
});
// Geospatial index
db.stores.createIndex({ "location": "2dsphere" });
// Partial index
db.customers.createIndex(
{ "email": 1 },
{ "partialFilterExpression": { "is_active": true } }
);
// TTL index voor automatische data expiration
db.sessions.createIndex(
{ "last_accessed": 1 },
{ "expireAfterSeconds": 86400 } // 24 uur
);
9. Database Security Best Practices
Database Security Framework
Database security omvat authenticatie, autorisatie, encryptie, auditing en compliance maatregelen om data te beschermen tegen onbevoegde toegang.
Access Control
- Role-Based Access Control (RBAC)
- Principle of Least Privilege
- Multi-Factor Authentication
- Network Segmentation
Data Encryption
- Encryption at Rest (TDE)
- Encryption in Transit (SSL/TLS)
- Column-level Encryption
- Key Management
Auditing & Monitoring
- Database Activity Monitoring
- Change Data Capture
- SIEM Integration
- Compliance Reporting
Vulnerability Management
- Regular Security Patching
- Database Hardening
- SQL Injection Prevention
- Penetration Testing
10. Database Migration Strategieën
Migration Planning en Execution
Database migration vereist zorgvuldige planning, testing en execution om data integriteit en business continuity te garanderen tijdens platform veranderingen.
Database Migration Methodologie
-- DATABASE MIGRATION STRATEGIEËN EN BEST PRACTICES
-- 1. MIGRATIE METHODOLOGIEËN
-- Big Bang Migration: Alles in één keer
CREATE DATABASE new_production
WITH TEMPLATE old_production;
-- Phased Migration: Gefaseerde overgang
-- Stap 1: Migreer referentie tabellen
INSERT INTO new_schema.countries
SELECT * FROM old_schema.countries;
-- Stap 2: Migreer master data
INSERT INTO new_schema.customers
SELECT * FROM old_schema.customers;
-- Stap 3: Migreer transactionele data
INSERT INTO new_schema.orders
SELECT * FROM old_schema.orders
WHERE order_date >= '2024-01-01';
-- 2. DATA VALIDATION NA MIGRATIE
-- Record count validation
SELECT
'customers' AS table_name,
(SELECT COUNT(*) FROM old_schema.customers) AS old_count,
(SELECT COUNT(*) FROM new_schema.customers) AS new_count,
CASE
WHEN (SELECT COUNT(*) FROM old_schema.customers) =
(SELECT COUNT(*) FROM new_schema.customers)
THEN 'OK'
ELSE 'MISMATCH'
END AS validation_status
UNION ALL
SELECT
'orders',
(SELECT COUNT(*) FROM old_schema.orders),
(SELECT COUNT(*) FROM new_schema.orders),
CASE
WHEN (SELECT COUNT(*) FROM old_schema.orders) =
(SELECT COUNT(*) FROM new_schema.orders)
THEN 'OK'
ELSE 'MISMATCH'
END;
-- Data integrity validation
SELECT
'Data Integrity Check' AS check_type,
COUNT(*) AS mismatches
FROM (
SELECT customer_id, SUM(total_amount) AS old_total
FROM old_schema.orders
GROUP BY customer_id
) old_totals
FULL OUTER JOIN (
SELECT customer_id, SUM(total_amount) AS new_total
FROM new_schema.orders
GROUP BY customer_id
) new_totals ON old_totals.customer_id = new_totals.customer_id
WHERE old_totals.old_total != new_totals.new_total
OR (old_totals.customer_id IS NULL AND new_totals.customer_id IS NOT NULL)
OR (old_totals.customer_id IS NOT NULL AND new_totals.customer_id IS NULL);
-- 3. MIGRATIE TOOLING EN AUTOMATISERING
-- Flyway migration script voorbeeld
-- V1__initial_schema.sql
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);
-- V2__add_order_table.sql
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
order_date DATE NOT NULL
);
-- 4. ROLLBACK STRATEGIEËN
-- Database backup voor rollback
CREATE DATABASE production_backup_20240125
WITH TEMPLATE production;
-- Point-in-time recovery planning
SELECT pg_start_backup('migration_backup');
-- Voer migratie uit
SELECT pg_stop_backup();
-- 5. PERFORMANCE VALIDATION
-- Query performance comparison
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM new_schema.customers
WHERE customer_id = 12345;
-- Compare with old system
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM old_schema.customers
WHERE customer_id = 12345;
11. Database Design Tools
Modern Database Tooling
Database design tools helpen bij het visualiseren, ontwerpen, documenteren en implementeren van database structuren.
| Tool | Type | Voordelen | Prijs | Best Voor |
|---|---|---|---|---|
| MySQL Workbench | Visual Design Tool | Gratis, ERD modeling, SQL development | Free | MySQL/MariaDB ontwikkeling |
| pgAdmin | Database Management | Gratis, PostgreSQL specifiek, query tool | Free | PostgreSQL beheer |
| Lucidchart | Online ERD Tool | Collaboration, templates, integrations | Freemium | Team database design |
| DbVisualizer | Universal Database Tool | Multi-database support, visual query builder | Freemium | Mixed database environments |
| DataGrip | IDE voor Databases | Code completion, refactoring, version control | Paid | Professional developers |
| dbdiagram.io | Online ERD Tool | Simple syntax, sharing, export options | Freemium | Quick prototyping |
12. Praktijk Case Studies
Real-World Database Design
Praktijk case studies tonen hoe effectief database design wordt geïmplementeerd in verschillende industrieën en use cases.
Case Study: High-Traffic E-commerce Platform
Uitdaging: 10K+ transacties per minuut, real-time inventory, persoonlijke aanbevelingen.
Oplossing: Polyglot persistence met microservices architecture.
| Data Type | Database | Pattern | Performance |
|---|---|---|---|
| Product Catalog | MongoDB | Document per product | 5ms read latency |
| Orders | PostgreSQL | Normalized tables | ACID compliance |
| Shopping Cart | Redis | Key-Value store | < 1ms response |
| Recommendations | Neo4j | Graph database | Real-time queries |
| Analytics | ClickHouse | Columnar storage | Sub-second queries |
Resultaten:
- 99.99% uptime tijdens Black Friday
- 50% reductie in query response time
- 30% minder database costs
- Zero data loss bij failover
Case Study: Healthcare EHR System
Uitdaging: GDPR/ HIPAA compliance, audit trails, real-time patient data.
Oplossing: Temporal database design met data vault modeling.
Design Principles
- Immutable audit trails
- Patient data versioning
- Role-based access control
- Data lineage tracking
Technical Implementation
- PostgreSQL met temporal tables
- Column-level encryption
- Change Data Capture
- Automated compliance reporting
Business Outcomes
- 100% compliance audit passing
- 30% snellere data retrieval
- Zero security incidents
- Improved patient outcomes
Conclusie en Key Takeaways
Database Design Best Practices
DO's
- Begin met requirements analysis
- Normaliseer tot 3NF (tenzij performance nodig)
- Documenteer schema en relaties
- Plan voor groei en schaalbaarheid
- Implementeer security vanaf het begin
DON'Ts
- Over-normaliseer niet voor OLAP workloads
- Negeer index maintenance niet
- Vergeet backup en recovery planning niet
- Onderschat performance testing niet
- Negeer compliance requirements niet
Emerging Trends
- Serverless databases
- AI-driven query optimization
- Blockchain voor data integrity
- Quantum-safe encryption
- Edge database computing
Database Design Checklist
- Requirements gathering en analysis
- ERD design en validatie
- Normalisatie tot geschikte vorm
- Performance requirement analysis
- Security en compliance planning
- Migration en deployment strategie
- Monitoring en maintenance plan
- Documentation en knowledge transfer
Veelgestelde Vragen (FAQ)
Q: Wanneer moet ik denormaliseren voor performance?
A: Denormaliseer wanneer: 1) Read performance kritiek is, 2) Joins te complex/duur zijn, 3) Data weinig verandert, 4) Reporting/analytics workloads. Altijd trade-off tussen performance en data integriteit evalueren.
Q: Hoe kies ik tussen SQL en NoSQL?
A: Kies SQL voor: ACID compliance, complexe queries, structured data. Kies NoSQL voor: schaalbaarheid, flexibele schema's, grote volumes unstructured data. Overweeg polyglot persistence voor complexe systemen.
Q: Wat zijn de belangrijkste database performance metrics?
A: Belangrijkste metrics: 1) Query response time, 2) Throughput (TPS/QPS), 3) Connection pool utilization, 4) Cache hit ratio, 5) Disk I/O latency, 6) CPU/memory utilization, 7) Lock wait time, 8) Replication lag.
Q: Hoe implementeer ik database sharding?
A: Sharding implementatie: 1) Kies sharding key (bijv. customer_id), 2) Selecteer sharding strategie (range, hash, directory), 3) Implementeer routing layer, 4) Plan voor cross-shard queries, 5) Implementeer rebalancing mechanism, 6) Test failover en recovery.