SQL Window Functions: Complete Gids voor Geavanceerde Analytics
Leer SQL Window Functions voor krachtige data analyse zonder self-joins. Praktische gids met voorbeelden voor ranking, running totals, time-series analyse en complexe aggregaties.
Zoek je SQL Analytics Experts?
Vind ervaren SQL Developers gespecialiseerd in geavanceerde analytics en window functions
Inhoudsopgave
- Inleiding tot Window Functions
- Basic Syntax en Concepts
- Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)
- Analytic Functions (LEAD, LAG, FIRST_VALUE, LAST_VALUE)
- Aggregate Window Functions (SUM, AVG, COUNT OVER)
- Window Frame Specificatie (ROWS vs RANGE)
- PARTITION BY Deep Dive
- Geavanceerde Patterns en Use Cases
- Performance Optimalisatie
- Real-World Business Examples
- Window Functions vs GROUP BY vs Self-Joins
1. Inleiding tot Window Functions
Wat zijn Window Functions?
SQL Window Functions (ook wel Analytic Functions genoemd) zijn krachtige SQL functies die berekeningen uitvoeren over een set rijen die gerelateerd zijn aan de huidige rij, zonder de resultaten te groeperen tot een enkele output rij.
In tegenstelling tot aggregatie met GROUP BY, behouden window functions de individuele rijen terwijl ze berekeningen uitvoeren over gerelateerde rijen.
Ranking Functions
ROW_NUMBER, RANK, DENSE_RANK, NTILE - voor rangschikking en percentielen
Analytic Functions
LEAD, LAG, FIRST_VALUE, LAST_VALUE - voor time-series en sequential analyse
Aggregate Functions
SUM, AVG, COUNT, MIN, MAX OVER - voor running totals en moving averages
Statistical Functions
PERCENT_RANK, CUME_DIST, STDDEV, VARIANCE - voor statistische analyse
| Feature | Window Functions | GROUP BY Aggregatie | Self-Joins |
|---|---|---|---|
| Output Rijen | Behoudt alle input rijen | Groepen tot één rij per groep | Behoudt alle rijen, maar complex |
| Performance | Optimaal, één tabel scan | Goed voor aggregaties | Slecht, multiple scans/joins |
| Complexiteit | Medium (nieuwe syntax) | Laag (bekende syntax) | Hoog (complexe joins) |
| Use Case | Running totals, ranking, gaps | Totaal per groep | Historische vergelijkingen |
| Readability | Hoog (declaratief) | Hoog (simpel) | Laag (proceduraal) |
2. Basic Syntax en Concepts
Window Function Syntax
De basis syntax van window functions bestaat uit drie hoofdcomponenten: de window function zelf, PARTITION BY clause, en ORDER BY clause.
Basic Window Function Structuur
-- BASIS SYNTAX
SELECT
column1,
column2,
window_function(column3) OVER (
[PARTITION BY partition_expression]
[ORDER BY order_expression [ASC | DESC]]
[window_frame_clause]
) AS alias_name
FROM table_name
[WHERE conditions];
-- EENVOUDIG VOORBEELD: Running total zonder PARTITION BY
SELECT
order_id,
order_date,
total_amount,
SUM(total_amount) OVER (
ORDER BY order_date
) AS running_total
FROM orders
ORDER BY order_date;
-- MET PARTITION BY: Running total per klant
SELECT
customer_id,
order_date,
total_amount,
SUM(total_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS customer_running_total
FROM orders
ORDER BY customer_id, order_date;
-- MET WINDOW FRAME: Moving average laatste 3 orders
SELECT
customer_id,
order_date,
total_amount,
AVG(total_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3_orders
FROM orders
ORDER BY customer_id, order_date;
Window Function Components
-- COMPONENTEN VAN EEN WINDOW FUNCTION
-- 1. WINDOW FUNCTION TYPE
-- Ranking: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
-- Analytic: LEAD(), LAG(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE()
-- Aggregate: SUM(), AVG(), COUNT(), MIN(), MAX()
-- Statistical: STDDEV(), VARIANCE(), PERCENT_RANK(), CUME_DIST()
-- 2. OVER() CLAUSE - Markeert het als window function
-- Zonder PARTITION BY: Hele result set is één window
-- Met PARTITION BY: Aparte windows per partition
-- 3. PARTITION BY - Verdeelt data in groepen (vergelijkbaar met GROUP BY)
-- Syntax: PARTITION BY column1, column2, ...
-- Elk partition heeft zijn eigen window
-- 4. ORDER BY - Bepaalt volgorde binnen window
-- Syntax: ORDER BY column [ASC|DESC], ...
-- Essentieel voor ranking en running calculations
-- 5. WINDOW FRAME - Specificeert welke rijen in het window
-- Syntax: ROWS|RANGE BETWEEN start AND end
-- Options: UNBOUNDED PRECEDING, n PRECEDING, CURRENT ROW, n FOLLOWING, UNBOUNDED FOLLOWING
-- COMPLETE VOORBEELD MET ALLE COMPONENTEN
SELECT
employee_id,
department_id,
salary,
hire_date,
-- Ranking binnen department
ROW_NUMBER() OVER (
PARTITION BY department_id
ORDER BY salary DESC
) AS salary_rank_in_dept,
-- Percentage van department total
salary * 100.0 / SUM(salary) OVER (
PARTITION BY department_id
) AS pct_of_dept_salary,
-- Verschil met hoogste salaris in department
MAX(salary) OVER (
PARTITION BY department_id
) - salary AS diff_from_max_in_dept,
-- Vorige salaris in department (op hire date)
LAG(salary, 1) OVER (
PARTITION BY department_id
ORDER BY hire_date
) AS previous_salary_in_dept,
-- Cumulative salary in department
SUM(salary) OVER (
PARTITION BY department_id
ORDER BY hire_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_salary_in_dept
FROM employees
WHERE hire_date >= '2020-01-01'
ORDER BY department_id, salary DESC;
3. Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)
Ranking en Ordering Functions
Ranking functions wijzen rangnummers toe aan rijen binnen een window, gebaseerd op de ORDER BY clause. Ze zijn essentieel voor top-N queries, pagination en competitieve analyses.
ROW_NUMBER, RANK, DENSE_RANK Vergelijking
-- VOORBEELD DATA: Employee salaries
-- Stel we hebben deze data:
-- Employee | Department | Salary
-- ---------|------------|--------
-- Alice | Sales | 5000
-- Bob | Sales | 5000
-- Charlie | Sales | 4500
-- David | Sales | 4000
-- Eve | IT | 6000
-- Frank | IT | 5500
SELECT
employee_name,
department,
salary,
-- ROW_NUMBER: Unieke nummers, zelfs bij gelijke salarissen
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS row_num,
-- RANK: Gelijke rank bij gelijke salarissen, gaps bij ties
RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS rank,
-- DENSE_RANK: Gelijke rank bij gelijke salarissen, geen gaps
DENSE_RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS dense_rank,
-- NTILE: Verdeel in groepen (bijv. quartiles, percentiles)
NTILE(4) OVER (
PARTITION BY department
ORDER BY salary DESC
) AS quartile
FROM employees
ORDER BY department, salary DESC;
-- RESULTAAT:
-- Employee | Department | Salary | row_num | rank | dense_rank | quartile
-- ---------|------------|--------|---------|------|------------|---------
-- Alice | Sales | 5000 | 1 | 1 | 1 | 1
-- Bob | Sales | 5000 | 2 | 1 | 1 | 1
-- Charlie | Sales | 4500 | 3 | 3 | 2 | 2
-- David | Sales | 4000 | 4 | 4 | 3 | 2
-- Eve | IT | 6000 | 1 | 1 | 1 | 1
-- Frank | IT | 5500 | 2 | 2 | 2 | 2
-- PRAKTISCHE USE CASE: Top 3 medewerkers per department
WITH RankedEmployees AS (
SELECT
employee_name,
department,
salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS salary_rank
FROM employees
)
SELECT *
FROM RankedEmployees
WHERE salary_rank <= 3
ORDER BY department, salary_rank;
-- PAGINATION: Fetch rows 11-20 (page 2 met 10 rows per page)
WITH PaginatedData AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY order_date DESC) AS row_num
FROM orders
WHERE order_date >= '2024-01-01'
)
SELECT *
FROM PaginatedData
WHERE row_num BETWEEN 11 AND 20;
Geavanceerde Ranking Use Cases
-- USE CASE 1: Find duplicate records (based on business key)
WITH DuplicateCheck AS (
SELECT
customer_email,
customer_name,
registration_date,
ROW_NUMBER() OVER (
PARTITION BY customer_email
ORDER BY registration_date
) AS duplicate_num
FROM customers
)
SELECT *
FROM DuplicateCheck
WHERE duplicate_num > 1 -- All duplicates except first occurrence
ORDER BY customer_email, duplicate_num;
-- USE CASE 2: Gap analysis in sequences
WITH NumberedOrders AS (
SELECT
order_id,
order_date,
ROW_NUMBER() OVER (ORDER BY order_date) AS sequence_num,
ROW_NUMBER() OVER (ORDER BY order_id) AS expected_sequence
FROM orders
WHERE YEAR(order_date) = 2024
)
SELECT
order_id,
order_date,
sequence_num,
expected_sequence,
expected_sequence - sequence_num AS gap_size
FROM NumberedOrders
WHERE expected_sequence - sequence_num > 0 -- Find gaps
ORDER BY gap_size DESC;
-- USE CASE 3: Percentile analysis with NTILE
SELECT
customer_id,
total_purchase_amount,
-- Verdeel in 10 groepen (deciles)
NTILE(10) OVER (ORDER BY total_purchase_amount DESC) AS decile,
-- Verdeel in 4 groepen (quartiles)
NTILE(4) OVER (ORDER BY total_purchase_amount DESC) AS quartile,
-- Verdeel in 100 groepen (percentiles)
NTILE(100) OVER (ORDER BY total_purchase_amount DESC) AS percentile,
-- PERCENT_RANK: Relative rank (0-1)
PERCENT_RANK() OVER (
ORDER BY total_purchase_amount DESC
) AS percent_rank,
-- CUME_DIST: Cumulative distribution (0-1)
CUME_DIST() OVER (
ORDER BY total_purchase_amount DESC
) AS cumulative_distribution
FROM customer_purchases
ORDER BY total_purchase_amount DESC;
-- USE CASE 4: Remove middle duplicates, keep first and last
WITH DuplicateProcessing AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY business_key
ORDER BY created_at
) AS rn_asc,
ROW_NUMBER() OVER (
PARTITION BY business_key
ORDER BY created_at DESC
) AS rn_desc
FROM your_table
)
SELECT *
FROM DuplicateProcessing
WHERE rn_asc = 1 -- Keep first
OR rn_desc = 1; -- Keep last
SQL Analytics Experts Nodig?
Vind ervaren SQL Developers gespecialiseerd in geavanceerde analytics en window functions
4. Analytic Functions (LEAD, LAG, FIRST_VALUE, LAST_VALUE)
Time-Series en Sequential Analysis
Analytic functions geven toegang tot andere rijen in het window zonder self-joins. Ze zijn essentieel voor time-series analyse, sequential comparisons en gap detection.
LEAD en LAG Functions
-- LEAD: Kijk naar volgende rij
-- LAG: Kijk naar vorige rij
-- BASIS SYNTAX:
-- LEAD(column, offset, default_value) OVER (...)
-- LAG(column, offset, default_value) OVER (...)
-- PRAKTISCH VOORBEELD: Maand-over-maand sales analyse
SELECT
year_month,
total_sales,
-- Vorige maand
LAG(total_sales, 1) OVER (ORDER BY year_month) AS prev_month_sales,
-- Volgende maand
LEAD(total_sales, 1) OVER (ORDER BY year_month) AS next_month_sales,
-- Verschil met vorige maand
total_sales - LAG(total_sales, 1) OVER (ORDER BY year_month)
AS sales_diff_prev_month,
-- Percentage verandering t.o.v. vorige maand
ROUND(
(total_sales - LAG(total_sales, 1) OVER (ORDER BY year_month)) * 100.0 /
LAG(total_sales, 1) OVER (ORDER BY year_month),
2
) AS pct_change_prev_month,
-- 3-maand moving average
AVG(total_sales) OVER (
ORDER BY year_month
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3_month
FROM monthly_sales
ORDER BY year_month;
-- COMPLEX VOORBEELD: Customer order sequence analysis
SELECT
customer_id,
order_id,
order_date,
order_amount,
-- Vorige order van deze klant
LAG(order_date, 1) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS prev_order_date,
-- Volgende order van deze klant
LEAD(order_date, 1) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS next_order_date,
-- Dagen sinds vorige order
DATEDIFF(
day,
LAG(order_date, 1) OVER (
PARTITION BY customer_id
ORDER BY order_date
),
order_date
) AS days_since_prev_order,
-- Running total per klant
SUM(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS customer_running_total,
-- Gemiddelde order amount van klant tot nu toe
AVG(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS customer_avg_to_date
FROM orders
WHERE order_date >= '2024-01-01'
ORDER BY customer_id, order_date;
-- MULTIPLE OFFSETS: Kijk meerdere rijen vooruit/achteruit
SELECT
date,
sales,
-- Vorige dag
LAG(sales, 1, 0) OVER (ORDER BY date) AS prev_day,
-- 7 dagen geleden
LAG(sales, 7, 0) OVER (ORDER BY date) AS week_ago,
-- Week-over-week verschil
sales - LAG(sales, 7, 0) OVER (ORDER BY date) AS wow_difference,
-- 30 dagen geleden (maandoverzicht)
LAG(sales, 30, 0) OVER (ORDER BY date) AS month_ago
FROM daily_sales
ORDER BY date;
FIRST_VALUE, LAST_VALUE en NTH_VALUE
-- FIRST_VALUE: Eerste waarde in het window
-- LAST_VALUE: Laatste waarde in het window
-- NTH_VALUE: N-de waarde in het window
-- PRAKTISCH VOORBEELD: Product price analysis
SELECT
product_id,
price_date,
price,
-- Eerste prijs in de reeks
FIRST_VALUE(price) OVER (
PARTITION BY product_id
ORDER BY price_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS first_price,
-- Laatste prijs in de reeks
LAST_VALUE(price) OVER (
PARTITION BY product_id
ORDER BY price_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_price,
-- Hoogste prijs tot nu toe
MAX(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS max_price_to_date,
-- Laagste prijs tot nu toe
MIN(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS min_price_to_date,
-- Verschil met startprijs
price - FIRST_VALUE(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS diff_from_first,
-- Percentage verandering vanaf start
ROUND(
(price - FIRST_VALUE(price) OVER (
PARTITION BY product_id
ORDER BY price_date
)) * 100.0 /
FIRST_VALUE(price) OVER (
PARTITION BY product_id
ORDER BY price_date
),
2
) AS pct_change_from_first
FROM product_prices
ORDER BY product_id, price_date;
-- NTH_VALUE: Find specific position in window
SELECT
employee_id,
department_id,
salary,
-- Tweede hoogste salaris in department
NTH_VALUE(salary, 2) OVER (
PARTITION BY department_id
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS second_highest_salary,
-- Derde hoogste salaris in department
NTH_VALUE(salary, 3) OVER (
PARTITION BY department_id
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS third_highest_salary,
-- Verschil met tweede hoogste
salary - NTH_VALUE(salary, 2) OVER (
PARTITION BY department_id
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS diff_from_second_highest
FROM employees
ORDER BY department_id, salary DESC;
-- COMBINATIE VAN FUNCTIES: Complex business scenario
SELECT
customer_id,
order_date,
order_amount,
-- Eerste order amount
FIRST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS first_order_amount,
-- Meest recente order amount
LAST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS latest_order_amount,
-- Gemiddelde van eerste en laatste order
(
FIRST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) +
LAST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) / 2.0 AS avg_first_last,
-- Trend: stijgend/dalend t.o.v. eerste order
CASE
WHEN order_amount > FIRST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) THEN 'Increasing'
WHEN order_amount < FIRST_VALUE(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) THEN 'Decreasing'
ELSE 'Stable'
END AS trend_vs_first_order
FROM orders
ORDER BY customer_id, order_date;
5. Aggregate Window Functions (SUM, AVG, COUNT OVER)
Running Calculations en Cumulative Aggregates
Aggregate window functions laten je aggregaties uitvoeren over een window terwijl individuele rijen behouden blijven. Perfect voor running totals, moving averages en cumulative distributions.
SUM, AVG, COUNT OVER Examples
-- BASIS AGGREGATE WINDOW FUNCTIONS
-- 1. RUNNING TOTAL (Cumulative sum)
SELECT
order_date,
daily_sales,
-- Running total (default: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
SUM(daily_sales) OVER (
ORDER BY order_date
) AS running_total,
-- Running total met PARTITION BY (per jaar)
SUM(daily_sales) OVER (
PARTITION BY YEAR(order_date)
ORDER BY order_date
) AS ytd_running_total
FROM daily_sales
ORDER BY order_date;
-- 2. MOVING AVERAGE
SELECT
date,
sales,
-- 7-day moving average
AVG(sales) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7d,
-- 30-day moving average
AVG(sales) OVER (
ORDER BY date
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
) AS moving_avg_30d,
-- Centered moving average (3 dagen voor en na)
AVG(sales) OVER (
ORDER BY date
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
) AS centered_moving_avg
FROM daily_sales
ORDER BY date;
-- 3. CUMULATIVE COUNT en PERCENTAGE
SELECT
customer_id,
order_date,
order_amount,
-- Aantal orders tot nu toe
COUNT(*) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS order_number,
-- Percentage van customer total
order_amount * 100.0 / SUM(order_amount) OVER (
PARTITION BY customer_id
) AS pct_of_customer_total,
-- Cumulative percentage van customer total
SUM(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) * 100.0 / SUM(order_amount) OVER (
PARTITION BY customer_id
) AS cumulative_pct_of_total
FROM orders
ORDER BY customer_id, order_date;
-- 4. MIN en MAX OVER TIME
SELECT
product_id,
price_date,
price,
-- Laagste prijs tot nu toe
MIN(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS min_price_to_date,
-- Hoogste prijs tot nu toe
MAX(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS max_price_to_date,
-- Verschil met laagste prijs tot nu toe
price - MIN(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS diff_from_min,
-- Percentage boven laagste prijs
(price - MIN(price) OVER (
PARTITION BY product_id
ORDER BY price_date
)) * 100.0 /
MIN(price) OVER (
PARTITION BY product_id
ORDER BY price_date
) AS pct_above_min
FROM product_prices
ORDER BY product_id, price_date;
Advanced Aggregate Patterns
-- PATTERN 1: Expanding window vs Sliding window
SELECT
date,
revenue,
-- Expanding window (cumulative)
SUM(revenue) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_revenue,
-- Sliding window (last 7 days)
SUM(revenue) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS trailing_7d_revenue,
-- Centered window (3 before, 3 after)
AVG(revenue) OVER (
ORDER BY date
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
) AS centered_7d_avg
FROM daily_metrics
ORDER BY date;
-- PATTERN 2: Percentage of total per group
SELECT
department,
employee_name,
salary,
-- Salary als percentage van department total
salary * 100.0 / SUM(salary) OVER (
PARTITION BY department
) AS pct_of_dept_total,
-- Salary als percentage van company total
salary * 100.0 / SUM(salary) OVER () AS pct_of_company_total,
-- Department total als percentage van company
SUM(salary) OVER (
PARTITION BY department
) * 100.0 / SUM(salary) OVER ()
AS dept_pct_of_company
FROM employees
ORDER BY department, salary DESC;
-- PATTERN 3: Running difference from average
SELECT
month,
actual_sales,
forecast_sales,
-- Running average van forecast
AVG(forecast_sales) OVER (
ORDER BY month
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS forecast_moving_avg,
-- Verschil met running average
actual_sales - AVG(forecast_sales) OVER (
ORDER BY month
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS diff_from_forecast_avg,
-- Cumulative forecast error
SUM(actual_sales - forecast_sales) OVER (
ORDER BY month
) AS cumulative_forecast_error
FROM sales_forecast
ORDER BY month;
-- PATTERN 4: Hierarchical cumulative calculations
SELECT
region,
country,
city,
sales_amount,
-- Cumulative per city
SUM(sales_amount) OVER (
PARTITION BY region, country, city
ORDER BY sales_date
) AS cumulative_city_sales,
-- Cumulative per country
SUM(sales_amount) OVER (
PARTITION BY region, country
ORDER BY sales_date
) AS cumulative_country_sales,
-- Cumulative per region
SUM(sales_amount) OVER (
PARTITION BY region
ORDER BY sales_date
) AS cumulative_region_sales,
-- Percentage van region total
sales_amount * 100.0 / SUM(sales_amount) OVER (
PARTITION BY region
) AS pct_of_region
FROM regional_sales
ORDER BY region, country, city, sales_date;
-- PATTERN 5: Gap analysis with COUNT
SELECT
date,
-- Aantal dagen met sales
COUNT(sales_amount) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS days_with_sales,
-- Aantal opeenvolgende dagen met sales
CASE
WHEN sales_amount > 0
THEN ROW_NUMBER() OVER (
PARTITION BY streak_group
ORDER BY date
)
ELSE 0
END AS consecutive_days_streak
FROM (
SELECT
date,
sales_amount,
SUM(CASE WHEN sales_amount = 0 THEN 1 ELSE 0 END) OVER (
ORDER BY date
) AS streak_group
FROM daily_sales
) streaks
ORDER BY date;
6. Window Frame Specificatie (ROWS vs RANGE)
Window Frame Concepts
Window frame specificatie bepaalt precies welke rijen in het window worden opgenomen voor berekeningen. Het onderscheid tussen ROWS en RANGE is cruciaal voor correcte resultaten.
ROWS vs RANGE Comparison
-- ROWS: Fysieke rijen tellen
-- RANGE: Logische waarden groeperen
-- VOORBEELD DATA:
-- date | sales
-- 2024-01-01 | 100
-- 2024-01-01 | 150 (zelfde datum)
-- 2024-01-02 | 200
-- 2024-01-03 | 250
SELECT
date,
sales,
-- ROWS: Telt 2 fysieke rijen terug
SUM(sales) OVER (
ORDER BY date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS sum_rows_2_preceding,
-- RANGE: Neemt alle rijen metzelfde of eerdere datum binnen range
SUM(sales) OVER (
ORDER BY date
RANGE BETWEEN INTERVAL '2' DAY PRECEDING AND CURRENT ROW
) AS sum_range_2_days,
-- UNBOUNDED PRECEDING: Alles vanaf start
SUM(sales) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_sum,
-- BETWEEN n PRECEDING AND m FOLLOWING
AVG(sales) OVER (
ORDER BY date
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
) AS centered_avg
FROM daily_sales
ORDER BY date;
-- RESULTATEN:
-- date | sales | sum_rows_2_preceding | sum_range_2_days | cumulative_sum | centered_avg
-- 2024-01-01 | 100 | 100 | 250 | 100 | 125
-- 2024-01-01 | 150 | 250 (100+150) | 250 | 250 | 150
-- 2024-01-02 | 200 | 450 (150+200+100) | 450 | 450 | 150
-- 2024-01-03 | 250 | 600 (200+250+150) | 700 | 700 | 225
-- ADVANCED FRAME PATTERNS
SELECT
timestamp,
metric_value,
-- Rolling window van laatste 5 minuten (RANGE)
AVG(metric_value) OVER (
ORDER BY timestamp
RANGE BETWEEN INTERVAL '5' MINUTE PRECEDING AND CURRENT ROW
) AS avg_last_5_minutes,
-- Expanding window (standaard als ORDER BY zonder frame)
SUM(metric_value) OVER (
ORDER BY timestamp
) AS default_expanding_sum,
-- Explicit hele window
SUM(metric_value) OVER (
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS total_sum,
-- Window zonder CURRENT ROW
SUM(metric_value) OVER (
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS sum_excluding_current,
-- Only following rows
AVG(metric_value) OVER (
ORDER BY timestamp
ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
) AS avg_next_3
FROM time_series_metrics
ORDER BY timestamp;
Frame Boundary Options
-- COMPLETE FRAME BOUNDARY REFERENCE
-- 1. UNBOUNDED PRECEDING: All rows from partition start
SUM(amount) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) -- Default frame bij ORDER BY zonder frame specificatie
-- 2. n PRECEDING: n rows before current row
AVG(temperature) OVER (
ORDER BY reading_time
ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
) -- Moving average laatste 6 readings
-- 3. CURRENT ROW: Current row only
ROW_NUMBER() OVER (
ORDER BY score
ROWS BETWEEN CURRENT ROW AND CURRENT ROW
) -- Niet praktisch, maar illustratief
-- 4. n FOLLOWING: n rows after current row
MIN(price) OVER (
ORDER BY effective_date
ROWS BETWEEN CURRENT ROW AND 30 FOLLOWING
) -- Minimum prijs in komende 30 dagen
-- 5. UNBOUNDED FOLLOWING: All rows to partition end
LAST_VALUE(amount) OVER (
ORDER BY date
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
) -- Laatste waarde vanaf huidige rij
-- 6. MIXED BOUNDARIES: Complex frames
AVG(value) OVER (
ORDER BY timestamp
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
) -- Centered 5-point moving average
-- 7. RANGE met INTERVAL: Voor datetime columns
SUM(revenue) OVER (
ORDER BY transaction_time
RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
) -- Revenue laatste uur
-- 8. RANGE met numerieke offsets
COUNT(*) OVER (
ORDER BY score
RANGE BETWEEN 10 PRECEDING AND CURRENT ROW
) -- Aantal scores binnen 10 punten van huidige
-- PRACTICAL EXAMPLE: Complex business calculation
SELECT
employee_id,
evaluation_date,
performance_score,
-- Average van huidige en vorige 2 evaluaties
AVG(performance_score) OVER (
PARTITION BY employee_id
ORDER BY evaluation_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3_evaluations,
-- Trend: gemiddelde laatste 2 vs eerste 2
AVG(performance_score) OVER (
PARTITION BY employee_id
ORDER BY evaluation_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
) -
AVG(performance_score) OVER (
PARTITION BY employee_id
ORDER BY evaluation_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING
) AS trend_vs_early_performance,
-- Percentage van beste score in laatste jaar
performance_score * 100.0 /
MAX(performance_score) OVER (
PARTITION BY employee_id
ORDER BY evaluation_date
RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW
) AS pct_of_best_last_year
FROM performance_reviews
ORDER BY employee_id, evaluation_date;
Klaar voor Geavanceerde SQL Projecten?
Vind de juiste SQL experts of plaats je vacature voor complexe data analytics projecten
7. PARTITION BY Deep Dive
Data Partitioning Strategieën
PARTITION BY verdeelt data in onafhankelijke windows voor parallelle berekeningen. Het is vergelijkbaar met GROUP BY maar behoudt individuele rijen.
Multi-Column Partitioning
-- MULTI-LEVEL PARTITIONING: Complex business scenarios
-- Scenario: Retail sales analysis per store, department, en product category
SELECT
store_id,
department_id,
category_id,
product_id,
sales_date,
daily_sales,
-- Ranking binnen category per store
RANK() OVER (
PARTITION BY store_id, department_id, category_id
ORDER BY daily_sales DESC
) AS sales_rank_in_category,
-- Percentage van category total per store
daily_sales * 100.0 /
SUM(daily_sales) OVER (
PARTITION BY store_id, department_id, category_id
) AS pct_of_category_sales,
-- Running total per store-department
SUM(daily_sales) OVER (
PARTITION BY store_id, department_id
ORDER BY sales_date
) AS running_total_dept,
-- Verschil met gemiddelde vanzelfde category in andere stores
daily_sales -
AVG(daily_sales) OVER (
PARTITION BY department_id, category_id, sales_date
) AS diff_from_category_avg_all_stores,
-- Percentage boven/beneden store gemiddelde voor category
(daily_sales -
AVG(daily_sales) OVER (
PARTITION BY store_id, department_id, category_id
)) * 100.0 /
AVG(daily_sales) OVER (
PARTITION BY store_id, department_id, category_id
) AS pct_vs_store_category_avg
FROM retail_sales_daily
WHERE sales_date BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY store_id, department_id, category_id, sales_rank_in_category;
-- HIERARCHICAL PARTITIONING: Drill-down analysis
SELECT
region,
country,
city,
sales_person,
sales_amount,
sales_date,
-- Percentage van region total
sales_amount * 100.0 /
SUM(sales_amount) OVER (
PARTITION BY region
) AS pct_of_region,
-- Percentage van country total binnen region
sales_amount * 100.0 /
SUM(sales_amount) OVER (
PARTITION BY region, country
) AS pct_of_country,
-- Percentage van city total binnen country
sales_amount * 100.0 /
SUM(sales_amount) OVER (
PARTITION BY region, country, city
) AS pct_of_city,
-- Ranking binnen city
ROW_NUMBER() OVER (
PARTITION BY region, country, city
ORDER BY sales_amount DESC
) AS rank_in_city,
-- Cumulative per sales person in city
SUM(sales_amount) OVER (
PARTITION BY region, country, city, sales_person
ORDER BY sales_date
) AS cumulative_per_person
FROM regional_sales
ORDER BY region, country, city, rank_in_city;
8. Geavanceerde Patterns en Use Cases
Complex Business Scenarios
Geavanceerde window function patterns lossen complexe business problemen op die anders multiple self-joins of procedural code zouden vereisen.
Advanced Business Patterns
-- PATTERN 1: Sessionization - Group sequential events
WITH UserSessions AS (
SELECT
user_id,
event_timestamp,
event_type,
-- Bepaal of er een gap van > 30 minuten was
CASE
WHEN DATEDIFF(
minute,
LAG(event_timestamp) OVER (
PARTITION BY user_id
ORDER BY event_timestamp
),
event_timestamp
) > 30
THEN 1
ELSE 0
END AS is_new_session,
-- Genereer session ID
SUM(
CASE
WHEN DATEDIFF(
minute,
LAG(event_timestamp) OVER (
PARTITION BY user_id
ORDER BY event_timestamp
),
event_timestamp
) > 30
THEN 1
ELSE 0
END
) OVER (
PARTITION BY user_id
ORDER BY event_timestamp
) AS session_id
FROM user_events
WHERE event_date = '2024-12-20'
)
SELECT
user_id,
session_id,
MIN(event_timestamp) AS session_start,
MAX(event_timestamp) AS session_end,
COUNT(*) AS event_count,
DATEDIFF(minute, MIN(event_timestamp), MAX(event_timestamp)) AS session_duration_minutes
FROM UserSessions
GROUP BY user_id, session_id
ORDER BY user_id, session_start;
-- PATTERN 2: Gap and Island analysis
WITH NumberedStatus AS (
SELECT
equipment_id,
status_date,
status,
-- Groep opeenvolgende dagen metzelfde status
ROW_NUMBER() OVER (ORDER BY status_date) -
ROW_NUMBER() OVER (
PARTITION BY equipment_id, status
ORDER BY status_date
) AS island_group
FROM equipment_status
WHERE equipment_id = 123
)
SELECT
equipment_id,
status,
MIN(status_date) AS period_start,
MAX(status_date) AS period_end,
DATEDIFF(day, MIN(status_date), MAX(status_date)) + 1 AS days_in_status
FROM NumberedStatus
GROUP BY equipment_id, status, island_group
ORDER BY period_start;
-- PATTERN 3: Customer cohort analysis
WITH FirstPurchase AS (
SELECT
customer_id,
MIN(order_date) AS first_purchase_date,
DATE_TRUNC('month', MIN(order_date)) AS cohort_month
FROM orders
GROUP BY customer_id
),
MonthlyActivity AS (
SELECT
fp.customer_id,
fp.cohort_month,
DATE_TRUNC('month', o.order_date) AS activity_month,
COUNT(DISTINCT o.order_id) AS order_count,
SUM(o.order_amount) AS total_spent
FROM FirstPurchase fp
JOIN orders o ON fp.customer_id = o.customer_id
GROUP BY fp.customer_id, fp.cohort_month, DATE_TRUNC('month', o.order_date)
)
SELECT
cohort_month,
activity_month,
DATEDIFF(month, cohort_month, activity_month) AS months_since_first_purchase,
COUNT(DISTINCT customer_id) AS active_customers,
AVG(order_count) AS avg_orders_per_customer,
AVG(total_spent) AS avg_spent_per_customer,
-- Retention rate vs first month
COUNT(DISTINCT customer_id) * 100.0 /
FIRST_VALUE(COUNT(DISTINCT customer_id)) OVER (
PARTITION BY cohort_month
ORDER BY activity_month
) AS retention_rate_pct
FROM MonthlyActivity
GROUP BY cohort_month, activity_month
ORDER BY cohort_month, activity_month;
9. Performance Optimalisatie
Window Functions Performance Tips
Performance optimalisatie voor window functions is essentieel bij grote datasets. Goede praktijken kunnen queries 10-100x versnellen.
Performance Best Practices
-- PERFORMANCE OPTIMALISATIE TIPS
-- 1. Gebruik indexes op PARTITION BY en ORDER BY columns
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date);
-- 2. Beperk window size waar mogelijk
-- SLECHT: Hele tabel in window
SUM(amount) OVER (PARTITION BY customer_id ORDER BY date)
-- BETER: Beperk tot laatste 90 dagen
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY date
ROWS BETWEEN 89 PRECEDING AND CURRENT ROW
)
-- 3. Vermijd nested window functions
-- SLECHT: Nested calculations
SELECT
AVG(SUM(amount) OVER (PARTITION BY customer_id))
OVER (PARTITION BY region)
FROM orders;
-- BETER: Gebruik CTEs
WITH CustomerTotals AS (
SELECT
customer_id,
region,
SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id, region
)
SELECT
region,
AVG(total_amount) OVER (PARTITION BY region)
FROM CustomerTotals;
-- 4. Filter resultaten buiten de window function
-- SLECHT: Filter binnen window function
SELECT
SUM(CASE WHEN status = 'completed' THEN amount ELSE 0 END)
OVER (PARTITION BY customer_id)
FROM orders;
-- BETER: Filter eerst, dan window function
WITH CompletedOrders AS (
SELECT customer_id, amount
FROM orders
WHERE status = 'completed'
)
SELECT
customer_id,
SUM(amount) OVER (PARTITION BY customer_id)
FROM CompletedOrders;
-- 5. Gebruik materialized views voor vaak gebruikte windows
CREATE MATERIALIZED VIEW mv_customer_running_totals AS
SELECT
customer_id,
order_date,
order_amount,
SUM(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS running_total
FROM orders;
-- 6. Vermijd RANGE bij grote datasets
-- ROWS is meestal sneller dan RANGE
-- RANGE kan leiden tot onverwachte performance issues
-- 7. Partitionering optimalisatie
EXPLAIN ANALYZE
SELECT
customer_id,
SUM(amount) OVER (PARTITION BY customer_id)
FROM orders;
-- 8. Batch processing voor zeer grote datasets
CREATE PROCEDURE process_large_window_batches()
AS $$
DECLARE
batch_size INT := 100000;
offset_val INT := 0;
BEGIN
WHILE TRUE LOOP
INSERT INTO processed_results
SELECT
customer_id,
order_date,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS running_total
FROM orders
WHERE customer_id BETWEEN offset_val AND offset_val + batch_size
ORDER BY customer_id, order_date;
offset_val := offset_val + batch_size;
EXIT WHEN offset_val > (SELECT MAX(customer_id) FROM orders);
END LOOP;
END;
$$ LANGUAGE plpgsql;
10. Real-World Business Examples
Praktische Business Use Cases
Real-world voorbeelden tonen hoe window functions worden gebruikt in verschillende industrieën voor kritieke business analyses.
E-commerce: Customer Behavior Analysis
-- COMPLETE CUSTOMER JOURNEY ANALYSIS
WITH CustomerJourney AS (
SELECT
customer_id,
session_id,
event_timestamp,
event_type,
page_url,
-- Time since previous event
DATEDIFF(
second,
LAG(event_timestamp) OVER (
PARTITION BY customer_id, session_id
ORDER BY event_timestamp
),
event_timestamp
) AS seconds_since_prev_event,
-- Session duration so far
DATEDIFF(
second,
FIRST_VALUE(event_timestamp) OVER (
PARTITION BY customer_id, session_id
ORDER BY event_timestamp
),
event_timestamp
) AS seconds_in_session,
-- Page sequence number
ROW_NUMBER() OVER (
PARTITION BY customer_id, session_id
ORDER BY event_timestamp
) AS page_sequence,
-- Conversion flag (purchase event)
MAX(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) OVER (
PARTITION BY customer_id, session_id
) AS session_converted
FROM customer_events
WHERE event_date = '2024-12-20'
)
SELECT
customer_id,
session_id,
COUNT(*) AS total_events,
MIN(event_timestamp) AS session_start,
MAX(event_timestamp) AS session_end,
MAX(seconds_in_session) AS session_duration_seconds,
AVG(seconds_since_prev_event) AS avg_time_between_events,
MAX(session_converted) AS converted,
-- Customer's average session duration
AVG(MAX(seconds_in_session)) OVER (
PARTITION BY customer_id
) AS customer_avg_session_duration,
-- Percentage of customer's total sessions
100.0 / COUNT(*) OVER (
PARTITION BY customer_id
) AS pct_of_customer_sessions
FROM CustomerJourney
GROUP BY customer_id, session_id
ORDER BY customer_id, session_start;
Financial Services: Fraud Detection
-- ADVANCED FRAUD DETECTION PATTERNS
WITH TransactionAnalysis AS (
SELECT
transaction_id,
customer_id,
transaction_timestamp,
transaction_amount,
merchant_id,
location,
-- Time since last transaction
DATEDIFF(
minute,
LAG(transaction_timestamp) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
),
transaction_timestamp
) AS minutes_since_last_tx,
-- Amount difference from average
transaction_amount -
AVG(transaction_amount) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
ROWS BETWEEN 9 PRECEDING AND 1 PRECEDING
) AS amount_diff_from_avg,
-- Unusual location pattern
CASE
WHEN location !=
LAG(location, 1) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
)
AND DATEDIFF(
minute,
LAG(transaction_timestamp, 1) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
),
transaction_timestamp
) < 30
THEN 1
ELSE 0
END AS rapid_location_change,
-- Transaction velocity (last hour)
COUNT(*) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
) AS tx_count_last_hour,
-- Total amount last hour
SUM(transaction_amount) OVER (
PARTITION BY customer_id
ORDER BY transaction_timestamp
RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
) AS amount_last_hour
FROM transactions
WHERE transaction_date = '2024-12-20'
)
SELECT
*,
-- Fraud risk score (0-100)
CASE
WHEN minutes_since_last_tx < 1 THEN 20
ELSE 0
END +
CASE
WHEN ABS(amount_diff_from_avg) >
AVG(transaction_amount) OVER (
PARTITION BY customer_id
) * 3
THEN 30
ELSE 0
END +
CASE
WHEN rapid_location_change = 1 THEN 25
ELSE 0
END +
CASE
WHEN tx_count_last_hour > 10 THEN 15
WHEN tx_count_last_hour > 5 THEN 10
ELSE 0
END +
CASE
WHEN amount_last_hour > 10000 THEN 10
ELSE 0
END AS fraud_risk_score
FROM TransactionAnalysis
WHERE fraud_risk_score >= 50 -- High risk threshold
ORDER BY fraud_risk_score DESC;
11. Window Functions vs GROUP BY vs Self-Joins
Performance en Readability Comparison
Vergelijking van verschillende benaderingen voor complexe analytics helpt bij het kiezen van de juiste oplossing voor elk scenario.
Side-by-Side Comparison
-- USE CASE: Running total per customer
-- 1. WINDOW FUNCTION (BEST)
SELECT
customer_id,
order_date,
order_amount,
SUM(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS running_total
FROM orders
ORDER BY customer_id, order_date;
-- 2. SELF-JOIN (INEFFICIENT)
SELECT
o1.customer_id,
o1.order_date,
o1.order_amount,
SUM(o2.order_amount) AS running_total
FROM orders o1
JOIN orders o2 ON o1.customer_id = o2.customer_id
AND o2.order_date <= o1.order_date
GROUP BY o1.customer_id, o1.order_date, o1.order_amount
ORDER BY o1.customer_id, o1.order_date;
-- 3. CORRELATED SUBQUERY (WORST)
SELECT
customer_id,
order_date,
order_amount,
(
SELECT SUM(order_amount)
FROM orders o2
WHERE o2.customer_id = o1.customer_id
AND o2.order_date <= o1.order_date
) AS running_total
FROM orders o1
ORDER BY customer_id, order_date;
-- PERFORMANCE STATISTIEKEN (voorbeeld op 1M rijen):
-- Window Function: 2 seconden, 1 tabel scan
-- Self-Join: 45 seconden, O(n²) complexity
-- Correlated Subquery: 3+ minuten, n subqueries
-- USE CASE: Gap analysis (days between orders)
-- 1. WINDOW FUNCTION (CLEAN)
SELECT
customer_id,
order_date,
LAG(order_date) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS previous_order_date,
DATEDIFF(
day,
LAG(order_date) OVER (
PARTITION BY customer_id
ORDER BY order_date
),
order_date
) AS days_since_previous
FROM orders;
-- 2. ALTERNATIVE APPROACHES zijn veel complexer en minder efficient
-- KEY TAKEAWAYS:
-- 1. Window functions zijn meestal de beste keuze
-- 2. Ze verbeteren performance dramatisch
-- 3. Code is leesbaarder en onderhoudbaarder
-- 4. Minder kans op bugs en errors
-- 5. Betere optimizer support in moderne databases
Conclusie en Best Practices Checklist
Window Functions Cheat Sheet
Ranking Functions
ROW_NUMBER()- Unieke nummersRANK()- Gelijk bij ties, met gapsDENSE_RANK()- Gelijk bij ties, geen gapsNTILE(n)- Verdeel in n groepen
Analytic Functions
LEAD(expr, n)- Volgende rijLAG(expr, n)- Vorige rijFIRST_VALUE(expr)- Eerste in windowLAST_VALUE(expr)- Laatste in window
Aggregate Functions
SUM(expr) OVER- Running totalAVG(expr) OVER- Moving averageCOUNT(expr) OVER- Cumulative countMIN/MAX(expr) OVER- Running min/max
Mastery Checklist
- Begrijp het verschil tussen ROWS en RANGE
- Weet wanneer PARTITION BY te gebruiken
- Beheers window frame specificatie
- Kan complexe business scenarios oplossen
- Begrijp performance implicaties
- Kan window functions combineren
- Weet alternatieve benaderingen te vermijden
- Kan query's debuggen en optimaliseren
Veelgestelde Vragen (FAQ)
Q: Zijn window functions beschikbaar in alle databases?
A: Ja, window functions worden ondersteund in: PostgreSQL, MySQL 8.0+, SQL Server 2005+, Oracle, SQLite 3.25+, Snowflake, Redshift, BigQuery. Alleen MySQL vóór versie 8.0 heeft beperkte ondersteuning.
Q: Kan ik window functions gebruiken in WHERE clause?
A: Nee, window functions kunnen niet direct in WHERE gebruikt worden. Gebruik een subquery of CTE: SELECT * FROM (SELECT ..., ROW_NUMBER() OVER() rn FROM table) WHERE rn = 1
Q: Wat is het verschil tussen ROWS en RANGE bij dubbele waarden?
A: Bij dubbele ORDER BY waarden: ROWS behandelt elke rij apart, RANGE groepeert rijen met gelijke waarden. RANGE kan leiden tot onverwachte resultaten bij ties.
Q: Hoe sorteer ik window function resultaten?
A: Window function ORDER BY bepaalt de window volgorde. Query ORDER BY bepaalt de uiteindelijke uitvoervolgorde. Ze kunnen verschillend zijn.