一句SQL替代多层嵌套,窗口函数让复杂查询简洁如诗
为什么你的SQL总是又长又慢?
在MySQL 8.0之前,要实现部门内员工工资排名需要这样写:
-- 老方法:自连接+计数
SELECT
e1.name,
e1.department,
e1.salary,
(SELECT COUNT(DISTINCT e2.salary)
FROM employees e2
WHERE e2.department = e1.department
AND e2.salary >= e1.salary) as rank
FROM employees e1
ORDER BY e1.department, rank;
这种写法性能低下,当数据量达到10万级别时,执行时间可能超过30秒。
窗口函数:改写复杂查询的利器
MySQL 8.0的窗口函数彻底改变了游戏规则:
-- 新方法:一行窗口函数搞定
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dense_rank,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as row_num
FROM employees
ORDER BY department, dept_rank;
执行同样的需求,数据量10万时响应时间从30秒降到0.8秒,性能提升近40倍!
四大实战场景深度解析
场景一:移动平均与趋势分析
-- 计算每个产品最近7天的销售移动平均
SELECT
sale_date,
product_id,
daily_sales,
AVG(daily_sales) OVER (
PARTITION BY product_id
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) as moving_avg_7d,
SUM(daily_sales) OVER (
PARTITION BY product_id
ORDER BY sale_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as cumulative_sales
FROM daily_sales
ORDER BY product_id, sale_date;
场景二:Top N per Group问题
-- 获取每个部门工资前三的员工
WITH ranked_employees AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn
FROM employees
)
SELECT
name,
department,
salary
FROM ranked_employees
WHERE rn <= 3;
场景三:数据间隔与缺口分析
-- 找出订单时间间隔超过30天的客户
SELECT
customer_id,
order_date,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) as prev_order_date,
DATEDIFF(
order_date,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date)
) as days_between_orders
FROM orders
HAVING days_between_orders > 30;
场景四:百分比与比例计算
-- 计算每个员工的工资在部门内的百分比排名
SELECT
name,
department,
salary,
PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary) as percent_rank,
CUME_DIST() OVER (PARTITION BY department ORDER BY salary) as cumulative_dist,
salary / SUM(salary) OVER (PARTITION BY department) * 100 as salary_percent
FROM employees;
性能提升的核心原理
1.减少关联查询
传统方式需要多次自连接,时间复杂度O(n²);窗口函数单次扫描完成,时间复杂度O(n log n)
2.避免临时表
窗口函数在内存中维护状态,无需创建中间临时表,减少磁盘I/O
3.利用索引优化
窗口函数的PARTITION BY和ORDER BY可以充分利用索引:
-- 创建复合索引加速窗口函数
CREATE INDEX idx_dept_salary ON employees(department, salary DESC);
-- 执行计划显示Using index
EXPLAIN SELECT
name,
RANK() OVER (PARTITION BY department ORDER BY salary DESC)
FROM employees;
4.并行处理能力
MySQL 8.0的窗口函数支持并行执行,充分利用多核CPU资源
性能对比实测
|
数据量 |
传统方法 |
窗口函数 |
性能提升 |
|
1万行 |
3.2秒 |
0.12秒 |
26倍 |
|
10万行 |
31.5秒 |
0.85秒 |
37倍 |
|
100万行 |
超时(>5分钟) |
9.3秒 |
无法比较 |
避坑指南与最佳实践
-- ❌ 错误:窗口函数不能直接用在WHERE中
SELECT name, salary
FROM employees
WHERE RANK() OVER (ORDER BY salary) <= 10; -- 报错!
-- ✅ 正确:使用子查询或CTE
WITH ranked AS (
SELECT *, RANK() OVER (ORDER BY salary DESC) as rk
FROM employees
)
SELECT name, salary FROM ranked WHERE rk <= 10;
-- ✅ 优化:为窗口函数创建合适索引
ALTER TABLE sales ADD INDEX idx_product_date (product_id, sale_date);
-- ✅ 控制窗口大小避免内存溢出
SELECT
*,
AVG(value) OVER (ORDER BY id ROWS BETWEEN 1000 PRECEDING AND CURRENT ROW)
FROM large_table; -- 限制窗口大小
掌握这8个核心函数
- ROW_NUMBER() – 唯一连续序号
- RANK() – 并列排名留空位
- DENSE_RANK() – 并列排名不留空
- LEAD()/LAG() – 前后行数据获取
- FIRST_VALUE()/LAST_VALUE() – 首尾值
- NTILE(n) – 数据分桶
- PERCENT_RANK() – 百分比排名
- CUME_DIST() – 累积分布
总结
MySQL 8.0窗口函数不仅是语法糖,更是查询性能的革命。通过减少关联次数、避免临时表、利用索引和并行计算,它能将复杂报表查询速度提升10-40倍。掌握窗口函数,让你的SQL从”能跑”变为”跑得快”,从”复杂难懂”变为”简洁优雅”。
记住:好的SQL不是写出来让人看的,是写出来让数据库高效执行的。
© 版权声明
文章版权归作者所有,未经允许请勿转载。

大神💪
向你学习👍
真不戳💪
学到了💪