这两个查询结果有什么区别?如果查询结果没有区别,第二个查询的choose部分有优化空间么? ```x...
創建於:2026年1月21日
使用 GPT-5.2 Thinking 回答 Chat01
創建於:2026年1月21日
使用 GPT-5.2 Thinking 回答 Chat01
这两个查询结果有什么区别?如果查询结果没有区别,第二个查询的choose部分有优化空间么?
xmlWITH ranked_records AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY date, replace_type, station_id, period_type, discharge_type, special_type ORDER BY CASE WHEN source_type = 'REPORTED' THEN 0 ELSE 1 END ) AS rn FROM power_station_discharge <where> <if test="replaceType != null"> AND replace_type = #{replaceType} </if> <if test="stationIds != null and stationIds.size > 0"> AND station_id IN <foreach collection="stationIds" item="item" separator="," open="(" close=")"> #{item} </foreach> </if> <if test="startDate != null"> AND date >= #{startDate} </if> <if test="endDate != null"> AND #{endDate} >= date </if> </where> ) SELECT * FROM ranked_records <where> <choose> <when test="sourceType != null and sourceType.name() == 'AUDIT'"> AND source_type = 'AUDIT' </when> <when test="sourceType == null"> </when> <otherwise> AND rn = 1 </otherwise> </choose> </where>
xml<select id="listByCondition" resultType="com.wiscom.discharge.model.entity.PowerStationDischarge"> WITH ranked_records AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY date, replace_type, station_id, period_type, discharge_type, special_type ORDER BY CASE WHEN source_type = 'REPORTED' THEN 0 ELSE 1 END ) AS rn FROM power_station_discharge <where> <if test="sourceType != null and sourceType.name() == 'AUDIT'"> AND source_type = 'AUDIT' </if> <if test="replaceType != null"> AND replace_type = #{replaceType} </if> <if test="stationIds != null and stationIds.size > 0"> AND station_id IN <foreach collection="stationIds" item="item" separator="," open="(" close=")"> #{item} </foreach> </if> <if test="startDate != null"> AND date >= #{startDate} </if> <if test="endDate != null"> AND #{endDate} >= date </if> </where> ) SELECT * FROM ranked_records <where> <choose> <when test="sourceType != null and sourceType.name() == 'AUDIT'"> </when> <when test="sourceType == null"> </when> <otherwise> AND rn = 1 </otherwise> </choose> </where> </select>
下面按 sourceType 的三种取值把两段 SQL 的“实际返回行集合”和“差异点”拆开说(因为差异主要集中在 AUDIT 场景的 rn 值 和 执行代价)。
sourceType = AUDITsource_type,先对“所有来源”的记录做 ROW_NUMBER();外层再 AND source_type='AUDIT'。
REPORTED,那 AUDIT 往往会是 rn=2/3...。AND source_type='AUDIT',只对 AUDIT 子集做 ROW_NUMBER();外层不再加 source_type 条件。
rn=1/2...(与第 1 段可能不同)。✅ 结论:** 返回的业务数据行集合基本一致**(都是 AUDIT 全量),但**rn 列的值可能不同**;并且第 2 段通常更省(窗口函数处理的行更少)。
如果你的
resultType实体里根本没有rn字段(通常没有),那“结果差异”主要就体现在性能/执行计划上,而不是业务字段值上。
sourceType = null两段都不会加 source_type='AUDIT',外层 choose 也不会加 rn=1,所以:
✅ 两段都返回:满足其他条件的全量记录(不去重)。
(很多人这里期望默认去重,但你现在的逻辑是“sourceType 为空就全量”。)
sourceType != null 且 != AUDIT(比如 REPORTED / 其它枚举)两段都不会在 CTE 里按 source_type=xxx 过滤,外层都会走 otherwise AND rn=1:
✅ 两段都返回:每个分组最多 1 条(优先 REPORTED,否则其它)。
⚠️ 注意:此时 并不会保证返回的 source_type 等于你传入的 sourceType;传入值只用于“是否去重”,不用于“按该来源过滤”。
<choose> 有优化空间吗?有,而且是“可读性 + 少生成空分支”的优化:你现在的 <choose> 只有 otherwise 真正输出 SQL,前两个 <when> 是空的,本质就是:
仅当
sourceType != null && sourceType != AUDIT时加AND rn = 1
直接改成一个 <if> 更清晰:
xmlSELECT * FROM ranked_records <where> <if test="sourceType != null and sourceType.name() != 'AUDIT'"> AND rn = 1 </if> </where>
这样可以把整个 <choose> 去掉(功能等价于你第 2 段当前写法)。
把“是否 AUDIT 过滤”留在 CTE(你第 2 段做得对,能减少窗口函数工作量),外层只负责“是否 rn=1 去重”:
xmlWITH ranked_records AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY date, replace_type, station_id, period_type, discharge_type, special_type ORDER BY CASE WHEN source_type = 'REPORTED' THEN 0 ELSE 1 END ) AS rn FROM power_station_discharge <where> <if test="sourceType != null and sourceType.name() == 'AUDIT'"> AND source_type = 'AUDIT' </if> ...其它条件... </where> ) SELECT * FROM ranked_records <where> <if test="sourceType != null and sourceType.name() != 'AUDIT'"> AND rn = 1 </if> </where>
sourceType 情况下基本一致;sourceType=AUDIT 时返回行相同,但 rn 值可能不同;when 建议改成单个 <if> 输出 rn=1 条件。查询SQL如下,表结构和查询计划如下,瓶颈在哪?有哪些优化方案?
sqlWITH ranked_records AS ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY date, replace_type, station_id, period_type, discharge_type, special_type ORDER BY CASE WHEN source_type = 'REPORTED' THEN 0 ELSE 1 END ) AS rn FROM power_station_discharge WHERE date BETWEEN '2025-01-01' AND '2025-12-01' AND source_type = 'AUDIT' AND replace_type = 'POWER_REPLACE_DEFAULT' AND station_id IN (200000053, 200000054, 200000032, 102000530, 200000100, 102000528, 200000091, 200000074, 200000058, 200000062, 102000529, 102000511, 200000078, 200000089, 200000038, 102000406, 200000098, 103000001, 102000205, 200000073, 200000069, 102000287, 102000446, 200000057, 200000029, 200000061, 200000033, 200000036, 200000059, 102000365, 102000411, 102000377, 102000427, 102000413, 102000412, 102000515, 200000044, 102000501, 200000067, 200000048, 102000517, 102000430, 102000410, 102000032, 102000072, 102000009, 102000342, 102000418, 200000019, 102000527, 102000354, 102000395, 102000070, 200000011, 102000518, 200000097, 102000426, 102000328, 200000021, 102000521, 200000027, 102000439, 102000381, 102000386, 102000066, 200000092, 200000042, 102000349, 200000034, 200000047, 200000016, 102000436, 102000265, 102000306, 102000245, 200000080, 200000077, 200000099, 102000061, 102000356, 102000048, 102000041, 102000015, 102000012, 102000034, 102000053, 102000044, 102000038, 102000429, 102000433, 102000417, 102000382, 102000371, 102000010, 200000093, 200000084, 200000088, 102000525, 200000082, 200000087, 102000357, 102000325, 200000085, 102000369, 102000415, 102000401, 102000403, 102000392, 102000344, 102000337, 102000347, 102000285, 102000071, 102000185, 102000361, 102000341, 102000397, 102000396, 102000513, 200000094, 102000326, 102000065, 102000389, 102000419, 102000368, 102000444, 102000425, 102000037, 102000423, 102000374, 102000067, 102000327, 200000028, 200000066, 200000043, 102000391, 102000384, 200000086, 102000526, 102000348, 102000058, 102000338, 102000504, 102000040, 102000054, 102000400, 102000335, 102000435, 102000432, 102000522, 102000339, 102000051, 102000330, 102000398, 102000027, 102000393, 102000405, 102000409, 102000443, 102000045, 102000524, 102000503, 102000043, 200000030, 200000046, 102000532, 102000307, 102000380, 102000506, 102000226, 102000388, 102000364, 102000376, 102000308, 102000366, 102000028, 102000362, 200000000, 102000363, 102000343, 102000016, 102000031, 102000060, 102000390, 102000014, 102000422, 102000519, 102000228, 102000447, 102000428, 102000360, 102000523, 200000060, 102000373, 102000068, 102000370, 102000029, 102000346, 102000375, 102000505, 102000047, 102000445, 102000500, 102000510, 102000502, 102000207, 102000206, 200000083, 200000075, 102000056, 102000050, 200000023, 102000437, 102000358, 102000011, 102000340, 102000394, 102000448, 200000020, 102000449, 200000018, 200000095, 200000090, 200000096, 102000186, 102000017, 102000052, 102000036, 200000026, 102000059, 102000030, 102000531, 200000015, 200000051, 200000039, 200000065, 200000025, 102000055, 102000512, 102000230, 102000026, 102000441, 102000355, 200000064, 200000031, 200000072, 200000040, 102000046, 102000062, 102000404, 102000534, 102000520, 102000064, 102000305, 102000345, 102000509, 102000351, 102000399, 200000063, 200000056, 200000070, 102000416, 200000035, 200000071, 200000050, 200000055, 200000041, 200000037, 200000045, 102000378, 102000334, 102000286, 102000379, 102000333, 102000533, 102000350, 102000033, 102000013, 102000063, 102000408, 102000431, 102000372, 102000442, 102000438, 102000069, 200000017, 200000049, 200000079, 200000022, 102000508, 102000516, 102000266, 102000329, 102000352, 102000359, 102000042, 102000336, 102000331, 102000074, 102000252, 200000052, 200000076, 102000421, 102000420, 102000414, 102000383, 102000387, 102000424, 102000367, 200000024, 102000035, 102000402, 102000507, 102000434, 102000229, 102000008, 102000049, 102000407, 102000353, 102000057, 102000385, 102000073, 200000081) ) SELECT * FROM ranked_records
表结构
sql-- auto-generated definition create table power_station_discharge ( date date not null, source_type varchar(64) not null, replace_type varchar(64) not null, station_id bigint not null, period_type varchar(64) not null, discharge_type varchar(64) not null, special_type varchar(64) not null, measured_electricity numeric(10, 4), predict_electricity numeric(10, 4), discharge numeric(10, 4), create_time timestamp(0), update_time timestamp(0) ); comment on table power_station_discharge is '每日单站弃电记录表'; comment on column power_station_discharge.date is '日期'; comment on column power_station_discharge.source_type is '功率来源类型'; comment on column power_station_discharge.replace_type is '替换类型'; comment on column power_station_discharge.station_id is '场站id'; comment on column power_station_discharge.period_type is '弃电来源:(弃电时段类型)'; comment on column power_station_discharge.discharge_type is '弃电类型(特殊/常规)'; comment on column power_station_discharge.special_type is '特殊弃电类型(市场交易决策不当、并网技术条件不满足、特殊天气、无特殊类型)'; comment on column power_station_discharge.measured_electricity is '弃电时段内实测电量(万千瓦时)'; comment on column power_station_discharge.predict_electricity is '弃电时段内预测电量(万千瓦时)'; comment on column power_station_discharge.discharge is '总弃电量(万千瓦时)'; comment on column power_station_discharge.create_time is '创建时间'; comment on column power_station_discharge.update_time is '更新时间'; alter table power_station_discharge owner to postgres; create unique index power_station_discharge_unique_index on power_station_discharge (date, source_type, replace_type, station_id, period_type, discharge_type, special_type); create index power_station_discharge_index on power_station_discharge (date, replace_type, station_id, period_type, discharge_type, special_type);
查询计划
txtINSERT INTO "MY_TABLE"("QUERY PLAN") VALUES ('Subquery Scan on ranked_records (cost=40916.33..70403.65 rows=185494 width=129)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' -> WindowAgg (cost=40916.33..68548.71 rows=185494 width=133)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' -> Gather Merge (cost=40916.33..62520.16 rows=185494 width=125)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' Workers Planned: 2'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' -> Sort (cost=39916.30..40109.52 rows=77289 width=125)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' Sort Key: power_station_discharge.date, power_station_discharge.station_id, power_station_discharge.period_type, power_station_discharge.discharge_type, power_station_discharge.special_type, (CASE WHEN ((power_station_discharge.source_type)::text = ''REPORTED''::text) THEN 0 ELSE 1 END)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' -> Parallel Seq Scan on power_station_discharge (cost=0.81..28618.72 rows=77289 width=125)'); INSERT INTO "MY_TABLE"("QUERY PLAN") VALUES (' Filter: ((date >= ''2025-01-01''::date) AND (date <= ''2025-12-01''::date) AND ((source_type)::text = ''AUDIT''::text) AND ((replace_type)::text = ''POWER_REPLACE_DEFAULT''::text) AND (station_id = ANY (''{200000053,200000054,200000032,102000530,200000100,102000528,200000091,200000074,200000058,200000062,102000529,102000511,200000078,200000089,200000038,102000406,200000098,103000001,102000205,200000073,200000069,102000287,102000446,200000057,200000029,200000061,200000033,200000036,200000059,102000365,102000411,102000377,102000427,102000413,102000412,102000515,200000044,102000501,200000067,200000048,102000517,102000430,102000410,102000032,102000072,102000009,102000342,102000418,200000019,102000527,102000354,102000395,102000070,200000011,102000518,200000097,102000426,102000328,200000021,102000521,200000027,102000439,102000381,102000386,102000066,200000092,200000042,102000349,200000034,200000047,200000016,102000436,102000265,102000306,102000245,200000080,200000077,200000099,102000061,102000356,102000048,102000041,102000015,102000012,102000034,102000053,102000044,102000038,102000429,102000433,102000417,102000382,102000371,102000010,200000093,200000084,200000088,102000525,200000082,200000087,102000357,102000325,200000085,102000369,102000415,102000401,102000403,102000392,102000344,102000337,102000347,102000285,102000071,102000185,102000361,102000341,102000397,102000396,102000513,200000094,102000326,102000065,102000389,102000419,102000368,102000444,102000425,102000037,102000423,102000374,102000067,102000327,200000028,200000066,200000043,102000391,102000384,200000086,102000526,102000348,102000058,102000338,102000504,102000040,102000054,102000400,102000335,102000435,102000432,102000522,102000339,102000051,102000330,102000398,102000027,102000393,102000405,102000409,102000443,102000045,102000524,102000503,102000043,200000030,200000046,102000532,102000307,102000380,102000506,102000226,102000388,102000364,102000376,102000308,102000366,102000028,102000362,200000000,102000363,102000343,102000016,102000031,102000060,102000390,102000014,102000422,102000519,102000228,102000447,102000428,102000360,102000523,200000060,102000373,102000068,102000370,102000029,102000346,102000375,102000505,102000047,102000445,102000500,102000510,102000502,102000207,102000206,200000083,200000075,102000056,102000050,200000023,102000437,102000358,102000011,102000340,102000394,102000448,200000020,102000449,200000018,200000095,200000090,200000096,102000186,102000017,102000052,102000036,200000026,102000059,102000030,102000531,200000015,200000051,200000039,200000065,200000025,102000055,102000512,102000230,102000026,102000441,102000355,200000064,200000031,200000072,200000040,102000046,102000062,102000404,102000534,102000520,102000064,102000305,102000345,102000509,102000351,102000399,200000063,200000056,200000070,102000416,200000035,200000071,200000050,200000055,200000041,200000037,200000045,102000378,102000334,102000286,102000379,102000333,102000533,102000350,102000033,102000013,102000063,102000408,102000431,102000372,102000442,102000438,102000069,200000017,200000049,200000079,200000022,102000508,102000516,102000266,102000329,102000352,102000359,102000042,102000336,102000331,102000074,102000252,200000052,200000076,102000421,102000420,102000414,102000383,102000387,102000424,102000367,200000024,102000035,102000402,102000507,102000434,102000229,102000008,102000049,102000407,102000353,102000057,102000385,102000073,200000081}''::bigint[])))');
先把执行计划“翻译”一下,你这条 SQL 的主要代价都花在 扫表 + 为 WindowAgg 排序 上。
计划关键链路是:
Parallel Seq Scan on power_station_discharge(并行顺序扫表)
说明优化器判断:用现有索引不划算(或统计信息认为命中比例较高),所以直接把表的大段数据读出来再过滤。
这里估算过滤后仍有 ~185k 行(rows=185494)。
Sort(每个 worker 排序) + Gather Merge(合并)
WindowAgg 计算 row_number() 必须要求输入按 partition key + order key 有序,所以出现了大排序:
Sort Key: date, station_id, period_type, discharge_type, special_type, CASE...
WindowAgg
本身计算 row_number 的 CPU 也有,但通常不如“排序 + I/O”贵;你这里成本上也能看出来:Sort 段的 cost 从 28618 拉到 39916,再到 Gather Merge 62520,WindowAgg 68548。
👉 结论:瓶颈主要是并行顺序扫表带来的 I/O + 为 WindowAgg 做的全量排序(可能还会 spill 到磁盘)。
你 WHERE 里写死了:
sqlAND source_type = 'AUDIT'
同时表上有唯一索引:
sql(date, source_type, replace_type, station_id, period_type, discharge_type, special_type) UNIQUE
这意味着:在 source_type 固定为 AUDIT 的子集内,同一组
(date, replace_type, station_id, period_type, discharge_type, special_type)
最多只有 1 行(因为 source_type 已固定,唯一性就落在其余列上)。
所以你现在的:
sqlROW_NUMBER() OVER (PARTITION BY date, replace_type, station_id, period_type, discharge_type, special_type ...)
在当前查询条件下几乎等价于:每行 rn 都是 1。
也就是说:你为了算一个恒等于 1 的列,付出了“全量排序 + WindowAgg”的代价。
✅ 如果你不需要 rn(你外层也没用 rn 过滤),直接删掉窗口:
sqlSELECT * FROM power_station_discharge WHERE date BETWEEN '2025-01-01' AND '2025-12-01' AND source_type = 'AUDIT' AND replace_type = 'POWER_REPLACE_DEFAULT' AND station_id = ANY('{...}'::bigint[]);
如果你只是为了带个 rn 字段给前端/调试,用常量即可:
sqlSELECT *, 1 AS rn FROM power_station_discharge WHERE ...
这一步通常就是“数量级”的优化,因为直接消灭了 Sort + WindowAgg。
你现有两个索引:
(date, source_type, replace_type, station_id, period_type, discharge_type, special_type)(date, replace_type, station_id, period_type, discharge_type, special_type)但你的过滤条件是 date 范围(很大,接近一年)+ source_type/replace_type 等值 + station_id 大 IN 列表。
A. 如果你保留 AUDIT + POWER_REPLACE_DEFAULT 这种固定条件很常见
建一个**部分索引(partial index)**最划算,体积小、命中准:
sqlCREATE INDEX CONCURRENTLY psd_audit_default_idx ON power_station_discharge (date, station_id, period_type, discharge_type, special_type) WHERE source_type = 'AUDIT' AND replace_type = 'POWER_REPLACE_DEFAULT';
这样能显著降低扫表范围;并且键顺序也贴近你计划里的排序键(replace_type 固定后它本来也不在 sort key 里)。
B. 如果 source_type、replace_type 经常作为等值过滤,date 是范围
把等值列放前面通常更好:
sqlCREATE INDEX CONCURRENTLY psd_st_rt_date_station_idx ON power_station_discharge (source_type, replace_type, date, station_id, period_type, discharge_type, special_type);
这类索引更容易把扫描范围限制在 (AUDIT, POWER_REPLACE_DEFAULT) 这一个“子空间”内,再做 date 范围。
注意:索引不是越多越好,优先选最贴合你线上最常见的谓词组合的一条。
sqlWITH s AS (SELECT unnest($1::bigint[]) AS station_id) SELECT p.* FROM power_station_discharge p JOIN s USING (station_id) WHERE ...
只取需要的列,别 SELECT *
排序/传输的行宽是 125B+,列越多排序越吃内存、越容易 spill。
用 EXPLAIN (ANALYZE, BUFFERS) 确认是否“Sort spill”
如果看到 Sort Method: external merge 或磁盘读写很高,那提高该会话 work_mem(针对这条查询)可能立竿见影;但它解决不了“窗口函数本可删”的根因。
按 date 分区(按月/季度)
如果表很大且所有查询都带 date 范围,分区能让扫描范围直接变小(分区裁剪)。
如果你把 EXPLAIN (ANALYZE, BUFFERS) 的真实执行结果也贴一下(尤其是 Sort Method / shared hit/read),我可以再把“到底是 I/O 还是 sort spill”定得更准,并给出更具体的索引列顺序建议。