
Conversation

@mrhhsg
Member

@mrhhsg mrhhsg commented Dec 22, 2025

What problem does this PR solve?

After column pruning, the subcolumns of a complex-type column fall into two categories:
1. Predicate columns: subcolumns required to evaluate filter predicates, which must be read upfront.
2. Non-predicate columns: subcolumns that are not needed to evaluate filter predicates.

Non-predicate columns can be read after predicate evaluation, and only for the rows that pass the filter, which may significantly reduce the amount of data read.
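
To illustrate the idea, here is a minimal, self-contained sketch of deferred reading. It is not the code in this PR: `SubColumn`, `ScanResult`, and `lazy_scan` are hypothetical names, the subcolumns are plain in-memory vectors, and `values_read` is only a stand-in for the amount of data read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one subcolumn of a complex-type column.
// `values_read` counts fetched values as a rough proxy for data read.
struct SubColumn {
    std::vector<int64_t> data;
    std::size_t values_read = 0;

    // Read every row; this is what a predicate subcolumn needs.
    std::vector<int64_t> read_all() {
        values_read += data.size();
        return data;
    }

    // Read only the listed rows; this is the deferred, post-filter read.
    std::vector<int64_t> read_rows(const std::vector<std::size_t>& row_ids) {
        std::vector<int64_t> out;
        out.reserve(row_ids.size());
        for (std::size_t id : row_ids) {
            assert(id < data.size());
            out.push_back(data[id]);
        }
        values_read += row_ids.size();
        return out;
    }
};

struct ScanResult {
    std::vector<int64_t> predicate_values;  // surviving values of the predicate subcolumn
    std::vector<int64_t> lazy_values;       // matching values of the non-predicate subcolumn
};

// Evaluate `predicate_col > threshold` first, then read the
// non-predicate subcolumn only for the rows that passed.
ScanResult lazy_scan(SubColumn& predicate_col, SubColumn& lazy_col, int64_t threshold) {
    assert(predicate_col.data.size() == lazy_col.data.size());
    ScanResult result;
    std::vector<std::size_t> selected;

    const std::vector<int64_t> all = predicate_col.read_all();
    for (std::size_t i = 0; i < all.size(); ++i) {
        if (all[i] > threshold) {
            selected.push_back(i);
            result.predicate_values.push_back(all[i]);
        }
    }

    result.lazy_values = lazy_col.read_rows(selected);
    return result;
}
```

In this sketch, the non-predicate subcolumn is touched only for the selected rows, so a selective filter reads far fewer values from it than a full scan would.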

This PR also removes references to olap/rowset/segment_v2/column_reader.h from many header files, so that changes to ColumnReader/ColumnIterator no longer trigger large-scale recompilation of source files.
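
One common way to drop such an include from a header is a forward declaration, sketched below in a single file. This is an assumption about the general technique, not a description of the actual changes in this PR: the `sketch` namespace, `SegmentScanner`, and the toy `ColumnIterator` body are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- Part that would live in a widely included header ----
namespace sketch {

// Forward declaration instead of including the full class definition.
class ColumnIterator;

class SegmentScanner {
public:
    SegmentScanner();
    ~SegmentScanner();  // defined below, where ColumnIterator is complete
    std::string describe() const;

private:
    // A pointer member only needs the declaration, not the definition.
    std::unique_ptr<ColumnIterator> _iter;
};

}  // namespace sketch

// ---- Part that would live in the .cpp file, the only place
// ---- that needs the full definition of ColumnIterator ----
namespace sketch {

// Toy definition standing in for the real class.
class ColumnIterator {
public:
    std::string name() const { return "column_iterator"; }
};

SegmentScanner::SegmentScanner() : _iter(new ColumnIterator()) {}
SegmentScanner::~SegmentScanner() = default;

std::string SegmentScanner::describe() const {
    assert(_iter != nullptr);
    return "scanner over " + _iter->name();
}

}  // namespace sketch
```

With this layout, editing the definition of the iterator class forces a rebuild of only the source files that include it directly, not every file that includes the scanner's header.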

Related PR: #xxx

Problem Summary:

Release note

Reading of non-predicate subcolumns of complex-type columns is deferred until after predicate evaluation, which may significantly reduce the amount of data read.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 22, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Member Author

mrhhsg commented Dec 22, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.50% (1765/2220)
Line Coverage 65.03% (31062/47763)
Region Coverage 65.53% (15491/23639)
Branch Coverage 56.17% (8239/14668)

@mrhhsg
Member Author

mrhhsg commented Dec 23, 2025

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.48% (1766/2222)
Line Coverage 64.80% (31237/48205)
Region Coverage 65.30% (15539/23798)
Branch Coverage 55.99% (8266/14764)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@mrhhsg
Member Author

mrhhsg commented Dec 29, 2025

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.52% (1771/2227)
Line Coverage 64.85% (31324/48299)
Region Coverage 65.42% (15591/23831)
Branch Coverage 56.02% (8288/14796)

@doris-robot

BE UT Coverage Report

Increment line coverage 38.84% (127/327) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.37% (18956/35520)
Line Coverage 39.26% (175857/447953)
Region Coverage 33.83% (136113/402320)
Branch Coverage 34.74% (58761/169125)

@mrhhsg
Member Author

mrhhsg commented Dec 30, 2025

run buildall

@mrhhsg
Member Author

mrhhsg commented Dec 30, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.38% (1771/2231)
Line Coverage 64.68% (31323/48424)
Region Coverage 65.24% (15584/23887)
Branch Coverage 55.89% (8289/14832)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 34393 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 1734d3f540e3ae83d84d4a498a6e5ff594aee288, data reload: false

------ Round 1 ----------------------------------
q1	17660	4235	4069	4069
q2	2021	377	254	254
q3	10179	1305	747	747
q4	10239	925	338	338
q5	7543	2162	1874	1874
q6	188	170	135	135
q7	934	830	686	686
q8	9280	1451	1115	1115
q9	6880	5153	5160	5153
q10	6754	1795	1436	1436
q11	521	323	271	271
q12	686	724	591	591
q13	17805	3808	3090	3090
q14	288	298	278	278
q15	572	507	497	497
q16	704	695	637	637
q17	730	743	640	640
q18	7569	7378	7345	7345
q19	887	970	608	608
q20	407	374	256	256
q21	4167	3922	3411	3411
q22	1081	1017	962	962
Total cold run time: 107095 ms
Total hot run time: 34393 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4084	4035	4022	4022
q2	335	394	318	318
q3	2070	2602	2243	2243
q4	1315	1764	1355	1355
q5	4121	4062	5061	4062
q6	209	174	133	133
q7	2093	1942	1719	1719
q8	2592	2349	2464	2349
q9	7346	7154	7025	7025
q10	2522	2692	2321	2321
q11	573	485	469	469
q12	719	815	641	641
q13	3713	4018	3314	3314
q14	292	307	279	279
q15	563	521	503	503
q16	681	713	645	645
q17	1204	1355	1373	1355
q18	7918	7921	7766	7766
q19	901	873	966	873
q20	1995	2109	1931	1931
q21	4768	4640	4268	4268
q22	1086	1014	1000	1000
Total cold run time: 51100 ms
Total hot run time: 48591 ms

@doris-robot

TPC-DS: Total hot run time: 174201 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1734d3f540e3ae83d84d4a498a6e5ff594aee288, data reload: false

query5	4811	607	457	457
query6	347	238	221	221
query7	4252	457	263	263
query8	362	258	241	241
query9	8776	2641	2644	2641
query10	513	378	326	326
query11	15095	14979	14757	14757
query12	192	119	111	111
query13	1258	489	411	411
query14	6500	2985	2725	2725
query14_1	2666	2607	2653	2607
query15	207	191	179	179
query16	997	479	453	453
query17	1071	701	591	591
query18	2695	438	341	341
query19	224	226	199	199
query20	122	119	120	119
query21	217	156	119	119
query22	3870	4005	3778	3778
query23	15915	15862	15371	15371
query23_1	15575	15427	15401	15401
query24	7551	1595	1208	1208
query24_1	1239	1197	1205	1197
query25	580	479	421	421
query26	1249	276	172	172
query27	2727	457	297	297
query28	4494	2193	2197	2193
query29	828	598	429	429
query30	306	237	218	218
query31	807	649	550	550
query32	81	71	69	69
query33	532	327	285	285
query34	889	878	533	533
query35	750	781	720	720
query36	872	877	811	811
query37	139	94	82	82
query38	2667	2699	2655	2655
query39	771	763	728	728
query39_1	705	701	722	701
query40	213	131	116	116
query41	70	64	64	64
query42	102	103	99	99
query43	440	473	383	383
query44	1376	770	767	767
query45	187	183	174	174
query46	875	992	613	613
query47	1413	1442	1283	1283
query48	323	324	261	261
query49	611	420	352	352
query50	647	282	205	205
query51	3784	3963	3788	3788
query52	110	119	100	100
query53	295	325	283	283
query54	279	263	241	241
query55	78	74	76	74
query56	280	293	278	278
query57	1037	1026	938	938
query58	259	257	250	250
query59	2152	2180	2146	2146
query60	315	319	294	294
query61	162	162	157	157
query62	401	371	310	310
query63	301	264	277	264
query64	4963	1293	994	994
query65	3776	3763	3741	3741
query66	1362	428	314	314
query67	15339	14707	15389	14707
query68	4810	1045	734	734
query69	495	337	302	302
query70	1075	979	930	930
query71	365	304	271	271
query72	6353	5150	5040	5040
query73	757	681	324	324
query74	8773	8824	8571	8571
query75	2903	2886	2491	2491
query76	3887	1064	658	658
query77	520	381	274	274
query78	9857	9951	9139	9139
query79	1018	864	610	610
query80	1141	568	492	492
query81	549	260	228	228
query82	405	147	109	109
query83	362	257	246	246
query84	251	117	99	99
query85	913	528	454	454
query86	389	287	321	287
query87	2887	2855	2739	2739
query88	3273	2286	2305	2286
query89	378	359	341	341
query90	1935	163	151	151
query91	173	173	143	143
query92	73	68	64	64
query93	1081	943	564	564
query94	634	326	287	287
query95	593	327	298	298
query96	594	461	217	217
query97	2311	2418	2263	2263
query98	214	199	204	199
query99	595	572	508	508
Total cold run time: 252828 ms
Total hot run time: 174201 ms

@doris-robot

ClickBench: Total hot run time: 28 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1734d3f540e3ae83d84d4a498a6e5ff594aee288, data reload: false

query1	0.05	0.04	0.05
query2	0.10	0.04	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.11
query5	0.28	0.26	0.27
query6	1.15	0.67	0.65
query7	0.03	0.02	0.03
query8	0.06	0.04	0.05
query9	0.56	0.50	0.49
query10	0.56	0.54	0.55
query11	0.16	0.11	0.11
query12	0.16	0.13	0.13
query13	0.61	0.60	0.61
query14	0.99	0.98	0.98
query15	0.80	0.79	0.80
query16	0.41	0.40	0.41
query17	1.09	1.05	1.03
query18	0.23	0.21	0.22
query19	1.84	1.87	1.85
query20	0.02	0.02	0.01
query21	15.41	0.26	0.14
query22	5.10	0.05	0.05
query23	15.92	0.30	0.11
query24	0.94	0.70	0.71
query25	0.07	0.08	0.09
query26	0.14	0.14	0.14
query27	0.08	0.08	0.08
query28	5.38	1.07	0.88
query29	12.67	4.01	3.16
query30	0.28	0.13	0.12
query31	2.83	0.65	0.38
query32	3.23	0.56	0.47
query33	2.93	2.99	3.01
query34	16.71	5.14	4.51
query35	4.50	4.92	5.01
query36	0.70	0.55	0.56
query37	0.11	0.07	0.06
query38	0.07	0.05	0.03
query39	0.04	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 98.41 s
Total hot run time: 28 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 38.81% (130/335) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.36% (18957/35529)
Line Coverage 39.25% (175956/448305)
Region Coverage 33.80% (136092/402611)
Branch Coverage 34.74% (58816/169288)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 82.39% (276/335) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.21% (25080/34734)
Line Coverage 58.96% (263613/447113)
Region Coverage 53.85% (219041/406774)
Branch Coverage 55.38% (94059/169850)



4 participants