feat: eagerly compute pruning stats during compression #1252

Merged: 1 commit, Nov 8, 2024

lwwmanning
Member

Follow-up to #1236.

@lwwmanning added the benchmark label (Run benchmarks on this branch) on Nov 7, 2024
@github-actions bot removed the benchmark label (Run benchmarks on this branch) on Nov 8, 2024
Contributor

@github-actions github-actions bot left a comment

Vortex bytes_at

Benchmark suite Current: 31513b9 Previous: c1cee33 Ratio
bytes_at/array_data 751.8652911676717 ns (1.3279552215167314) 775.6118839714973 ns (0.9937622936375874) 0.97
bytes_at/array_view 555.9455177032874 ns (0.8903047974681613) 543.4684647416859 ns (0.8252923547032651) 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

DataFusion

Benchmark suite Current: 31513b9 Previous: c1cee33 Ratio
arrow/planning 774535.2969350134 ns (1256.1212944823783) 764650.3614937436 ns (1304.172342669277) 1.01
arrow/exec 1368238.6903050137 ns (5824.477684667683) 1335917.2935235286 ns (5271.0427332231775) 1.02
vortex-pushdown-compressed/planning 476619.7119328526 ns (2095.4447334305733) 472785.5267447528 ns (1910.9006836801418) 1.01
vortex-pushdown-compressed/exec 2528927.8170000003 ns (10969.574624999426) 2500102.6895000003 ns (11231.593043750385) 1.01
vortex-pushdown-uncompressed/planning 474508.41588625667 ns (1982.231791684986) 474078.5910658773 ns (880.867603579798) 1.00
vortex-pushdown-uncompressed/exec 2894398.2149999994 ns (4577.093993055401) 2907701.167222223 ns (5542.9613958320115) 1.00
vortex-nopushdown-compressed/planning 782295.3736920472 ns (1665.622723953682) 782596.969157242 ns (1407.7445595603203) 1.00
vortex-nopushdown-compressed/exec 2982558.132941177 ns (20995.097558822716) 2934958.965 ns (20906.028979166644) 1.02
vortex-nopushdown-uncompressed/planning 782847.6706828598 ns (1727.7846746249706) 778858.2626122492 ns (1572.181782282947) 1.01
vortex-nopushdown-uncompressed/exec 4558269.16090909 ns (12688.94203409087) 4466286.363333335 ns (9392.727322918363) 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

Random Access

Benchmark suite Current: 31513b9 Previous: c1cee33 Ratio
random-access/vortex-tokio-local-disk 17853994.7 ns (56577.68491666019) 18351892.22333333 ns (74755.44233332947) 0.97
random-access/vortex-local-fs 19358382.823333334 ns (45611.47883333638) 19546286.75999999 ns (55486.97500000149) 0.99
random-access/parquet-tokio-local-disk 222758299.7 ns (2896149.977916658) 213564154.5 ns (3185410.886250019) 1.04

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

Vortex Compression

Benchmark suite Current: 31513b9 Previous: c1cee33 Ratio
compress time/taxi 1745531046.5 ns (2489174.600000024) 1295362143.3 ns (2166932.399999976) 1.35
compress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd compress time/taxi 1785223295 ns (3043911.428749919) 1822095921.3 ns (3020737.4512500763) 0.98
parquet_rs-zstd compress time/taxi throughput 470808924 bytes 470808924 bytes 1
decompress time/taxi 417755254.75 ns (5658751.847499996) 409808158.05 ns (1722490.540625006) 1.02
decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd decompress time/taxi 337744790.15 ns (897348.2124999762) 308466848.65 ns (427431.53687500954) 1.09
parquet_rs-zstd decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
vortex:parquet-zstd size/taxi 1.0372763367792568 ratio 1.0299843765663863 ratio 1.01
vortex:raw size/taxi 0.12329222544643186 ratio 0.12242495216594493 ratio 1.01
vortex size/taxi 58047080 bytes 57638760 bytes 1.01
compress time/AirlineSentiment 876210.2091746794 ns (1997.4042628205498) 824226.5720406121 ns (877.1890923065948) 1.06
compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd compress time/AirlineSentiment 56732.44055718534 ns (98.37135357587613) 57545.62648714681 ns (229.42584627948236) 0.99
parquet_rs-zstd compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
decompress time/AirlineSentiment 44212.42656592516 ns (134.64781213069728) 44286.91494018808 ns (164.4529665099326) 1.00
decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd decompress time/AirlineSentiment 33648.671502637655 ns (63.01047780147928) 33131.548060163754 ns (106.07892718463336) 1.02
parquet_rs-zstd decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
vortex:parquet-zstd size/AirlineSentiment 9.042399172699069 ratio 9.042399172699069 ratio 1
vortex:raw size/AirlineSentiment 4.328712871287129 ratio 4.328712871287129 ratio 1
vortex size/AirlineSentiment 8744 bytes 8744 bytes 1
compress time/Arade 3598219792.4 ns (5449954.549999952) 2429903296.7 ns (4382143.078749895) 1.48
compress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd compress time/Arade 3090081067.7 ns (5429209.90500021) 3099221087.5 ns (5237821.657500029) 1.00
parquet_rs-zstd compress time/Arade throughput 787023760 bytes 787023760 bytes 1
decompress time/Arade 739934913.2 ns (4360929.910000026) 757883709.9 ns (2184501.131250024) 0.98
decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd decompress time/Arade 723529426.5 ns (3895366.286249995) 649649856.1 ns (2087032.862500012) 1.11
parquet_rs-zstd decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
vortex:parquet-zstd size/Arade 0.5033079295106381 ratio 0.500537677423841 ratio 1.01
vortex:raw size/Arade 0.1953013362646129 ratio 0.19422621751597435 ratio 1.01
vortex size/Arade 153706792 bytes 152860648 bytes 1.01
compress time/Bimbo 16653761368 ns (54449165.24374962) 11793601990.3 ns (13185647.921250343) 1.41
compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd compress time/Bimbo 22206307630.6 ns (38665666.377500534) 21617435281.9 ns (39866855.163749695) 1.03
parquet_rs-zstd compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
decompress time/Bimbo 4835058232.9 ns (47522830.849999905) 4925098356.9 ns (64536752.628750324) 0.98
decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd decompress time/Bimbo 4106230015.4 ns (15817208.003750086) 2645991218.1 ns (9118653.950000048) 1.55
parquet_rs-zstd decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
vortex:parquet-zstd size/Bimbo 1.3849282819002102 ratio 1.4422883961876998 ratio 0.96
vortex:raw size/Bimbo 0.07548603753040185 ratio 0.07861247215986346 ratio 0.96
vortex size/Bimbo 537561256 bytes 559825640 bytes 0.96
compress time/CMSprovider 18126148716.4 ns (47354529.64999962) 14154839403.5 ns (16396633.357499123) 1.28
compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd compress time/CMSprovider 20534232402.2 ns (60181205.49499893) 21216726035.5 ns (20048893.049999237) 0.97
parquet_rs-zstd compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
decompress time/CMSprovider 6282558464.5 ns (17888337) 6230372137.4 ns (19174193.82499981) 1.01
decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd decompress time/CMSprovider 6014519493.5 ns (29172070.69750023) 6088492492.9 ns (17661748.467499733) 0.99
parquet_rs-zstd decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
vortex:parquet-zstd size/CMSprovider 1.250944227267517 ratio 1.2487125445973772 ratio 1.00
vortex:raw size/CMSprovider 0.1869473873090075 ratio 0.18661487870910384 ratio 1.00
vortex size/CMSprovider 962615272 bytes 960903144 bytes 1.00
compress time/Euro2016 3149746758.7 ns (7328379.622499943) 2772666256.1 ns (2803495.6124999523) 1.14
compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd compress time/Euro2016 1591406478.6 ns (4375298.88499999) 1595842111.4 ns (1777385.4812499285) 1.00
parquet_rs-zstd compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
decompress time/Euro2016 313786091.7 ns (1872628.7243750095) 319594685.05 ns (1476529.6337500215) 0.98
decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd decompress time/Euro2016 499698827.7 ns (3398008.832499981) 488895856.55 ns (1295473.9118750095) 1.02
parquet_rs-zstd decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
vortex:parquet-zstd size/Euro2016 1.4512502130753013 ratio 1.4502231114782558 ratio 1.00
vortex:raw size/Euro2016 0.43874779604157393 ratio 0.43843727855950604 ratio 1.00
vortex size/Euro2016 172538984 bytes 172416872 bytes 1.00
compress time/Food 1601089528.1 ns (2953086.517500043) 1214405935.4 ns (3384823.1050001383) 1.32
compress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd compress time/Food 1121116869.1 ns (1266357.3999999762) 1128812684.1 ns (1992325.8824999332) 0.99
parquet_rs-zstd compress time/Food throughput 332718229 bytes 332718229 bytes 1
decompress time/Food 204330500.23333332 ns (1155471.8820833415) 201872746.8333333 ns (1108520.8050000072) 1.01
decompress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd decompress time/Food 227671777.6666667 ns (1118027.4295833409) 215266978.3 ns (279072.1008333564) 1.06
parquet_rs-zstd decompress time/Food throughput 332718229 bytes 332718229 bytes 1
vortex:parquet-zstd size/Food 1.315784457488326 ratio 1.3132884053892004 ratio 1.00
vortex:raw size/Food 0.14327705501221577 ratio 0.1430052574606605 ratio 1.00
vortex size/Food 47670888 bytes 47580456 bytes 1.00
compress time/HashTags 3043314601.8 ns (4514786.694999933) 2718792728.4 ns (2054576.100000143) 1.12
compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd compress time/HashTags 2536192389 ns (5644803.382500172) 2558436635.4 ns (3666519.4937500954) 0.99
parquet_rs-zstd compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
decompress time/HashTags 561668567.4 ns (1915045.7475000024) 567806612.1 ns (3390370.367500007) 0.99
decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd decompress time/HashTags 789488391.7 ns (6791373.443750024) 804546286.1 ns (3040899.2037499547) 0.98
parquet_rs-zstd decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
vortex:parquet-zstd size/HashTags 1.6855648011699251 ratio 1.6873911818698275 ratio 1.00
vortex:raw size/HashTags 0.28068211963552936 ratio 0.2809862505747577 ratio 1.00
vortex size/HashTags 225807528 bytes 226052200 bytes 1.00
compress time/TPC-H l_comment chunked without fsst 4231990185 ns (24612574.023750067) 3774158295.2 ns (7205992.347500086) 1.12
compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst 906790274.8 ns (2290953.57250005) 917289782.4 ns (1653678.4399999976) 0.99
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
decompress time/TPC-H l_comment chunked without fsst 110496651.95011905 ns (644039.732416682) 109687119.71039684 ns (955723.5676795617) 1.01
decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst 251644853.75 ns (596100.075000003) 251855072.3 ns (859853.924999997) 1.00
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 249197090 bytes 1
vortex:parquet-zstd size/TPC-H l_comment chunked without fsst 4.609519593091686 ratio 4.6098764117128965 ratio 1.00
vortex:raw size/TPC-H l_comment chunked without fsst 1.0532141286240542 ratio 1.053185107418389 ratio 1.00
vortex size/TPC-H l_comment chunked without fsst 262457896 bytes 262450664 bytes 1.00
compress time/TPC-H l_comment chunked 1340328374.6 ns (5812067.313750029) 898236917.7 ns (1253379.8500000238) 1.49
compress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment chunked 903752987.3 ns (3310495.918749988) 914741113.6 ns (1860149.264999926) 0.99
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
decompress time/TPC-H l_comment chunked 120926946.83345239 ns (437042.4976666719) 121719296.22396827 ns (592369.3443978205) 0.99
decompress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment chunked 250784332.1 ns (1385461.800000012) 252565363.4 ns (1131850.075000003) 0.99
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 249197090 bytes 249197090 bytes 1
vortex:parquet-zstd size/TPC-H l_comment chunked 1.3523276240821809 ratio 1.3522132690473394 ratio 1.00
vortex:raw size/TPC-H l_comment chunked 0.3089889372303665 ratio 0.3089303811693788 ratio 1.00
vortex size/TPC-H l_comment chunked 76999144 bytes 76984552 bytes 1.00
compress time/TPC-H l_comment canonical 1341216913.5 ns (1761867.0193749666) 894183675.5 ns (867472.6331250072) 1.50
compress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
parquet_rs-zstd compress time/TPC-H l_comment canonical 908882137.25 ns (4118742.708124995) 921470197 ns (1682253.2881250381) 0.99
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
decompress time/TPC-H l_comment canonical 119914879.17624338 ns (465950.16332012415) 121655248.20451057 ns (403593.7212584391) 0.99
decompress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
parquet_rs-zstd decompress time/TPC-H l_comment canonical 248005809.70843259 ns (442156.86083334684) 251847463.52624997 ns (897719.0429166853) 0.98
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 249197106 bytes 249197106 bytes 1
vortex:parquet-zstd size/TPC-H l_comment canonical 1.352336364427018 ratio 1.352209373850611 ratio 1.00
vortex:raw size/TPC-H l_comment canonical 0.3089889173913601 ratio 0.30893036133413204 ratio 1.00
vortex size/TPC-H l_comment canonical 76999144 bytes 76984552 bytes 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

TPC-H

Benchmark suite Current: 31513b9 Previous: c1cee33 Ratio
tpch_q1/vortex-in-memory-no-pushdown 569252559.2 ns (1908177.2524999976) 574361578.1 ns (5622885.751250029) 0.99
tpch_q1/vortex-in-memory-pushdown 456976591.1 ns (1155251.5350000262) 459968376.5 ns (2020408.1499999762) 0.99
tpch_q1/arrow 551693907.5 ns (2987368.699999988) 563784254 ns (3500541.707499981) 0.98
tpch_q1/parquet 681544631.9 ns (2068482.176249981) 704917910.8 ns (2165428.9162499905) 0.97
tpch_q1/vortex-file-compressed 499517228.45 ns (2506704.849999994) 506475189.1 ns (3576931.401250005) 0.99
tpch_q1/vortex-file-uncompressed 534027200.8 ns (3166872.399999976) 542694070.1 ns (2846584.5950000286) 0.98
tpch_q2/vortex-in-memory-no-pushdown 122281316.53611112 ns (607309.8182638958) 121659389.99611111 ns (840562.5608333349) 1.01
tpch_q2/vortex-in-memory-pushdown 121043614.95559523 ns (936715.2001711205) 119727465.60218254 ns (632676.2686111033) 1.01
tpch_q2/arrow 118936803.0809524 ns (829720.4575595111) 119706675.13956349 ns (1062046.824217759) 0.99
tpch_q2/parquet 151111760.22539684 ns (668142.8923015893) 148885088.5601984 ns (1233637.8701497912) 1.01
tpch_q2/vortex-file-compressed 170416665.28214285 ns (717123.7554166317) 168934413.66555554 ns (3156637.121236101) 1.01
tpch_q2/vortex-file-uncompressed 172491017.76543647 ns (1207837.0901686102) 166282442.21039683 ns (1260544.6573809534) 1.04
tpch_q3/vortex-in-memory-no-pushdown 160440572.42011905 ns (1069617.1868660748) 160935109.1389286 ns (1025473.2068154514) 1.00
tpch_q3/vortex-in-memory-pushdown 181797146.57730156 ns (486776.29599802196) 182020502.53333336 ns (1500014.8666666597) 1.00
tpch_q3/arrow 150865327.26869047 ns (837692.2068005949) 149581345.32297617 ns (758954.842392847) 1.01
tpch_q3/parquet 339214792.3 ns (1541616.537499994) 339499678.35 ns (2151097.020624995) 1.00
tpch_q3/vortex-file-compressed 287088875.4 ns (1147002.0387500226) 288143112.45 ns (1284942.5150000155) 1.00
tpch_q3/vortex-file-uncompressed 264964945.25 ns (1512679.9143749923) 265749740.65 ns (1108326.7493750006) 1.00
tpch_q4/vortex-in-memory-no-pushdown 112593642.78765874 ns (467070.4751001969) 112512766.10063493 ns (365043.8728551492) 1.00
tpch_q4/vortex-in-memory-pushdown 131614885.10400793 ns (593277.0347222239) 131687725.1663492 ns (455131.4506349191) 1.00
tpch_q4/arrow 185133968.73333335 ns (869013.0708333105) 184952503.06666666 ns (664425.7166666687) 1.00
tpch_q4/parquet 206748005.53333333 ns (804639.3879166543) 207004328.06666666 ns (999671.7870833278) 1.00
tpch_q4/vortex-file-compressed 241378729.1 ns (1232588.4166666567) 241718447.9 ns (938776.0666666776) 1.00
tpch_q4/vortex-file-uncompressed 242132635.3333333 ns (595184.693749994) 241201233.83333334 ns (1285668.5833333284) 1.00
tpch_q5/vortex-in-memory-no-pushdown 287465691.15 ns (1479891.543749988) 295978497.7 ns (3031694.881250024) 0.97
tpch_q5/vortex-in-memory-pushdown 297549457.25 ns (1817421.8799999952) 289372781.3 ns (1788862.0243749619) 1.03
tpch_q5/arrow 278143276.25 ns (3163279.5) 267714411.95 ns (1554802.5368750244) 1.04
tpch_q5/parquet 451074110.75 ns (2605018.1799999774) 437866959 ns (2577767.7962500155) 1.03
tpch_q5/vortex-file-compressed 374597269.15 ns (3064142.276249975) 377990232.55 ns (2545604.7274999917) 0.99
tpch_q5/vortex-file-uncompressed 359918470.35 ns (2305573.2931250036) 361029988.45 ns (3752365.410624981) 1.00
tpch_q6/vortex-in-memory-no-pushdown 35855022.51201058 ns (253026.55263227224) 35998164.820978835 ns (298991.6685003303) 1.00
tpch_q6/vortex-in-memory-pushdown 73117591.10382935 ns (275439.82165750116) 77151348.0332738 ns (295305.74695685506) 0.95
tpch_q6/arrow 26059058.524117064 ns (201720.40256200172) 25939547.356140874 ns (146538.06640438922) 1.00
tpch_q6/parquet 137304192.33123016 ns (339765.6497742981) 138906406.16015872 ns (428826.2976716161) 0.99
tpch_q6/vortex-file-compressed 21496947.703896824 ns (368164.05908343196) 21139692.745357145 ns (434169.84309880994) 1.02
tpch_q6/vortex-file-uncompressed 248271037.63333336 ns (1017184.6212499887) 249378556.1 ns (426915.3724999726) 1.00
tpch_q7/vortex-in-memory-no-pushdown 564622218.9 ns (5388302.300000012) 552228669.6 ns (7218375.121249974) 1.02
tpch_q7/vortex-in-memory-pushdown 595864809.7 ns (2505168.94750005) 592532197.4 ns (5969838.256249964) 1.01
tpch_q7/arrow 553048134.4 ns (3928175.038749993) 541493343 ns (3615244.023750007) 1.02
tpch_q7/parquet 693581655.7 ns (3657005.7400000095) 667079309.8 ns (3727451.0412499905) 1.04
tpch_q7/vortex-file-compressed 703242848.9 ns (4435938.02124995) 691380754.9 ns (6904383.977500021) 1.02
tpch_q7/vortex-file-uncompressed 696755879 ns (5403891.232499957) 698855510.3 ns (8896098.501250029) 1.00
tpch_q8/vortex-in-memory-no-pushdown 225008107.3333333 ns (1153566.9166666418) 224058723.5333333 ns (1215247.1887500435) 1.00
tpch_q8/vortex-in-memory-pushdown 231293401.9 ns (1777503.893750012) 229571543.36666664 ns (1582875.0737500042) 1.01
tpch_q8/arrow 210057773.46666664 ns (1226841.9708333313) 207849240.03333333 ns (729107.2754166573) 1.01
tpch_q8/parquet 491742119.75 ns (1982989.9675000012) 490924484.6 ns (4300299.681249976) 1.00
tpch_q8/vortex-file-compressed 314379916.1 ns (2701277.950000018) 315710088.35 ns (2391013.9681250155) 1.00
tpch_q8/vortex-file-uncompressed 306785013.55 ns (2616026.223124981) 303766104.65 ns (1714386.7762500048) 1.01
tpch_q9/vortex-in-memory-no-pushdown 430440493.65 ns (2322900.2975000143) 433660361.3 ns (3663016.5881249905) 0.99
tpch_q9/vortex-in-memory-pushdown 434552660.5 ns (3431309.0837500095) 415647445.15 ns (2735231.0068750083) 1.05
tpch_q9/arrow 411017734.4 ns (3078907.399999976) 405977315.25 ns (3310920.1318749785) 1.01
tpch_q9/parquet 708286784.6 ns (4804273.017499983) 706439860.6 ns (6133038.657499969) 1.00
tpch_q9/vortex-file-compressed 499150350.2 ns (3556810.099999994) 505073167.3 ns (6276582.548749983) 0.99
tpch_q9/vortex-file-uncompressed 492764570.7 ns (4219876.644999981) 475427983.35 ns (5717167.263125002) 1.04
tpch_q10/vortex-in-memory-no-pushdown 285945615.9 ns (733932.6024999917) 277425460.75 ns (1317243.4118750095) 1.03
tpch_q10/vortex-in-memory-pushdown 310631688.15 ns (1749388.9281250238) 311973734.05 ns (2075785.7400000095) 1.00
tpch_q10/arrow 271000244.2 ns (1907833.1706250012) 272938001.65 ns (986864.7768750191) 0.99
tpch_q10/parquet 506814169 ns (1318825.650000006) 500908784.2 ns (3076597.6575000286) 1.01
tpch_q10/vortex-file-compressed 432070965 ns (2301499.375) 436540415.35 ns (2996633.395625025) 0.99
tpch_q10/vortex-file-uncompressed 418659059.4 ns (2711654.1399999857) 428627481.3 ns (2454628.625) 0.98
tpch_q11/vortex-in-memory-no-pushdown 179607308.1249603 ns (1026834.3665659875) 183548403.7 ns (925194.5399999917) 0.98
tpch_q11/vortex-in-memory-pushdown 182327467.5920635 ns (846423.6399335265) 180569145.1593651 ns (810408.1831745952) 1.01
tpch_q11/arrow 177468259.5144841 ns (1032390.9861984253) 180027770.02031744 ns (1602949.2837996036) 0.99
tpch_q11/parquet 185628692.9 ns (1075399.3570833206) 186298390.8333333 ns (760659.7333333343) 1.00
tpch_q11/vortex-file-compressed 272192602.2 ns (2060503.107499987) 276707604.7 ns (3756048.083124995) 0.98
tpch_q11/vortex-file-uncompressed 277711675.1 ns (1719847.349999994) 265437908.75 ns (3004114.2162500024) 1.05
tpch_q12/vortex-in-memory-no-pushdown 232759836.03333336 ns (788651.5041666776) 231280566.43333334 ns (416894.46666668355) 1.01
tpch_q12/vortex-in-memory-pushdown 257650345.6 ns (622362.8831250072) 254998240.85 ns (1180738.3681250066) 1.01
tpch_q12/arrow 189447297.56666666 ns (327987.960833326) 187828197.40000004 ns (430588.7025000453) 1.01
tpch_q12/parquet 338843362.9 ns (590011.4499999881) 339626687.95 ns (1730623.5543749928) 1.00
tpch_q12/vortex-file-compressed 451356952.45 ns (1216617.1168750226) 457245040.1 ns (1606463.3406250179) 0.99
tpch_q12/vortex-file-uncompressed 433546676.95 ns (1978813.875) 437019300.4 ns (1497133.6774999797) 0.99
tpch_q13/vortex-in-memory-no-pushdown 186803620.86666667 ns (1274086.8341666907) 173278002.95650792 ns (1095501.4760843366) 1.08
tpch_q13/vortex-in-memory-pushdown 185296298.39999998 ns (2433989.443333328) 181786160.27801588 ns (3767920.4386755973) 1.02
tpch_q13/arrow 180432462.54579365 ns (2786111.0559136868) 174765746.5049603 ns (5323639.53160961) 1.03
tpch_q13/parquet 344414563.45 ns (3558820.578125) 334920347.3 ns (5053259.73999998) 1.03
tpch_q13/vortex-file-compressed 204226188.33333334 ns (1058656.2333333343) 199797977.9333333 ns (1091901.750000015) 1.02
tpch_q13/vortex-file-uncompressed 206524959.0666667 ns (1157895.9666666687) 200606026.6 ns (1719196.1225000024) 1.03
tpch_q14/vortex-in-memory-no-pushdown 46747735.396547616 ns (230161.26035713777) 47860372.74263889 ns (436939.2135763839) 0.98
tpch_q14/vortex-in-memory-pushdown 79327240.76744047 ns (568413.1743749976) 79330758.58674604 ns (367643.1660118997) 1.00
tpch_q14/arrow 37622088.71236773 ns (363284.6837868765) 36020440.593029104 ns (653681.8256329373) 1.04
tpch_q14/parquet 228897546.1 ns (521992.8666666597) 228174754.53333336 ns (746933.5833333433) 1.00
tpch_q14/vortex-file-compressed 127370742.2265873 ns (473154.99803274125) 129726626.0345238 ns (578917.0980952382) 0.98
tpch_q14/vortex-file-uncompressed 143142881.01440474 ns (513786.8183154613) 145648511.60702384 ns (462011.50912053883) 0.98
tpch_q15/vortex-in-memory-no-pushdown 73387896.1665873 ns (360092.80354017764) 74593990.39625 ns (442379.61985415965) 0.98
tpch_q15/vortex-in-memory-pushdown 112382521.76261906 ns (412698.31166666) 111960424.62551586 ns (582171.6678338349) 1.00
tpch_q15/arrow 60365556.53448413 ns (531667.4551795647) 59727038.545079365 ns (972077.3337648809) 1.01
tpch_q15/parquet 302398921.5 ns (994594.6318750083) 303022773 ns (1696577.1206249595) 1.00
tpch_q15/vortex-file-compressed 254647806.95 ns (1026029.6850000024) 252642311.6 ns (1864722.006249994) 1.01
tpch_q15/vortex-file-uncompressed 283011361.25 ns (1218293.6474999785) 286325306.5 ns (1364904.1837500036) 0.99
tpch_q16/vortex-in-memory-no-pushdown 108964131.40940475 ns (277122.76583333313) 106906735.07083334 ns (435355.9698958397) 1.02
tpch_q16/vortex-in-memory-pushdown 120945824.35718255 ns (523613.29338245094) 118493640.3970635 ns (384740.54855655134) 1.02
tpch_q16/arrow 107899463.80178571 ns (432340.4597455263) 106326948.2149603 ns (855174.9965248033) 1.01
tpch_q16/parquet 117432747.46341269 ns (659859.7658958435) 118702630.45321426 ns (408275.56158928573) 0.99
tpch_q16/vortex-file-compressed 134307340.67682537 ns (528757.7740952373) 132279480.72805557 ns (619451.4893263802) 1.02
tpch_q16/vortex-file-uncompressed 133579534.6126984 ns (642371.550573416) 131386755.42321427 ns (657566.4219181389) 1.02
tpch_q17/vortex-in-memory-no-pushdown 586050915 ns (10821464.748750031) 562625959.6 ns (3536085.4137499332) 1.04
tpch_q17/vortex-in-memory-pushdown 666088673.5 ns (9629336.639999986) 648358421 ns (9541726.36499995) 1.03
tpch_q17/arrow 570467600.5 ns (10647766.875) 542208296.9 ns (3441288.1575000286) 1.05
tpch_q17/parquet 648224491.5 ns (3374479.800000012) 634489370.6 ns (4873122.673749983) 1.02
tpch_q17/vortex-file-compressed 637727943.3 ns (5207259.105000019) 620067086 ns (6032196.686249971) 1.03
tpch_q17/vortex-file-uncompressed 622974844.6 ns (5122311.576250017) 610907585.5 ns (4721440.75) 1.02
tpch_q18/vortex-in-memory-no-pushdown 1132408825.8 ns (9835910.1450001) 1099050399.2 ns (17677134.88625002) 1.03
tpch_q18/vortex-in-memory-pushdown 1126791000.9 ns (6160792.598749876) 1118002223.1 ns (19961671.400000095) 1.01
tpch_q18/arrow 1107576986.4 ns (5957920.200000048) 1074333527.3 ns (9292122.199999928) 1.03
tpch_q18/parquet 1266603492.4 ns (10686931.626250029) 1244843080.6 ns (16390404.422499895) 1.02
tpch_q18/vortex-file-compressed 1153108031.6 ns (6842825.995000005) 1170654160.1 ns (12660714.293749928) 0.99
tpch_q18/vortex-file-uncompressed 1111135510.5 ns (7267299.383750081) 1117079941.1 ns (6960608.799999952) 0.99
tpch_q19/vortex-in-memory-no-pushdown 182513410.13333336 ns (486325.16499999166) 183281494.96666667 ns (252045.16916666925) 1.00
tpch_q19/vortex-in-memory-pushdown 252544686.3 ns (582257.0737500042) 262322524.1 ns (290778.85437500477) 0.96
tpch_q19/arrow 165855905.47702378 ns (207376.51890178025) 165468420.94373018 ns (395616.4234136939) 1.00
tpch_q19/parquet 454891493.25 ns (1375424.5831249952) 459746212.9 ns (1338008.6599999964) 0.99
tpch_q19/vortex-file-compressed 446835043.2 ns (1657442.8543750048) 444613783.25 ns (2523681.474999994) 1.00
tpch_q19/vortex-file-uncompressed 411578873.65 ns (3099811.474999994) 423751195.1 ns (2427865.099999994) 0.97
tpch_q20/vortex-in-memory-no-pushdown 254501267.15 ns (2775789.9337500036) 245146419.70000005 ns (839579.0504166633) 1.04
tpch_q20/vortex-in-memory-pushdown 281104763.5 ns (2293936.8149999976) 265786013 ns (1785792.674999997) 1.06
tpch_q20/arrow 250694433.9 ns (2415039.3837499917) 238426390.3666667 ns (4754672.725833356) 1.05
tpch_q20/parquet 367289748.8 ns (2075974.75) 369186545.7 ns (1218413.8618749678) 0.99
tpch_q20/vortex-file-compressed 381105218.2 ns (2091667.4449999928) 379941725.55 ns (4494982.028750002) 1.00
tpch_q20/vortex-file-uncompressed 395735049.5 ns (2116409.974999994) 384154608.5 ns (2534459.175625026) 1.03
tpch_q21/vortex-in-memory-no-pushdown 914851244.2 ns (4203575.792500019) 899542382.5 ns (5328370.699999988) 1.02
tpch_q21/vortex-in-memory-pushdown 934063321.7 ns (2866814.0325000286) 936224434.5 ns (6092560.061249971) 1.00
tpch_q21/arrow 886048755.6 ns (3169576.4775000215) 871368845.8 ns (4011212.1375000477) 1.02
tpch_q21/parquet 1000981677.5 ns (4298167.388750017) 1004554618.1 ns (4418399.949999988) 1.00
tpch_q21/vortex-file-compressed 1195771381.7 ns (5948734.326250076) 1191474317.6 ns (9930319.694999933) 1.00
tpch_q21/vortex-file-uncompressed 1159207229.5 ns (5752711.350000024) 1159036764.5 ns (6485052.5) 1.00
tpch_q22/vortex-in-memory-no-pushdown 77271613.46597221 ns (211486.2487135455) 77333168.21593253 ns (159026.79471205175) 1.00
tpch_q22/vortex-in-memory-pushdown 76546629.31882937 ns (213508.19130828232) 77246183.81472223 ns (277057.03086805344) 0.99
tpch_q22/arrow 75755534.93194444 ns (112995.46679166704) 75597912.80906746 ns (114875.81529761106) 1.00
tpch_q22/parquet 95124769.1845238 ns (300937.2196428552) 93871933.01916666 ns (432988.05366666615) 1.01
tpch_q22/vortex-file-compressed 119398404.13000003 ns (311719.5938749984) 120708992.99452381 ns (885822.8922559619) 0.99
tpch_q22/vortex-file-uncompressed 117412719.05690475 ns (629808.9134672508) 118225931.52329366 ns (460137.30935516953) 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@lwwmanning
Member Author

@robert3005 I'm conflicted: should I merge this now, or bundle it with your forthcoming stats pruning work?

@robert3005
Member

@danking is working on stats pruning. I think you can merge it and work on stats computation improvements.

@lwwmanning
Member Author

Here are the compression benchmarks, filtered to compress time and compression ratio.

TLDR: compress time goes up by 6-50% across the board (median increase of ~30%). File size and decompress time are essentially unchanged.

| Benchmark suite | Current: 31513b9 | Previous: c1cee33 | Ratio |
|---|---|---|---|
| compress time/taxi | 1745531046.5 ns (2489174.600000024) | 1295362143.3 ns (2166932.399999976) | 1.35 |
| compress time/AirlineSentiment | 876210.2091746794 ns (1997.4042628205498) | 824226.5720406121 ns (877.1890923065948) | 1.06 |
| compress time/Arade | 3598219792.4 ns (5449954.549999952) | 2429903296.7 ns (4382143.078749895) | 1.48 |
| compress time/Bimbo | 16653761368 ns (54449165.24374962) | 11793601990.3 ns (13185647.921250343) | 1.41 |
| compress time/CMSprovider | 18126148716.4 ns (47354529.64999962) | 14154839403.5 ns (16396633.357499123) | 1.28 |
| compress time/Euro2016 | 3149746758.7 ns (7328379.622499943) | 2772666256.1 ns (2803495.6124999523) | 1.14 |
| compress time/Food | 1601089528.1 ns (2953086.517500043) | 1214405935.4 ns (3384823.1050001383) | 1.32 |
| compress time/HashTags | 3043314601.8 ns (4514786.694999933) | 2718792728.4 ns (2054576.100000143) | 1.12 |
| compress time/TPC-H l_comment chunked without fsst | 4231990185 ns (24612574.023750067) | 3774158295.2 ns (7205992.347500086) | 1.12 |
| compress time/TPC-H l_comment chunked | 1340328374.6 ns (5812067.313750029) | 898236917.7 ns (1253379.8500000238) | 1.49 |
| compress time/TPC-H l_comment canonical | 1341216913.5 ns (1761867.0193749666) | 894183675.5 ns (867472.6331250072) | 1.50 |
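
For anyone who wants to re-derive the TLDR numbers, here is a minimal sketch in Python that summarizes the Ratio column of the compress-time rows above. It assumes Ratio means current divided by previous. The variable and function names are made up for illustration and are not part of the benchmark tooling.

```python
from statistics import median

# Ratio column (assumed to be current / previous) for the
# "compress time" rows in the table above.
compress_time_ratio = {
    "taxi": 1.35,
    "AirlineSentiment": 1.06,
    "Arade": 1.48,
    "Bimbo": 1.41,
    "CMSprovider": 1.28,
    "Euro2016": 1.14,
    "Food": 1.32,
    "HashTags": 1.12,
    "TPC-H l_comment chunked without fsst": 1.12,
    "TPC-H l_comment chunked": 1.49,
    "TPC-H l_comment canonical": 1.50,
}


def pct_increase(ratio: float) -> float:
    """Convert a current/previous ratio into a percentage increase."""
    return round((ratio - 1.0) * 100.0, 1)


increases = {name: pct_increase(r) for name, r in compress_time_ratio.items()}
smallest = min(increases.values())
largest = max(increases.values())
middle = median(increases.values())

print(f"compress time increase: min {smallest}%, median {middle}%, max {largest}%")
```

The smallest increase comes from AirlineSentiment, whose input is only 2020 bytes, so it may be less representative than the larger datasets.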

@lwwmanning merged commit b05207e into develop on Nov 8, 2024
16 checks passed
@lwwmanning deleted the wm/force-pruning-stats branch on November 8, 2024, 15:37