Fengttt agg skl #23459
24db55e to 03cfc2d
	return exec.batchFillArgs(offset, groups, vectors, true)
}

for i, grp := range groups {
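You could check vectors[0].hasNull first and split the loop into two branches. That may be a bit faster for inputs that contain no null values.
if vectors[0].hasNull {
	// current code
} else {
	// current code, without the check on line 125
}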
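The suggestion above can be sketched as follows. This is only an illustration under stated assumptions: `column` and `fillSums` are hypothetical stand-ins, not the matrixone vector type or the PR's fill function, and the per-row null mask is modeled as a plain `[]bool`.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a vector with a null mask.
// It is not the matrixone vector type.
type column struct {
	vals    []int64
	nulls   []bool // nulls[i] reports whether row i is NULL; may be nil when hasNull is false
	hasNull bool   // true if at least one row is NULL
}

// fillSums adds each row's value to the running sum of its group.
// groups[i] is the group index of row i.
func fillSums(col column, groups []int, sums []int64) {
	if col.hasNull {
		// Slow path: every row pays for a null check.
		for i, grp := range groups {
			if col.nulls[i] {
				continue
			}
			sums[grp] += col.vals[i]
		}
		return
	}
	// Fast path: the column has no NULLs, so the per-row check is dropped.
	for i, grp := range groups {
		sums[grp] += col.vals[i]
	}
}

func main() {
	col := column{
		vals:    []int64{1, 2, 3, 4},
		nulls:   []bool{false, true, false, false},
		hasNull: true,
	}
	sums := make([]int64, 2)
	fillSums(col, []int{0, 0, 1, 1}, sums)
	fmt.Println(sums) // prints [1 7]
}
```

Whether the second branch is measurably faster depends on the data and on how the compiler handles the loop, so a benchmark on representative inputs would be needed before relying on it.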
### **User description**
## What type of PR is this?

- [ ] API-change
- [ ] BUG
- [ ] Improvement
- [ ] Documentation
- [x] Feature
- [ ] Test and CI
- [x] Code Refactoring

## Which issue(s) this PR fixes:

issue matrixorigin#22761 Move agg state from AoS to SoA

## What this PR does / why we need it:

This supersedes matrixorigin#23420

___

### **PR Type**
Enhancement, Bug fix

___

### **Description**
- Refactor aggregation state storage from Array-of-Structures to Structure-of-Arrays (SoA) layout
- Implement skiplist-based argument storage for distinct aggregations
- Reimplement sum and avg functions using new aggregation framework
- Add support for decimal types in sum/avg operations
- Fix memory management and spill handling in group operations

___

### Diagram Walkthrough

```mermaid
flowchart LR
  A["Old AoS State<br/>Generic Type S"] -->|Refactor| B["New SoA State<br/>Vector-based"]
  B -->|For Distinct| C["Skiplist Arena<br/>Argument Storage"]
  B -->|For Non-Distinct| D["Direct Vectors<br/>Count + Sum"]
  E["Sum/Avg Functions"] -->|Use| B
  E -->|Support| F["Int/Float/Decimal<br/>Types"]
```

<details><summary><h3>File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td><strong>Enhancement</strong></td><td><details><summary>12 files</summary><table> <tr> <td><strong>aggState.go</strong><dd><code>Refactor aggregation state to SoA with skiplist storage</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-e023f6352c901853a62c98c252e34c0f31b14a20d29c620667370cbea5edcf83">+544/-381</a></td> </tr> <tr> <td><strong>sumavg2.go</strong><dd><code>Implement sum and avg using new framework</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-f4961ee7f272980b25855019d81061f270cfaf62eb3b2799ee999660d4ac8956">+487/-0</a> </td> </tr> <tr> <td><strong>count2.go</strong><dd><code>Update count functions for new state layout</code> </dd></td> 
<td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-bbf5f9f87fed8eaa2f14ee010f2ef24c192c566c8a1cc0b11bee222df45b20b3">+37/-46</a> </td> </tr> <tr> <td><strong>types.go</strong><dd><code>Register sum and avg in special aggregations</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-cd5af867783cc19c2659aad9c2b6a24e570619d5e9cb5d6292f46baa6cabb9a2">+4/-0</a> </td> </tr> <tr> <td><strong>register.go</strong><dd><code>Add registration functions for sum and avg</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-040b6aac95df3c18b9f78c29a0deeaf3d8643afff751d411ebd2d4c9fb833960">+12/-0</a> </td> </tr> <tr> <td><strong>encoding.go</strong><dd><code>Add int16/uint16/uint32 read/write helpers</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-cce6203321c48e45c767e40b599732cbd11ff38692bdb5cb385ce3ca27ed51e6">+61/-16</a> </td> </tr> <tr> <td><strong>decimal.go</strong><dd><code>Add decimal conversion helper functions</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-8d5b2690a1f9d8ca0a5ff6c1b48195a2fb10a30db49e3f5f04e528ef2a33a463">+21/-1</a> </td> </tr> <tr> <td><strong>vector.go</strong><dd><code>Add null append and improve vector free handling</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-28ac70822524921592389604c9d3caa2e7c488a52b02d22fe9bdc35f8d808d5f">+10/-1</a> </td> </tr> <tr> <td><strong>ptrlen.go</strong><dd><code>Add Replace and Free methods to PtrLen</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-59a36f9022fac34af758ef88ba394a74fd83f1cada7f4b93a095f1772da81ca8">+34/-0</a> </td> </tr> <tr> <td><strong>unsafe.go</strong><dd><code>Add UnsafeFromBytes helper function</code> </dd></td> <td><a 
href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-f10eaba63e271ae6d287840b81c933ab92104079b0e9ad39a6d6bba9b459dbf3">+4/-0</a> </td> </tr> <tr> <td><strong>list_agg.go</strong><dd><code>Update sum/avg to use new framework functions</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-28b8c798a448a1a44bb9deb400c059ce8ca0a5b0cae0e9e1ee884b21ef14b619">+20/-6</a> </td> </tr> <tr> <td><strong>special.go</strong><dd><code>Add registration wrappers for sum and avg</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-d4f026626caba597d15eec009b379fbb52797eda110ee92c848e14dd48a1e475">+8/-0</a> </td> </tr> </table></details></td></tr><tr><td><strong>Bug fix</strong></td><td><details><summary>2 files</summary><table> <tr> <td><strong>types.go</strong><dd><code>Remove unused int32/uint32 conversion functions</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-960ca3f150354d99e2cd6d2230dfda1bcc5ba8daefdf467b812808250d3b965c">+6/-6</a> </td> </tr> <tr> <td><strong>helper.go</strong><dd><code>Fix spill memory release timing</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-c9d9cd6b008accd27ff56bb6a7b0860d64723c9b6c33f7ebf45f21dd61c8b235">+3/-4</a> </td> </tr> </table></details></td></tr><tr><td><strong>Tests</strong></td><td><details><summary>9 files</summary><table> <tr> <td><strong>mpool_test.go</strong><dd><code>Add test for PtrLen Replace functionality</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-01e2ed668061952031bd9d6a3ca32adcff3aad73e7abf9e7519d7e6992dad197">+26/-0</a> </td> </tr> <tr> <td><strong>count2_test.go</strong><dd><code>Add skiplist validation checks to tests</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-c23d1f8157b76e7c0f781c1dcecc5eddb17524740b00916db35604c9be015406">+4/-1</a> 
</td> </tr> <tr> <td><strong>time_window.result</strong><dd><code>Update test results for new aggregation behavior</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-42534753fd8e34e93bf23a7c96432fd69ca275ed2bb55d68905da57293247b2a">+23/-9</a> </td> </tr> <tr> <td><strong>window.result</strong><dd><code>Update window function test results</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-9bd21d8cf5814f8541e3b0948988bb345344ede5db13d5fbcdcba89cf0996079">+2/-12</a> </td> </tr> <tr> <td><strong>window.sql</strong><dd><code>Add issue markers to window tests</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-dd6bae38afaeafe5010141e4eceef3889bbccb3a511270a44212c06722586bb5">+3/-1</a> </td> </tr> <tr> <td><strong>boundary_comprehensive.result</strong><dd><code>Update overflow error message format</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-3d1ebd79d008a5273ef28530ebf8d070f63ecbba276628be1bb43e2cf49a5ae5">+2/-2</a> </td> </tr> <tr> <td><strong>func_datetime_utc_timestamp.result</strong><dd><code>Update timestamp calculation test result</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-e2ce94c306d9d720f29e8bffbac5e3bb0b08fdf39c1db61dca881197d9dfeee0">+1/-1</a> </td> </tr> <tr> <td><strong>table_func_metadata_scan.result</strong><dd><code>Update overflow error message format</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-eb18bed9d1adf22a836e1a632f15e1b2b7489bb76b95d4c973142d30998d2faa">+1/-1</a> </td> </tr> <tr> <td><strong>bit.result</strong><dd><code>Update bit type aggregation test results</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-aac004664cce0f0487d273c5f2eeb91620333d201ada4881e230d1be554d01d0">+2/-4</a> </td> </tr> 
</table></details></td></tr><tr><td><strong>Documentation</strong></td><td><details><summary>1 files</summary><table> <tr> <td><strong>repl.go</strong><dd><code>Update copyright year to 2026</code> </dd></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-e7edbd02d3935c2cb7c378e599ce2a8192db9fe2f07066db49426471fbc85a74">+1/-1</a> </td> </tr> </table></details></td></tr><tr><td><strong>Additional files</strong></td><td><details><summary>9 files</summary><table> <tr> <td><strong>encoding_test.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-54e208f98e518e8677c324bd2b45340d0eb22c094f438bb4c46c3e1448427b1f">+0/-9</a> </td> </tr> <tr> <td><strong>reuse.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-486fe1d28f24b49a1a1f523ebaf9b96b31e43961fe15c06beaacbb5164842a1a">+0/-75</a> </td> </tr> <tr> <td><strong>avg.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-1acef59bd6061fadf590a75425eff732f50bdd4748a99dbe883edb478841d6d2">+0/-275</a> </td> </tr> <tr> <td><strong>size_test.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-bdd9d4b504f9ca15a4c78e399bc74f6dee356eb8e7d4ab0c6a369d655d097bce">+0/-15</a> </td> </tr> <tr> <td><strong>sum.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-006b3e8b5288b8e8007a552406fb85fb0da54d2104b3a1e064c676600fe5c6ca">+0/-336</a> </td> </tr> <tr> <td><strong>sum_test.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-47067e8507b855d6614b2e5acefdfd35adacc24b70738469353783f2ef1b22ba">+0/-605</a> </td> </tr> <tr> <td><strong>types.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-d2f2ff6ee6ce156b1285bc12c1958736dcade0f5bd7db761fb78307e4394c369">+0/-4</a> </td> </tr> <tr> <td><strong>sum.go</strong></td> <td><a 
href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-fef97e5322d9e3c3bb796acb710fc88bc043337bec6b9232b73edabc3fb67a80">+0/-118</a> </td> </tr> <tr> <td><strong>sum_test.go</strong></td> <td><a href="https://github.com/matrixorigin/matrixone/pull/23459/files#diff-bb19b329d5ea525787b6029cfde0c26628fb35400dca4203d2b26d0a75582c6a">+0/-48</a> </td> </tr> </table></details></td></tr></tbody></table> </details> ___
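To make the AoS-to-SoA change in the description concrete, here is a minimal sketch of the two layouts for an `avg` state. All names (`avgStateAoS`, `avgStatesSoA`, `grow`, `add`, `result`) are hypothetical and chosen for illustration; they are not taken from `aggState.go` or `sumavg2.go`. The SoA variant keeps one slice per field, in line with the "Direct Vectors, Count + Sum" node in the diagram.

```go
package main

import "fmt"

// avgStateAoS is the Array-of-Structures shape: one struct per group,
// stored as []avgStateAoS. Shown only for contrast.
type avgStateAoS struct {
	sum   float64
	count int64
}

// avgStatesSoA is the Structure-of-Arrays shape: one slice per field,
// indexed by group. Hypothetical type for illustration.
type avgStatesSoA struct {
	sums   []float64
	counts []int64
}

// grow extends the state so that it covers n more groups.
func (s *avgStatesSoA) grow(n int) {
	s.sums = append(s.sums, make([]float64, n)...)
	s.counts = append(s.counts, make([]int64, n)...)
}

// add folds one non-null value into the state of group grp.
func (s *avgStatesSoA) add(grp int, v float64) {
	s.sums[grp] += v
	s.counts[grp]++
}

// result returns the average for grp; ok is false when the group saw no
// rows, which corresponds to a NULL result.
func (s *avgStatesSoA) result(grp int) (avg float64, ok bool) {
	if s.counts[grp] == 0 {
		return 0, false
	}
	return s.sums[grp] / float64(s.counts[grp]), true
}

func main() {
	var s avgStatesSoA
	s.grow(3)
	s.add(0, 1)
	s.add(0, 3)
	s.add(1, 10)
	avg, ok := s.result(0)
	fmt.Println(avg, ok) // prints 2 true
}
```

With the SoA shape, a pass that touches only one field (for example, incrementing counts) reads a single contiguous slice instead of striding over whole structs.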
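The description mentions skiplist-based argument storage for distinct aggregations. The sketch below shows the general idea only: each group keeps an ordered set of the arguments it has already seen, and a value contributes to the sum the first time it is inserted. The `skipSet` type, its fixed `maxLevel`, the seeded random source, and `sumDistinct` are all assumptions made for this example; the PR's arena-backed skiplist is not reproduced here.

```go
package main

import (
	"fmt"
	"math/rand"
)

const maxLevel = 8 // assumed cap on tower height for this sketch

type node struct {
	key  int64
	next [maxLevel]*node
}

// skipSet is a hypothetical ordered set of int64 keys backed by a skiplist.
type skipSet struct {
	head  node // sentinel; its key is never compared
	level int
	rng   *rand.Rand
}

func newSkipSet(seed int64) *skipSet {
	return &skipSet{level: 1, rng: rand.New(rand.NewSource(seed))}
}

// insert adds key to the set and reports whether it was not present before.
func (s *skipSet) insert(key int64) bool {
	var update [maxLevel]*node
	cur := &s.head
	// Walk down from the top level, recording the predecessor at each level.
	for i := s.level - 1; i >= 0; i-- {
		for cur.next[i] != nil && cur.next[i].key < key {
			cur = cur.next[i]
		}
		update[i] = cur
	}
	if n := cur.next[0]; n != nil && n.key == key {
		return false // already stored, so a distinct aggregate skips it
	}
	// Pick a tower height by repeated coin flips.
	lvl := 1
	for lvl < maxLevel && s.rng.Intn(2) == 0 {
		lvl++
	}
	for i := s.level; i < lvl; i++ {
		update[i] = &s.head
	}
	if lvl > s.level {
		s.level = lvl
	}
	n := &node{key: key}
	for i := 0; i < lvl; i++ {
		n.next[i] = update[i].next[i]
		update[i].next[i] = n
	}
	return true
}

// sumDistinct computes sum(distinct v) per group; groups[i] is row i's group.
func sumDistinct(groups []int, vals []int64, nGroups int) []int64 {
	sets := make([]*skipSet, nGroups)
	sums := make([]int64, nGroups)
	for i, g := range groups {
		if sets[g] == nil {
			sets[g] = newSkipSet(int64(g) + 1)
		}
		if sets[g].insert(vals[i]) {
			sums[g] += vals[i]
		}
	}
	return sums
}

func main() {
	fmt.Println(sumDistinct([]int{0, 1, 0, 0, 1}, []int64{5, 7, 5, 3, 7}, 2)) // prints [8 7]
}
```

The random tower heights affect only lookup cost, not which keys are accepted, so the result is the same for any seed.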