Arrow Compute

# Background material
In March of 2021, when major work on the C++ query execution machinery
in Arrow was beginning, Wes sent a message [1] to the dev list and
linked to a doc [2] with some details about the planned design. A few
months later, Neal sent an update [3] about this work. However, those
documents are now somewhat out of date. More recently, Wes shared
another update [4] and linked to a doc [5] regarding task execution /
control flow / scheduling. However, I think the best source of
information is the doc you linked to. The query execution work has
proceeded organically with many contributors, and efforts to document
the overall design in sufficient detail have not kept pace.

- [1] https://lists.apache.org/thread/n632pmjnb85o49lyxy45f7sgh4cshoc0
- [2] https://docs.google.com/document/d/1AyTdLU-RxA-Gsb9EsYnrQrmqPMOYMfPlWwxRi1Is1tQ/
- [3] https://lists.apache.org/thread/3pmb592zmonz86nmmbjcw08j5tcrfzm1
- [4] https://lists.apache.org/thread/ltllzpt1r2ch06mv1ngfgdl7wv2tm8xc
- [5] https://docs.google.com/document/d/1216CUQZ7u4acZvC2jX7juqqQCXtdXMellk3lRrgP_WY/
- [6] https://conbench.ursa.dev/
- [7] https://lists.apache.org/thread/7v7vkc005v9343n49b3shvrdn19wdpj1

# Execution model
- Some query engines use a "pull"-based model, in which the data flow is inverted relative to a "push"-based model. There are pros and cons to both approaches; see [Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask](https://www.vldb.org/pvldb/vol11/p2209-kersten.pdf).
- "My personal feeling is that the pull model was good in the early query execution engines, based on processing of a single row at a time and using virtual function calls to switch between relational operators within the query. In my experience, the push model is easier to work with in both modern worlds of query execution: JIT compiled query processing and vectorized query processing." - https://github.com/apache/arrow/pull/9621
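
To make the push/pull distinction above concrete, here is a minimal sketch of the same scan-then-filter pipeline written both ways. It is a toy illustration only: the names (`pull_scan`, `PushFilter`, `PushSink`, and so on) are hypothetical and are not Arrow's API, and a "batch" is just a Python list of integers standing in for a columnar record batch.

```python
from typing import Callable, Iterator, List

Batch = List[int]  # stand-in for a columnar record batch (assumption for this sketch)


def is_even(value: int) -> bool:
    return value % 2 == 0


# Pull model: the consumer drives execution. Each operator asks its
# child for the next batch, so control flows from the sink to the source.
def pull_scan(batches: List[Batch]) -> Iterator[Batch]:
    for batch in batches:
        yield batch


def pull_filter(child: Iterator[Batch], pred: Callable[[int], bool]) -> Iterator[Batch]:
    for batch in child:
        yield [v for v in batch if pred(v)]


# Push model: the producer drives execution. Each operator hands its
# output to its downstream consumer, so control flows from source to sink.
class PushSink:
    def __init__(self) -> None:
        self.received: List[Batch] = []

    def consume(self, batch: Batch) -> None:
        self.received.append(batch)


class PushFilter:
    def __init__(self, pred: Callable[[int], bool], downstream: PushSink) -> None:
        self.pred = pred
        self.downstream = downstream

    def consume(self, batch: Batch) -> None:
        self.downstream.consume([v for v in batch if self.pred(v)])


def push_scan(batches: List[Batch], downstream: PushFilter) -> None:
    for batch in batches:
        downstream.consume(batch)


def run_pull(batches: List[Batch]) -> List[Batch]:
    # The sink (list) repeatedly pulls from the filter, which pulls from the scan.
    return list(pull_filter(pull_scan(batches), is_even))


def run_push(batches: List[Batch]) -> List[Batch]:
    # The scan pushes each batch into the filter, which pushes into the sink.
    sink = PushSink()
    push_scan(batches, PushFilter(is_even, sink))
    return sink.received
```

Both functions produce the same batches for the same input; the difference is only in which side of the pipeline holds the loop. In the pull version the loop lives in the consumer, while in the push version it lives in the source, which is the inversion of data flow the first bullet refers to.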


