[EPIC] JIT support for `DataFusion` #2703

**Summary**
TL;DR: The key focus of this work is to speed up fundamentally row-oriented operations such as hash table lookups and tuple comparisons (e.g. [#2427](https://github.com/apache/arrow-datafusion/issues/2427)).

**Background**

DataFusion, like many Arrow systems, is a classic "vectorized computation engine", which works quite well for many common operations. The following paper gives a good treatment of the tradeoffs between vectorization and JIT compilation of query plans: https://db.in.tum.de/~kersten/vectorization_vs_compilation.pdf?lang=de
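To make "vectorized" concrete, here is a minimal sketch of a column-at-a-time kernel. It uses plain `Vec`s rather than Arrow arrays, and the `Column` enum and `add_columns` function are hypothetical names chosen for illustration, not DataFusion or Arrow APIs. The point it shows is that the type dispatch happens once per batch, so the inner loop carries no per-row type checks.

```rust
/// A column of values; the enum variant acts as the type tag for the whole column.
/// (Hypothetical stand-in for an Arrow array.)
#[derive(Debug, PartialEq)]
enum Column {
    Int64(Vec<i64>),
    Float64(Vec<f64>),
}

/// Vectorized `a + b`.
/// The `match` on the column types runs once per batch; each arm is then a
/// tight loop over primitive values with no type checks per row.
/// Returns `None` when the types or lengths do not match.
fn add_columns(a: &Column, b: &Column) -> Option<Column> {
    match (a, b) {
        (Column::Int64(x), Column::Int64(y)) if x.len() == y.len() => Some(Column::Int64(
            x.iter().zip(y).map(|(l, r)| l.wrapping_add(*r)).collect(),
        )),
        (Column::Float64(x), Column::Float64(y)) if x.len() == y.len() => Some(Column::Float64(
            x.iter().zip(y).map(|(l, r)| l + r).collect(),
        )),
        _ => None,
    }
}
```

Calling `add_columns` on two `Int64` columns of equal length yields a new `Int64` column; the cost of interpreting the types is amortized over every row in the batch, which is why this style works well for simple expressions.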

As mentioned in the paper, some fundamentally "row-oriented" operations in a database are not typically amenable to vectorization. The "classics" are hash table updates in joins and hash aggregates, as well as comparing tuples in sort.
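The tuple comparison case can be sketched as follows. This is an illustration only, assuming plain `Vec` columns instead of Arrow arrays; `SortColumn`, `compare_rows`, and `compare_rows_specialized` are hypothetical names, and the "specialized" function is hand-written to suggest what generated code for a fixed `(Int64, Utf8)` sort key might look like, not the output of the `datafusion/jit` crate.

```rust
use std::cmp::Ordering;

/// One sort-key column (hypothetical stand-in for an Arrow array).
enum SortColumn {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

/// Interpreted comparison of row `i` with row `j` across all sort columns.
/// The `match` on the column type runs for every column of every comparison,
/// so the dispatch cost is paid per row pair rather than once per batch.
fn compare_rows(cols: &[SortColumn], i: usize, j: usize) -> Ordering {
    for col in cols {
        let ord = match col {
            SortColumn::Int64(v) => v[i].cmp(&v[j]),
            SortColumn::Utf8(v) => v[i].cmp(&v[j]),
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

/// Comparison specialized for the fixed schema (Int64, Utf8).
/// With the schema known up front, no type dispatch remains in the hot path.
fn compare_rows_specialized(a: &[i64], b: &[String], i: usize, j: usize) -> Ordering {
    a[i].cmp(&a[j]).then_with(|| b[i].cmp(&b[j]))
}
```

Either function can drive a sort over row indices, for example `idx.sort_by(|&i, &j| compare_rows(&cols, i, j))`. Both produce the same ordering; the difference is that the interpreted version re-examines the column types on every call, which is the overhead that compiling a comparator for a known schema aims to remove.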

Another example can be found in [these slides](https://github.com/apache/arrow-datafusion/files/8843957/Expression_evaluation.pdf) from [this presentation](https://docs.google.com/presentation/d/1owNlmpNpC2-eBd-jEYRCt0L8_sW4PWYx70gqnFC0twM/edit#slide=id.gba4e8a221c_0_125).

@yjshen added initial support for JIT'ing in https://github.com/apache/arrow-datafusion/pull/1849, and it currently lives in https://github.com/apache/arrow-datafusion/tree/master/datafusion/jit. He also added partial support for aggregates in https://github.com/apache/arrow-datafusion/pull/2375.

This ticket aims to be a central location for tracking the status of JIT-compiling expressions, for anyone who wants to contribute to this effort.

**Describe the solution you'd like**
- [ ] 
- [ ] https://github.com/apache/arrow-datafusion/pull/2587
- [x] https://github.com/apache/arrow-datafusion/issues/2122
