|  | 1 | +<!--- | 
|  | 2 | +  Licensed to the Apache Software Foundation (ASF) under one | 
|  | 3 | +  or more contributor license agreements.  See the NOTICE file | 
|  | 4 | +  distributed with this work for additional information | 
|  | 5 | +  regarding copyright ownership.  The ASF licenses this file | 
|  | 6 | +  to you under the Apache License, Version 2.0 (the | 
|  | 7 | +  "License"); you may not use this file except in compliance | 
|  | 8 | +  with the License.  You may obtain a copy of the License at | 
|  | 9 | + | 
|  | 10 | +    http://www.apache.org/licenses/LICENSE-2.0 | 
|  | 11 | + | 
|  | 12 | +  Unless required by applicable law or agreed to in writing, | 
|  | 13 | +  software distributed under the License is distributed on an | 
|  | 14 | +  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | 
|  | 15 | +  KIND, either express or implied.  See the License for the | 
|  | 16 | +  specific language governing permissions and limitations | 
|  | 17 | +  under the License. | 
|  | 18 | +--> | 
|  | 19 | + | 
|  | 20 | +## Apache Arrow C++ Compute Functions | 
|  | 21 | + | 
|  | 22 | +This submodule contains analytical functions that primarily process Arrow | 
|  | 23 | +columnar data; some functions can process scalar or Arrow-based array | 
|  | 24 | +inputs. These are intended for use inside query engines, data frame libraries, | 
|  | 25 | +etc. | 
|  | 26 | + | 
|  | 27 | +Many functions have SQL-like semantics in that they perform elementwise or | 
|  | 28 | +scalar operations on whole arrays at a time. Other functions are not SQL-like: | 
|  | 29 | +their results may have a different length than the inputs or may depend on | 
|  | 30 | +the order of the values. | 
|  | 31 | + | 
|  | 32 | +Some basic terminology: | 
|  | 33 | + | 
|  | 34 | +* We use the term "function" to refer to a particular general operation that may | 
|  | 35 | +  have many different implementations corresponding to different combinations | 
|  | 36 | +  of types or function behavior options. | 
|  | 37 | +* We call a specific implementation of a function a "kernel". When executing a | 
|  | 38 | +  function on inputs, we must first select a suitable kernel (kernel selection | 
|  | 39 | +  is called "dispatching") corresponding to the value types of the inputs. | 
|  | 40 | +* Functions along with their kernel implementations are collected in a | 
|  | 41 | +  "function registry". Given a function name and argument types, we can look up | 
|  | 42 | +  that function and dispatch to a compatible kernel. | 
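
The relationship between functions, kernels, dispatching, and the registry can be sketched in a few lines of standalone C++. This is a toy model under stated assumptions, not Arrow's actual API: `TypeId`, `Array`, `Kernel`, `Function`, `FunctionRegistry`, `AddImpl`, and `MakeRegistry` are hypothetical names invented for illustration, and a real kernel signature would carry far more information than a list of type ids.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Toy stand-ins for a type system and arrays (hypothetical, not Arrow's classes).
enum class TypeId { Int64, Float64 };
using Array = std::variant<std::vector<long long>, std::vector<double>>;

TypeId TypeOf(const Array& a) {
  return std::holds_alternative<std::vector<long long>>(a) ? TypeId::Int64
                                                           : TypeId::Float64;
}

// A "kernel": one concrete implementation for one combination of input types.
struct Kernel {
  std::vector<TypeId> signature;
  std::function<Array(const std::vector<Array>&)> exec;
};

// A "function": a named general operation that owns several kernels.
struct Function {
  std::string name;
  std::vector<Kernel> kernels;

  // "Dispatching": select the kernel whose signature matches the input types.
  const Kernel& Dispatch(const std::vector<TypeId>& types) const {
    for (const Kernel& k : kernels) {
      if (k.signature == types) return k;
    }
    throw std::invalid_argument("no kernel matches the input types of " + name);
  }
};

// A "function registry": looks functions up by name.
class FunctionRegistry {
 public:
  void Add(Function f) {
    std::string key = f.name;
    functions_.emplace(std::move(key), std::move(f));
  }

  const Function& Get(const std::string& name) const {
    auto it = functions_.find(name);
    if (it == functions_.end()) throw std::out_of_range("unknown function: " + name);
    return it->second;
  }

  // Look up the function, dispatch on the argument types, run the kernel.
  Array Call(const std::string& name, const std::vector<Array>& args) const {
    std::vector<TypeId> types;
    for (const Array& a : args) types.push_back(TypeOf(a));
    return Get(name).Dispatch(types).exec(args);
  }

 private:
  std::map<std::string, Function> functions_;
};

// One templated implementation, registered below as two separate kernels.
// Assumes both inputs have the same length.
template <typename T>
Array AddImpl(const std::vector<Array>& args) {
  const auto& lhs = std::get<std::vector<T>>(args[0]);
  const auto& rhs = std::get<std::vector<T>>(args[1]);
  std::vector<T> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) out[i] = lhs[i] + rhs[i];
  return out;
}

FunctionRegistry MakeRegistry() {
  Function add;
  add.name = "add";
  add.kernels.push_back(Kernel{{TypeId::Int64, TypeId::Int64}, AddImpl<long long>});
  add.kernels.push_back(Kernel{{TypeId::Float64, TypeId::Float64}, AddImpl<double>});

  FunctionRegistry registry;
  registry.Add(std::move(add));
  return registry;
}
```

With this sketch, `MakeRegistry().Call("add", {...})` runs the integer kernel for two integer arrays and the floating-point kernel for two floating-point arrays, while a mixed pair fails at dispatch because no kernel signature matches.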
|  | 43 | + | 
|  | 44 | +Types of functions: | 
|  | 45 | + | 
|  | 46 | +* Scalar functions: elementwise functions that perform scalar operations in a | 
|  | 47 | +  vectorized manner. These functions are generally valid in a SQL-like | 
|  | 48 | +  context. These are called "scalar" in that the functions executed consider | 
|  | 49 | +  each value in an array independently, and the output array or arrays have the | 
|  | 50 | +  same length as the input arrays. The result for each array cell is generally | 
|  | 51 | +  independent of its position in the array. | 
|  | 52 | +* Vector functions: functions whose result generally depends on the entire | 
|  | 53 | +  contents of the input arrays. These functions **are generally | 
|  | 54 | +  not valid** for SQL-like processing because the output size may differ | 
|  | 55 | +  from the input size, and the result may change based on the order of the | 
|  | 56 | +  values in the array. This includes things like array subselection, sorting, | 
|  | 57 | +  hashing, and more. | 
|  | 58 | +* Scalar aggregate functions, which can be used in a SQL-like context. | 
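
The difference between these categories can be illustrated with plain `std::vector` in place of Arrow arrays. This is a minimal sketch, not Arrow code: `Int64Array`, `Negate`, `Unique`, `Sorted`, and `Sum` are hypothetical names chosen for the example. Treating a sum as a scalar aggregate is likewise an assumption made for illustration, since the text above does not list any specific aggregates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

using Int64Array = std::vector<std::int64_t>;  // stand-in for an Arrow array

// Scalar function: out[i] depends only on in[i], so the length is preserved
// and the result for a value does not depend on its position.
Int64Array Negate(const Int64Array& in) {
  Int64Array out(in.size());
  std::transform(in.begin(), in.end(), out.begin(),
                 [](std::int64_t v) { return -v; });
  return out;
}

// Vector function: keeps the first occurrence of each value. The output can
// be shorter than the input and its order depends on the input order.
// (A linear scan is used here for brevity; a real implementation would hash.)
Int64Array Unique(const Int64Array& in) {
  Int64Array out;
  for (std::int64_t v : in) {
    if (std::find(out.begin(), out.end(), v) == out.end()) out.push_back(v);
  }
  return out;
}

// Vector function: each output position depends on the whole input.
Int64Array Sorted(Int64Array in) {
  std::sort(in.begin(), in.end());
  return in;
}

// Scalar aggregate (illustrative): reduces the whole array to one value,
// independent of the order of the values.
std::int64_t Sum(const Int64Array& in) {
  return std::accumulate(in.begin(), in.end(), std::int64_t{0});
}
```

For the input `{3, 1, 3, 2}`, `Negate` returns four values, `Unique` returns three, `Sorted` returns four in a different order, and `Sum` returns a single number.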