Skip to content

Latest commit

 

History

History
292 lines (210 loc) · 13.1 KB

vector-search-functions-and-operators.md

File metadata and controls

292 lines (210 loc) · 13.1 KB
title summary
Vector Functions and Operators
Learn about functions and operators available for Vector data types.

Vector Functions and Operators

This document lists the functions and operators available for Vector data types.

Warning:

This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed without prior notice. If you find a bug, you can report an issue on GitHub.

Note:

Vector data types and these vector functions are only available for TiDB Self-Managed clusters and TiDB Cloud Serverless clusters.

Vector functions

The following functions are designed specifically for Vector data types.

Vector distance functions:

Function Name Description
VEC_L2_DISTANCE Calculates L2 distance (Euclidean distance) between two vectors
VEC_COSINE_DISTANCE Calculates the cosine distance between two vectors
VEC_NEGATIVE_INNER_PRODUCT Calculates the negative of the inner product between two vectors
VEC_L1_DISTANCE Calculates L1 distance (Manhattan distance) between two vectors

Other vector functions:

Function Name Description
VEC_DIMS Returns the dimension of a vector
VEC_L2_NORM Calculates the L2 norm (Euclidean norm) of a vector
VEC_FROM_TEXT Converts a string into a vector
VEC_AS_TEXT Converts a vector into a string

Extended built-in functions and operators

The following built-in functions and operators are extended to support operations on Vector data types.

Arithmetic operators:

Name Description
+ Vector element-wise addition operator
- Vector element-wise subtraction operator

For more information about how vector arithmetic works, see Vector Data Type | Arithmetic.

Aggregate (GROUP BY) functions:

Name Description
COUNT() Return a count of the number of rows returned
COUNT(DISTINCT) Return the count of a number of different values
MAX() Return the maximum value
MIN() Return the minimum value

Comparison functions and operators:

Name Description
BETWEEN ... AND ... Check whether a value is within a range of values
COALESCE() Return the first non-NULL argument
= Equal operator
<=> NULL-safe equal to operator
> Greater than operator
>= Greater than or equal operator
GREATEST() Return the largest argument
IN() Check whether a value is within a set of values
IS NULL Test whether a value is NULL
ISNULL() Test whether the argument is NULL
LEAST() Return the smallest argument
< Less than operator
<= Less than or equal operator
NOT BETWEEN ... AND ... Check whether a value is not within a range of values
!=, <> Not equal operator
NOT IN() Check whether a value is not within a set of values

For more information about how vectors are compared, see Vector Data Type | Comparison.

Control flow functions:

Name Description
CASE Case operator
IF() If/else construct
IFNULL() Null if/else construct
NULLIF() Return NULL if expr1 = expr2

Cast functions:

Name Description
CAST() Cast a value as a string or vector
CONVERT() Cast a value as a string

For more information about how to use CAST(), see Vector Data Type | Cast.

Full references

VEC_L2_DISTANCE

VEC_L2_DISTANCE(vector1, vector2)

Calculates the L2 distance (Euclidean distance) between two vectors using the following formula:

$DISTANCE(p,q)=\sqrt {\sum \limits {i=1}^{n}{(p{i}-q_{i})^{2}}}$

The two vectors must have the same dimension. Otherwise, an error is returned.

Example:

[tidb]> SELECT VEC_L2_DISTANCE('[0,3]', '[4,0]');
+-----------------------------------+
| VEC_L2_DISTANCE('[0,3]', '[4,0]') |
+-----------------------------------+
|                                 5 |
+-----------------------------------+

VEC_COSINE_DISTANCE

VEC_COSINE_DISTANCE(vector1, vector2)

Calculates the cosine distance between two vectors using the following formula:

$DISTANCE(p,q)=1.0 - {\frac {\sum \limits {i=1}^{n}{p{i}q_{i}}}{{\sqrt {\sum \limits {i=1}^{n}{p{i}^{2}}}}\cdot {\sqrt {\sum \limits {i=1}^{n}{q{i}^{2}}}}}}$

The two vectors must have the same dimension. Otherwise, an error is returned.

Example:

[tidb]> SELECT VEC_COSINE_DISTANCE('[1, 1]', '[-1, -1]');
+-------------------------------------------+
| VEC_COSINE_DISTANCE('[1, 1]', '[-1, -1]') |
+-------------------------------------------+
|                                         2 |
+-------------------------------------------+

VEC_NEGATIVE_INNER_PRODUCT

VEC_NEGATIVE_INNER_PRODUCT(vector1, vector2)

Calculates the distance by using the negative of the inner product between two vectors, using the following formula:

$DISTANCE(p,q)=- INNER_PROD(p,q)=-\sum \limits {i=1}^{n}{p{i}q_{i}}$

The two vectors must have the same dimension. Otherwise, an error is returned.

Example:

[tidb]> SELECT VEC_NEGATIVE_INNER_PRODUCT('[1,2]', '[3,4]');
+----------------------------------------------+
| VEC_NEGATIVE_INNER_PRODUCT('[1,2]', '[3,4]') |
+----------------------------------------------+
|                                          -11 |
+----------------------------------------------+

VEC_L1_DISTANCE

VEC_L1_DISTANCE(vector1, vector2)

Calculates the L1 distance (Manhattan distance) between two vectors using the following formula:

$DISTANCE(p,q)=\sum \limits {i=1}^{n}{|p{i}-q_{i}|}$

The two vectors must have the same dimension. Otherwise, an error is returned.

Example:

[tidb]> SELECT VEC_L1_DISTANCE('[0,0]', '[3,4]');
+-----------------------------------+
| VEC_L1_DISTANCE('[0,0]', '[3,4]') |
+-----------------------------------+
|                                 7 |
+-----------------------------------+

VEC_DIMS

VEC_DIMS(vector)

Returns the dimension of a vector.

Examples:

[tidb]> SELECT VEC_DIMS('[1,2,3]');
+---------------------+
| VEC_DIMS('[1,2,3]') |
+---------------------+
|                   3 |
+---------------------+

[tidb]> SELECT VEC_DIMS('[]');
+----------------+
| VEC_DIMS('[]') |
+----------------+
|              0 |
+----------------+

VEC_L2_NORM

VEC_L2_NORM(vector)

Calculates the L2 norm (Euclidean norm) of a vector using the following formula:

$NORM(p)=\sqrt {\sum \limits {i=1}^{n}{p{i}^{2}}}$

Example:

[tidb]> SELECT VEC_L2_NORM('[3,4]');
+----------------------+
| VEC_L2_NORM('[3,4]') |
+----------------------+
|                    5 |
+----------------------+

VEC_FROM_TEXT

VEC_FROM_TEXT(string)

Converts a string into a vector.

Example:

[tidb]> SELECT VEC_FROM_TEXT('[1,2]') + VEC_FROM_TEXT('[3,4]');
+-------------------------------------------------+
| VEC_FROM_TEXT('[1,2]') + VEC_FROM_TEXT('[3,4]') |
+-------------------------------------------------+
| [4,6]                                           |
+-------------------------------------------------+

VEC_AS_TEXT

VEC_AS_TEXT(vector)

Converts a vector into a string.

Example:

[tidb]> SELECT VEC_AS_TEXT('[1.000,   2.5]');
+-------------------------------+
| VEC_AS_TEXT('[1.000,   2.5]') |
+-------------------------------+
| [1,2.5]                       |
+-------------------------------+

MySQL compatibility

The vector functions and the extended usage of built-in functions and operators over vector data types are TiDB specific, and are not supported in MySQL.

See also