
Commit

Documentation improvements (inc. lengthscale explanation) and Matern12Kernel alias (#213)

* various edits for clarity and typos
* remove reference to not-yet-implemented feature (#38)
* adds Matern12Kernel as alias for ExponentialKernel (in line with the explicitly defined Matern32Kernel and Matern52Kernel) and gives all aliases docstrings
* incorporates the lengthscales explanation from #212.

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
st-- and devmotion authored Jan 9, 2021
1 parent 83a7f5f commit 11008d6
Showing 13 changed files with 157 additions and 104 deletions.
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -16,7 +16,7 @@ makedocs(
"User Guide" => "userguide.md",
"Examples"=>"example.md",
"Kernel Functions"=>"kernels.md",
"Transform"=>"transform.md",
"Input Transforms"=>"transform.md",
"Metrics"=>"metrics.md",
"Theory"=>"theory.md",
"Custom Kernels"=>"create_kernel.md",
10 changes: 5 additions & 5 deletions docs/src/create_kernel.md
@@ -2,9 +2,9 @@

KernelFunctions.jl contains the most popular kernels already but you might want to make your own!

Here are a few ways depending on how complicated your kernel is :
Here are a few ways depending on how complicated your kernel is:

### SimpleKernel for kernels function depending on a metric
### SimpleKernel for kernel functions depending on a metric

If your kernel function is of the form `k(x, y) = f(d(x, y))` where `d(x, y)` is a `PreMetric`,
you can construct your custom kernel by defining `kappa` and `metric` for your kernel.
@@ -20,15 +20,15 @@ KernelFunctions.metric(::MyKernel) = SqEuclidean()
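Putting the two methods together, a minimal sketch of such a kernel (the type and methods below are illustrative, not part of the package):

```julia
using KernelFunctions
using Distances: SqEuclidean

# A toy kernel of the required form k(x, y) = exp(-d(x, y)),
# where d is the squared Euclidean distance.
struct MyKernel <: KernelFunctions.SimpleKernel end

KernelFunctions.kappa(::MyKernel, d2::Real) = exp(-d2)
KernelFunctions.metric(::MyKernel) = SqEuclidean()
```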
### Kernel for more complex kernels

If your kernel does not satisfy such a representation, all you need to do is define `(k::MyKernel)(x, y)` and inherit from `Kernel`.
For example we recreate here the `NeuralNetworkKernel`
For example, we recreate here the `NeuralNetworkKernel`:

```julia
using LinearAlgebra: dot  # `dot` is not in Base

struct MyKernel <: KernelFunctions.Kernel end

(::MyKernel)(x, y) = asin(dot(x, y) / sqrt((1 + sum(abs2, x)) * (1 + sum(abs2, y))))
```
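As a rough usage sketch (assuming the `kernelmatrix` API and the `ColVecs` wrapper for a matrix whose columns are the inputs):

```julia
using KernelFunctions

k = MyKernel()
x, y = rand(3), rand(3)
k(x, y)                  # evaluate the kernel on a single pair of inputs

X = ColVecs(rand(3, 5))  # 5 input vectors of dimension 3
kernelmatrix(k, X)       # 5×5 kernel matrix
```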

Note that `BaseKernel` do not use `Distances.jl` and can therefore be a bit slower.
Note that the fallback implementation of the base `Kernel` evaluation does not use `Distances.jl` and can therefore be a bit slower.

### Additional Options

@@ -37,7 +37,7 @@

Finally there are additional functions you can define to bring in more features:
- `KernelFunctions.dim(x::MyDataType)`: by default the dimension of the inputs will only be checked for vectors of type `AbstractVector{<:Real}`. If you want to check the dimensionality of your inputs, dispatch the `dim` function on your datatype. Note that `0` is the default.
- `dim` is called within `KernelFunctions.validate_inputs(x::MyDataType, y::MyDataType)`, which can instead be directly overloaded if you want to run special checks for your input types.
- `kernelmatrix(k::MyKernel, ...)`: you can redefine the diverse `kernelmatrix` functions to optimize the computations where possible.
- `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel
- `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel.

KernelFunctions uses [Functors.jl](https://github.com/FluxML/Functors.jl) for specifying trainable kernel parameters
in a way that is compatible with the [Flux ML framework](https://github.com/FluxML/Flux.jl).
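A sketch of what this typically looks like for a custom kernel with a trainable parameter (the kernel below is hypothetical):

```julia
using KernelFunctions
using Functors

struct MyScaledKernel{T} <: KernelFunctions.Kernel
    σ²::T
end

(k::MyScaledKernel)(x, y) = k.σ² * exp(-sum(abs2, x - y))

Functors.@functor MyScaledKernel  # exposes σ² to Functors/Flux traversal
```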
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -1,6 +1,6 @@
# KernelFunctions.jl

Model agnostic kernel functions compatible with automatic differentiation
Model-agnostic kernel functions compatible with automatic differentiation

**KernelFunctions.jl** is a general purpose kernel package.
It aims at providing a flexible framework for creating kernels and manipulating them.
71 changes: 40 additions & 31 deletions docs/src/kernels.md
@@ -4,7 +4,7 @@

# Base Kernels

These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions
These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions.


## Constant Kernels
@@ -86,21 +86,20 @@

The [`FBMKernel`](@ref) is defined as

```math
k(x,x';h) = \frac{|x|^{2h} + |x'|^{2h} - |x-x'|^{2h}}{2},
```

where $h$ is the [Hurst index](https://en.wikipedia.org/wiki/Hurst_exponent#Generalized_exponent) and $0<h<1$.
where $h$ is the [Hurst index](https://en.wikipedia.org/wiki/Hurst_exponent#Generalized_exponent) and $0 < h < 1$.

## Gabor Kernel

The [`GaborKernel`](@ref) is defined as

```math
\begin{aligned}
k(x,x'; l,p) &= h(x-x'; l,p),\\
h(u; l,p) &= \exp\left(-\cos\left(\pi \sum_i \frac{u_i}{p_i}\right)\sum_i \frac{u_i^2}{l_i^2}\right),
\end{aligned}
```
```math
k(x,x'; l,p) = \exp\left(-\cos\left(\pi \sum_i \frac{x_i - x'_i}{p_i}\right)\sum_i \frac{(x_i - x'_i)^2}{l_i^2}\right),
```
where $l_i >0 $ is the lengthscale and $p_i>0$ is the period.
where $l_i > 0$ is the lengthscale and $p_i > 0$ is the period.

## Matern Kernels
## Matérn Kernels

### Matern Kernel
### General Matérn Kernel

The [`MaternKernel`](@ref) is defined as

@@ -110,15 +109,23 @@

```math
k(x,x';\nu) = \frac{2^{1-\nu}}{\Gamma(\nu)}\left(\sqrt{2\nu}|x-x'|\right)^\nu K_\nu\left(\sqrt{2\nu}|x-x'|\right),
```

where $\nu > 0$ and $K_\nu$ is the modified Bessel function of the second kind.
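For example, a quick sketch (the `ν` keyword argument is an assumption about the constructor; $\nu = 3/2$ should reproduce the dedicated `Matern32Kernel`):

```julia
using KernelFunctions

k = MaternKernel(ν=1.5)           # assumed keyword constructor
x, y = rand(3), rand(3)
k(x, y) ≈ Matern32Kernel()(x, y)  # true up to floating-point error
```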

### Matern 3/2 Kernel
### Matérn 1/2 Kernel

The Matérn 1/2 kernel is defined as
```math
k(x,x') = \exp\left(-|x-x'|\right),
```
equivalent to the Exponential kernel. `Matern12Kernel` is an alias for [`ExponentialKernel`](@ref).
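A quick sanity check of the alias (a sketch, assuming both constructors are exported):

```julia
using KernelFunctions
using LinearAlgebra: norm

k = Matern12Kernel()         # the same kernel as ExponentialKernel()
x, y = rand(3), rand(3)
k(x, y) ≈ exp(-norm(x - y))  # matches the formula above
```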

### Matérn 3/2 Kernel

The [`Matern32Kernel`](@ref) is defined as

```math
k(x,x') = \left(1+\sqrt{3}|x-x'|\right)\exp\left(-\sqrt{3}|x-x'|\right).
```

### Matern 5/2 Kernel
### Matérn 5/2 Kernel

The [`Matern52Kernel`](@ref) is defined as

@@ -128,7 +135,7 @@

```math
k(x,x') = \left(1 + \sqrt{5}|x-x'| + \frac{5}{3}|x-x'|^2\right)\exp\left(-\sqrt{5}|x-x'|\right).
```

## Neural Network Kernel

The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpretated as a Gaussian process) is defined as
The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpreted as a Gaussian process) is defined as

```math
k(x, x') = \arcsin\left(\frac{\langle x, x'\rangle}{\sqrt{(1+\langle x, x\rangle)(1+\langle x',x'\rangle)}}\right).
```

@@ -142,19 +149,23 @@

## Periodic Kernel

The [`PeriodicKernel`](@ref) is defined as

```math
k(x,x';r) = \exp\left(-0.5 \sum_i \left(\sin\left(\pi(x_i - x'_i)\right)/r_i\right)^2\right),
```

where $r$ has the same dimension as $x$ and $r_i >0$.
where $r$ has the same dimension as $x$ and $r_i > 0$.

## Piecewise Polynomial Kernel

The [`PiecewisePolynomialKernel`](@ref) is defined as

```math
\begin{aligned}
k(x,x'; P, V) &= \max(1 - r, 0)^{j + V} f(r, j),\\
r &= x^\top P x',\\
j &= \lfloor \frac{D}{2}\rfloor + V + 1,
\end{aligned}
```
where $x \in \mathbb{R}^D$, $V \in \{0,1,2,3\}$, and $P$ is a positive-definite matrix.
$f$ is a piecewise polynomial (see source code).

The [`PiecewisePolynomialKernel`](@ref) is defined for $x, x' \in \mathbb{R}^D$, a positive-definite matrix $P \in \mathbb{R}^{D \times D}$, and $V \in \{0,1,2,3\}$ as
```math
k(x,x'; P, V) = \max(1 - \sqrt{x^\top P x'}, 0)^{j + V} f_V(\sqrt{x^\top P x'}, j),
```
where $j = \lfloor \frac{D}{2}\rfloor + V + 1$, and the $f_V$ are polynomials defined as follows:
```math
\begin{aligned}
f_0(r, j) &= 1, \\
f_1(r, j) &= 1 + (j + 1) r, \\
f_2(r, j) &= 1 + (j + 2) r + ((j^2 + 4j + 3) / 3) r^2, \\
f_3(r, j) &= 1 + (j + 3) r + ((6 j^2 + 36j + 45) / 15) r^2 + ((j^3 + 9 j^2 + 23j + 15) / 15) r^3.
\end{aligned}
```

## Polynomial Kernels

@@ -166,7 +177,7 @@

### Linear Kernel

The [`LinearKernel`](@ref) is defined as

```math
k(x,x';c) = \langle x,x'\rangle + c,
```

where $c \in \mathbb{R}$
where $c \in \mathbb{R}$.

### Polynomial Kernel

@@ -176,7 +187,7 @@

The [`PolynomialKernel`](@ref) is defined as

```math
k(x,x';c,d) = \left(\langle x,x'\rangle + c\right)^d,
```

where $c \in \mathbb{R}$ and $d>0$
where $c \in \mathbb{R}$ and $d>0$.


## Rational Quadratic
@@ -223,43 +234,41 @@ where $i\in\{-1,0,1,2,3\}$ and coefficients $a_i$, $b_i$ are fixed and residuals

### Transformed Kernel

The [`TransformedKernel`](@ref) is a kernel where input are transformed via a function `f`
The [`TransformedKernel`](@ref) is a kernel where inputs are transformed via a function `f`:

```math
k(x,x';f,\widetile{k}) = \widetilde{k}(f(x),f(x')),
k(x,x';f,\widetilde{k}) = \widetilde{k}(f(x),f(x')),
```

Where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping.
where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping.
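For example, a sketch of rescaling the inputs of a squared-exponential kernel:

```julia
using KernelFunctions

k = TransformedKernel(SqExponentialKernel(), ScaleTransform(2.0))
x, y = rand(3), rand(3)
k(x, y) ≈ SqExponentialKernel()(2.0 .* x, 2.0 .* y)  # inputs are transformed first
```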

### Scaled Kernel

The [`ScaledKernel`](@ref) is defined as

```math
k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x')
k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x') ,
```

Where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$.
where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$.
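For example (a sketch; the scalar-multiplication shorthand is an assumption about the exported operator overloads):

```julia
using KernelFunctions

k  = ScaledKernel(SqExponentialKernel(), 2.0)
k2 = 2.0 * SqExponentialKernel()  # assumed equivalent shorthand
```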

### Kernel Sum

The [`KernelSum`](@ref) is defined as a sum of kernels
The [`KernelSum`](@ref) is defined as a sum of kernels:

```math
k(x, x'; \{k_i\}) = \sum_i k_i(x, x').
```

### KernelProduct
### Kernel Product

The [`KernelProduct`](@ref) is defined as a product of kernels
The [`KernelProduct`](@ref) is defined as a product of kernels:

```math
k(x,x';\{k_i\}) = \prod_i k_i(x,x').
```
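Both composites can be built with the usual arithmetic operators, e.g. (a sketch):

```julia
using KernelFunctions

k1 = SqExponentialKernel()
k2 = LinearKernel()

ksum  = k1 + k2  # a KernelSum
kprod = k1 * k2  # a KernelProduct
```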

### Tensor Product

The [`TensorProduct`](@ref) is defined as :
The [`TensorProduct`](@ref) is defined as:

```math
k(x,x';\{k_i\}) = \prod_i k_i(x_i,x'_i)
```
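For example, a sketch with one kernel per input dimension (the vararg constructor is an assumption):

```julia
using KernelFunctions

k = TensorProduct(SqExponentialKernel(), Matern32Kernel())  # assumed vararg constructor
x, y = rand(2), rand(2)
k(x, y) ≈ SqExponentialKernel()(x[1], y[1]) * Matern32Kernel()(x[2], y[2])
```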
15 changes: 9 additions & 6 deletions docs/src/metrics.md
@@ -1,16 +1,19 @@
# Metrics

KernelFunctions.jl relies on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for computing the pairwise matrix.
To do so a distance measure is needed for each kernel. Two very common ones can already be used : `SqEuclidean` and `Euclidean`.
However all kernels do not rely on distances metrics respecting all the definitions. That's why additional metrics come with the package such as `DotProduct` (`<x,y>`) and `Delta` (`δ(x,y)`).
Note that every `SimpleKernel` must have a defined metric defined as :
`SimpleKernel` implementations rely on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for efficiently computing the pairwise matrix.
This requires a distance measure or metric, such as the commonly used `SqEuclidean` and `Euclidean`.

The metric used by a given kernel type is specified as
```julia
KernelFunctions.metric(::CustomKernel) = SqEuclidean()
```

However, there are kernels that can be implemented efficiently using "metrics" that do not respect all the definitions expected by Distances.jl. For this reason, KernelFunctions.jl provides additional "metrics" such as `DotProduct` ($\langle x, y \rangle$) and `Delta` ($\delta(x,y)$).


## Adding a new metric

If you want to create a new distance just implement the following :
If you want to create a new "metric" just implement the following:

```julia
struct Delta <: Distances.PreMetric
end
```
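A rough sketch of what a complete definition might involve (the exact overloads expected by Distances.jl vary between versions, so treat the method signatures below as assumptions; the authoritative version lives in the KernelFunctions source):

```julia
using Distances

struct Delta <: Distances.PreMetric end

# Evaluation: 1 if the two inputs are equal, 0 otherwise.
(d::Delta)(a, b) = a == b

# Distances.jl queries result_type when preallocating pairwise matrices
# (assumed signature; check your Distances.jl version).
Distances.result_type(::Delta, ::Type, ::Type) = Bool
```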
10 changes: 5 additions & 5 deletions docs/src/transform.md
@@ -1,9 +1,9 @@
# Transform
# Input Transforms

`Transform` is the object that takes care of transforming the input data before distances are computed. It can be as simple as `IdentityTransform`, returning the same input, or multiplying the data by a scalar with `ScaleTransform` or by a vector with `ARDTransform`.
There is a more general `Transform`: `FunctionTransform` that uses a function and apply it on each vector via `mapslices`.
You can also create a pipeline of `Transform` via `TransformChain`. For example `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`.
There is a more general `Transform`: `FunctionTransform` that uses a function and applies it on each vector via `mapslices`.
You can also create a pipeline of `Transform` via `TransformChain`. For example, `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`.

One apply a transformation on a matrix or a vector via `KernelFunctions.apply(t::Transform,v::AbstractVecOrMat)`
A transformation `t` can be applied to a matrix or a vector `v` via `KernelFunctions.apply(t, v)`.

Check the list on the [API page](@ref Transforms)
Check the full list of provided transforms on the [API page](@ref Transforms).
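For example (a small sketch):

```julia
using KernelFunctions

t = ScaleTransform(2.0)
v = rand(5)
KernelFunctions.apply(t, v)  # elementwise scaling, i.e. 2.0 .* v
```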
