Update Docs (#280)

* Improve ColVecs / RowVecs docstrings

* Explain AbstractVectors reasoning

* Improve MOInput docstring

* Explain MOInput in docs in detail

* Update userguide

* Move API docs around

* Update kernelmatrix docs

* Update README

* Tweak style in userguide

* Remove blank space

* Mention reshape aesthetics

* Clarify API docs

* Fix line lengths in docs

* readme style

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Fix readme comment

* Type annotations for kernelmatrix

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Type annotations for kernelmatrix_diag

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* julia -> jldoctest

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Clarify API docs

* Fix API typo

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Fix API docs grammar

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Vector -> AbstractVector in API docs

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Remove redundant whitespace

* Format userguide for consistency

* Fix userguide grammar

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Mention colprac in contributing section

* Fix typo in api docs

Co-authored-by: Théo Galy-Fajou <theo.galyfajou@gmail.com>

* Reference Alvarez review paper

* Fix API docs typo

Co-authored-by: Théo Galy-Fajou <theo.galyfajou@gmail.com>

* Add headings to input types section

* Add some more context to API docs

* Move nystrom and kernelpdmat to utils

* Comment on utilities

* Re-generate kernel heatmap

* Finish off MOInput docstring

* Tidy up kernelmatrix docstring style

* transform -> composition

* Simplify README example

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* Add filter to doctests

* Fix formatting

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add filter to doctests

* Uncomment tests

* Fix grammar

Co-authored-by: st-- <st--@users.noreply.github.com>

* Clarify MO kernel explanation

Co-authored-by: st-- <st--@users.noreply.github.com>

* Clarify reference

Co-authored-by: st-- <st--@users.noreply.github.com>

* Grammar

Co-authored-by: st-- <st--@users.noreply.github.com>

* Phrasing

Co-authored-by: st-- <st--@users.noreply.github.com>

* Typo

Co-authored-by: st-- <st--@users.noreply.github.com>

* Title Case

Co-authored-by: st-- <st--@users.noreply.github.com>

* Typo

Co-authored-by: st-- <st--@users.noreply.github.com>

* Add Reference

Co-authored-by: st-- <st--@users.noreply.github.com>

* Typo

Co-authored-by: st-- <st--@users.noreply.github.com>

* Spacing

Co-authored-by: st-- <st--@users.noreply.github.com>

* Grammar

Co-authored-by: st-- <st--@users.noreply.github.com>

* Title Case

Co-authored-by: st-- <st--@users.noreply.github.com>

* Consistently use title case in docs

* Add note on compose

* Fix typo in docstring

Co-authored-by: st-- <st--@users.noreply.github.com>

* Link in MOInput docstring

* Fix broken links

* Move out abstract vector explanation

* Design docs

* Design on sidebar

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
Co-authored-by: Théo Galy-Fajou <theo.galyfajou@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: st-- <st--@users.noreply.github.com>
5 people authored May 10, 2021
1 parent a99a3e7 commit df67ab7
Showing 10 changed files with 456 additions and 86 deletions.
35 changes: 19 additions & 16 deletions README.md
@@ -17,24 +17,28 @@ The aim is to make the API as model-agnostic as possible while still being user-
## Examples

```julia
x = range(-3.0, 3.0; length=100)

# A simple standardised squared-exponential / exponentiated-quadratic kernel.
k₁ = SqExponentialKernel()
K₁ = kernelmatrix(k₁, x)

# Set a function transformation on the data
k₂ = Matern32Kernel() ∘ FunctionTransform(sin)
K₂ = kernelmatrix(k₂, x)

# Set a matrix premultiplication on the data
k₃ = PolynomialKernel(; c=2.0, degree=2) ∘ LinearTransform(randn(4, 1))
K₃ = kernelmatrix(k₃, x)

# Add and sum kernels
k₄ = 0.5 * SqExponentialKernel() * LinearKernel(; c=0.5) + 0.4 * k₂
K₄ = kernelmatrix(k₄, x)

plot(
    heatmap.([K₁, K₂, K₃, K₄]; yflip=true, colorbar=false)...;
    layout=(2, 2), title=["K₁" "K₂" "K₃" "K₄"],
)
```
<p align=center>
<img src="docs/src/assets/heatmap_combination.png" width=400px>
@@ -43,10 +43,9 @@ The aim is to make the API as model-agnostic as possible while still being user-
## Package goals (by priority)
- Ensure AD Compatibility (already the case for Zygote, ForwardDiff)
- Toeplitz Matrices compatibility
- BLAS backend

Directly inspired by the [MLKernels](https://github.com/trthatcher/MLKernels.jl) package.

## Issues/Contributing

If you notice a problem or would like to contribute by adding more kernel functions or features, please [submit an issue](https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/issues), or open a PR (please see the [ColPrac](https://github.com/SciML/ColPrac) contribution guidelines).
6 changes: 6 additions & 0 deletions docs/make.jl
@@ -33,9 +33,15 @@ makedocs(;
"create_kernel.md",
"API" => "api.md",
"Examples" => "example.md",
"Design" => "design.md",
],
strict=true,
checkdocs=:exports,
doctestfilters=[
r"{([a-zA-Z0-9]+,\s?)+[a-zA-Z0-9]+}",
r"(Array{[a-zA-Z0-9]+,\s?1}|Vector{[a-zA-Z0-9]+})",
r"(Array{[a-zA-Z0-9]+,\s?2}|Matrix{[a-zA-Z0-9]+})",
],
)

deploydocs(;
68 changes: 54 additions & 14 deletions docs/src/api.md
@@ -1,38 +1,78 @@
# API Library

---
```@contents
Pages = ["api.md"]
```

```@meta
CurrentModule = KernelFunctions
```

## Functions

The KernelFunctions API comprises the following four functions.
```@docs
kernelmatrix
kernelmatrix!
kernelmatrix_diag
kernelmatrix_diag!
```

## Input Types

The above API operates on collections of inputs.
All collections of inputs in KernelFunctions.jl are represented as `AbstractVector`s.
To understand this choice, please see the [design notes on collections of inputs](@ref why_abstract_vectors).
The length of any such `AbstractVector` is equal to the number of inputs in the collection.
For example, this means that
```julia
size(kernelmatrix(k, x)) == (length(x), length(x))
```
is always true for any `Kernel` `k` and `AbstractVector` `x`.

### Univariate Inputs

If each input to your kernel is `Real`-valued, then any `AbstractVector{<:Real}` is a valid
representation for a collection of inputs.
More generally, it's completely fine to represent a collection of inputs of type `T` as, for
example, a `Vector{T}`.
However, this may not be the most efficient way to represent a collection of inputs.
See [Vector-Valued Inputs](@ref) for an example.
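For example, a minimal sketch (assuming a kernel from this package and `Real`-valued inputs):
```julia
using KernelFunctions

k = SqExponentialKernel()
x = randn(5)            # a collection of 5 Real-valued inputs
K = kernelmatrix(k, x)  # a 5×5 kernel matrix
```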


### Vector-Valued Inputs

We recommend that collections of vector-valued inputs be stored in an
`AbstractMatrix{<:Real}` when possible, and wrapped inside a `ColVecs` or `RowVecs` to make
their interpretation clear:
```@docs
ColVecs
RowVecs
```
These types are specialised upon to ensure good performance, e.g. when computing Euclidean distances between pairs of elements.
The benefit of this representation over a `Vector{Vector{<:Real}}` is that optimised
matrix-matrix multiplication functionality can be utilised when computing the pairwise
distances between inputs needed for `kernelmatrix` computation.
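As a small sketch of wrapping a data matrix (the sizes here are arbitrary):
```julia
using KernelFunctions

X = randn(3, 5)
x_cols = ColVecs(X)  # 5 inputs, each a 3-dimensional column of X
x_rows = RowVecs(X)  # 3 inputs, each a 5-dimensional row of X

kernelmatrix(SqExponentialKernel(), x_cols)  # 5×5 matrix
kernelmatrix(SqExponentialKernel(), x_rows)  # 3×3 matrix
```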

### Inputs for Multiple Outputs

KernelFunctions.jl views multi-output GPs as GPs on an extended input domain.
For an explanation of this design choice, see [the design notes on multi-output GPs](@ref inputs_for_multiple_outputs).

An input to a multi-output `Kernel` should be a `Tuple{T, Int}`, whose first element specifies a location in the domain of the multi-output GP, and whose second element specifies which output the input corresponds to.
The type of collections of inputs for multi-output GPs is therefore `AbstractVector{<:Tuple{T, Int}}`.

KernelFunctions.jl provides the following type for situations in which all outputs are observed all of the time:
```@docs
MOInput
```
As with [`ColVecs`](@ref) and [`RowVecs`](@ref) for vector-valued input spaces, this
type enables specialised implementations of e.g. [`kernelmatrix`](@ref) for
[`MOInput`](@ref)s in some situations.
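For example, a minimal sketch (with two arbitrary input locations and three outputs):
```julia
using KernelFunctions

x = [0.5, 1.2]        # two input locations
x_mo = MOInput(x, 3)  # inputs for a 3-output kernel
length(x_mo)          # 6 == 2 locations × 3 outputs
x_mo[1]               # (0.5, 1): the first location, paired with output 1
```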

To find out more about the background, read this [review of kernels for vector-valued functions](https://arxiv.org/pdf/1106.6251.pdf).

## Utilities

KernelFunctions also provides miscellaneous utility functions.
```@docs
kernelpdmat
nystrom
NystromFact
```
Binary file modified docs/src/assets/heatmap_combination.png
172 changes: 172 additions & 0 deletions docs/src/design.md
@@ -0,0 +1,172 @@
# Design

## [Why AbstractVectors Everywhere?](@id why_abstract_vectors)

To understand the advantages of using `AbstractVector`s everywhere to represent collections of inputs, first consider the following properties that it is desirable for a collection of inputs to satisfy.

#### Unique Ordering

There must be a clearly-defined first, second, etc. element of an input collection.
If this were not the case, it would not be possible to determine a unique mapping between a collection of inputs and the output of `kernelmatrix`, as it would not be clear what order the rows and columns of the output should appear in.

Moreover, a unique ordering guarantees that permuting the collection of inputs correspondingly permutes the rows and columns of the `kernelmatrix`.
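Concretely (a sketch):
```julia
using KernelFunctions

k = SqExponentialKernel()
x = randn(4)
p = [3, 1, 4, 2]  # a permutation of the input indices

# Permuting the inputs permutes the rows and columns of the kernel matrix.
kernelmatrix(k, x[p]) == kernelmatrix(k, x)[p, p]
```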

#### Generality

There must be no restriction on the domain of the input.
Collections of `Real`s, vectors, graphs, finite-dimensional domains, or really anything else that you fancy should be straightforwardly representable.
Moreover, whichever input class is chosen should not prevent optimal performance from being obtained.

#### Unambiguously-Defined Length

Knowing the length of a collection of inputs is important.
For example, a well-defined length guarantees that the size of the output of `kernelmatrix`,
and related functions, is predictable.
It also makes internal error-checking possible, for example ensuring that two collections
of inputs contain the same number of inputs.



### AbstractMatrices Do Not Cut It

Notably, while `AbstractMatrix` objects are often used to represent collections of vector-valued
inputs, they do _not_ immediately satisfy these properties, as it is unclear whether a matrix
of size `P x Q` represents a collection of `P` `Q`-dimensional inputs (each row is an
input), or `Q` `P`-dimensional inputs (each column is an input).

Moreover, they occasionally add some aesthetic inconvenience.
For example, a collection of `Real`-valued inputs, which might be straightforwardly
represented as an `AbstractVector{<:Real}`, must be reshaped into a matrix.

There are two commonly used ways to partly resolve these shortcomings:

#### Resolution 1: Specify a Convention

One way that these shortcomings can be partly resolved is by specifying a convention that
everyone adheres to regarding the interpretation of rows vs columns.
However, opinions about the choice of convention are often surprisingly strongly held, and
users regularly have to remind themselves _which_ convention has been chosen.
While this resolves the ordering problem, and in principle defines the "length" of a
collection of inputs, `AbstractMatrix`s already have a `length` defined in Julia, which
would generally disagree with our internal notion of `length`.
This isn't a show-stopper, but it isn't an especially clean situation.

There is also the opportunity for some kinds of silent bugs.
For example, if an input matrix happens to be square because the number of input dimensions
is the same as the number of inputs, it would be hard to know whether the correct
`kernelmatrix` has been computed.
This kind of bug seems unlikely, but it exists regardless.
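As a sketch of the failure mode (using the `obsdim` convention discussed below):
```julia
using KernelFunctions

k = SqExponentialKernel()
X = randn(5, 5)  # as many input dimensions as inputs

# Both calls return a 5×5 matrix, so an incorrect choice of obsdim is not
# caught by any size check -- only the values differ.
kernelmatrix(k, X; obsdim=1)  # each row treated as an input
kernelmatrix(k, X; obsdim=2)  # each column treated as an input
```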

Finally, suppose that your inputs are some type `T` that is not simply a vector of real
numbers, say a graph.
In this situation, how should a collection of inputs be represented?
An `N x 1` or `1 x N` matrix is the only obvious candidate, but the additional singular
dimension seems somewhat redundant.

#### Resolution 2: Always Specify An `obsdim` Argument

Another way to partly resolve these problems is to not commit to a convention, and instead
to propagate some additional information through the codebase that specifies how the input
data is to be interpreted.
For example, a kernel `k` that represents the sum of two other kernels might implement
`kernelmatrix` as follows:
```julia
function kernelmatrix(k::KernelSum, x::AbstractMatrix; obsdim=1)
    return kernelmatrix(k.kernels[1], x; obsdim=obsdim) +
        kernelmatrix(k.kernels[2], x; obsdim=obsdim)
end
```
While this prevents this package from having to pre-specify a convention, it doesn't resolve
the `length` issue, or the issue of representing collections of inputs which aren't
immediately represented as vectors.
Moreover, it complicates the internals; in contrast, consider what this function looks like
with an `AbstractVector`:
```julia
function kernelmatrix(k::KernelSum, x::AbstractVector)
    return kernelmatrix(k.kernels[1], x) + kernelmatrix(k.kernels[2], x)
end
```
This code is clearer (less visual noise), and has removed a possible bug -- if the
implementer of `kernelmatrix` forgets to pass the `obsdim` kwarg into each subsequent
`kernelmatrix` call, it's possible to get the wrong answer.

This being said, we do support matrix-valued inputs -- see
[Why We Have Support for Both](@ref).


### AbstractVectors

Requiring all collections of inputs to be `AbstractVector`s resolves all of these problems,
and ensures that the data is self-describing to the extent that KernelFunctions.jl requires.

Firstly, the question of how to interpret the columns and rows of a matrix of inputs is
resolved.
Users _must_ wrap matrices which represent collections of inputs in either a `ColVecs` or
`RowVecs`, both of which have clearly defined semantics which are hard to confuse.

By design, there is also no discrepancy between the number of inputs in the collection, and
the `length` function -- the `length` of a `ColVecs`, `RowVecs`, or `Vector{<:Real}` is
equal to the number of inputs.
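To make the contrast with `AbstractMatrix` concrete (a small sketch):
```julia
using KernelFunctions

X = randn(2, 10)

length(ColVecs(X))  # 10 -- the number of (2-dimensional) inputs
length(RowVecs(X))  # 2  -- the number of (10-dimensional) inputs
length(X)           # 20 -- Base's element count, which matches neither
```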

There is no loss of performance.

A collection of `N` `Real`-valued inputs can be represented by an
`AbstractVector{<:Real}` of `length` `N`, rather than needing to use an
`AbstractMatrix{<:Real}` of size either `N x 1` or `1 x N`.
The same can be said for any other input type `T`, and new subtypes of `AbstractVector` can
be added if particularly efficient ways exist to store collections of inputs of type `T`.
A good example of this in practice is using `Tuple{S, Int}`, for some input type `S`, as the
[Inputs for Multiple Outputs](@ref).

This approach can also lead to clearer user code.
A user need only wrap their inputs in a `ColVecs` or `RowVecs` once in their code, and this
specification is automatically re-used _everywhere_ in their code.
In this sense, it is straightforward to write code in such a way that there is one unique
source of "truth" about the way in which a particular data set should be interpreted.
Conversely, the `obsdim` resolution requires that the `obsdim` keyword argument be passed
around with the data _every_ _single_ _time_ you use it.

The benefits of the `AbstractVector` approach are likely most strongly felt when writing a
substantial amount of code on top of KernelFunctions.jl: just as using `AbstractVector`s
inside KernelFunctions.jl removes the need for large amounts of keyword-argument
propagation, so it will in other code.




### Why We Have Support for Both

In short: many people like matrices, and are familiar with `obsdim`-style keyword
arguments.

All internals are implemented using `AbstractVector`s though, and the `obsdim` interface
is just a thin layer of utility functionality which sits on top of this.





## [Kernels for Multiple-Outputs](@id inputs_for_multiple_outputs)

There are two equally-valid perspectives on multi-output kernels: they can either be treated
as matrix-valued kernels, or standard kernels on an extended input domain.
Each of these perspectives is convenient in different circumstances, but the latter
greatly simplifies the incorporation of multi-output kernels in KernelFunctions.

More concretely, let `k_mat` be a matrix-valued kernel, mapping pairs of inputs of type `T` to matrices of size `P x P` to describe the covariance between `P` outputs.
Given inputs `x` and `y` of type `T`, and integers `p` and `q`, we can always find an
equivalent standard kernel `k` mapping from pairs of inputs of type `Tuple{T, Int}` to the
`Real`s as follows:
```julia
k((x, p), (y, q)) = k_mat(x, y)[p, q]
```
This ability to treat multi-output kernels as single-output kernels is very helpful, as it
means that there is no need to introduce additional concepts into the API of
KernelFunctions.jl, just additional kernels!
This in turn simplifies downstream code, which doesn't need to "know" about the existence
of multi-output kernels in addition to standard kernels. For example, GP libraries built on
top of KernelFunctions.jl just need to know about `Kernel`s, and they get multi-output
kernels, and hence multi-output GPs, for free.
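For example, a minimal sketch using this package's `IndependentMOKernel` (which applies one single-output kernel independently to every output):
```julia
using KernelFunctions

k = IndependentMOKernel(SqExponentialKernel())
x_mo = MOInput(randn(4), 2)  # 4 locations, 2 outputs

K = kernelmatrix(k, x_mo)    # an ordinary 8×8 kernel matrix
```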

Where there is the need to specialise _implementations_ for multi-output kernels, this is
done in an encapsulated manner -- parts of KernelFunctions that have nothing to do with
multi-output kernels know _nothing_ about the existence of multi-output kernels.