GPU Attempt 2 #472

base: master

Changes from all commits:
35f4d97
935e3aa
f09ae92
3c6ed73
42cfdf0
d40301d
457609a
8372ed4
608b174
7f21ec2
11a5b20
12fe669

New file: .buildkite/pipeline.yml

@@ -0,0 +1,14 @@
steps:
  - label: "Julia v1"
    plugins:
      - JuliaCI/julia#v1:
          version: "1"
      - JuliaCI/julia-test#v1: ~
    agents:
      queue: "juliagpu"
      cuda: "*"
    if: build.message !~ /\[skip tests\]/
    timeout_in_minutes: 60

env:
  GROUP: CUDA
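
The GROUP variable presumably gates the CUDA-specific tests in the test suite. A minimal sketch of how test/runtests.jl might consume it; the file split and the gpu.jl name are assumptions, not part of this diff:

# test/runtests.jl (hypothetical sketch)
const GROUP = get(ENV, "GROUP", "All")  # set to "CUDA" by the Buildkite job above

if GROUP == "CUDA"
    include("gpu.jl")  # hypothetical file containing the GPU-only tests
end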

File: src/KernelFunctions.jl

@@ -47,6 +47,7 @@ export IndependentMOKernel,
export tensor, ⊗, compose

using Compat
using CUDA

Review comment: Unless things changed, that's quite a massive dependency, no? I saw CuArrays.jl got deprecated; does this mean that CUDA is light now?

Review comment: As we already depend on Requires, maybe it could be hidden behind a @require block?

Review comment: Alternatively, maybe the lightweight [...]
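
For reference, a sketch of what the Requires-based approach could look like; this is a guess at the structure, not code from the PR, and the CUDA.jl UUID should be checked against its Project.toml:

using Requires

function __init__()
    @require CUDA="052768ef-5323-5732-b1bb-66c8b64840ba" begin
        # Load the GPU-specific methods only if the user loads CUDA.jl,
        # so KernelFunctions itself stays lightweight.
        include("distances/cuda.jl")
    end
end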

using ChainRulesCore: ChainRulesCore, Tangent, ZeroTangent, NoTangent
using ChainRulesCore: @thunk, InplaceableThunk
using CompositionsBase

@@ -75,6 +76,7 @@ include("distances/pairwise.jl")
include("distances/dotproduct.jl")
include("distances/delta.jl")
include("distances/sinus.jl")
include("distances/cuda.jl")

include("transform/transform.jl")
include("transform/scaletransform.jl")

File: src/TestUtils.jl

@@ -1,6 +1,8 @@
module TestUtils

using CUDA
using Distances
using FillArrays
using LinearAlgebra
using KernelFunctions
using Random

@@ -20,9 +22,9 @@ Run various consistency checks on `k` at the inputs `x0`, `x1`, and `x2`.
`x0` and `x1` should be of the same length with different values, while `x0` and `x2` should
be of different lengths.

-These tests are intended to pick up on really substantial issues with a kernel implementation
-(e.g. substantial asymmetry in the kernel matrix, large negative eigenvalues), rather than to
-test the numerics in detail, which can be kernel-specific.
+These tests are intended to pick up on really substantial issues with a kernel
+implementation (e.g. substantial asymmetry in the kernel matrix, large negative
+eigenvalues), rather than to test the numerics in detail, which can be kernel-specific.
"""
function test_interface(
    k::Kernel,

@@ -118,6 +120,18 @@ function test_interface(
    )
end

function test_interface(
    rng::AbstractRNG, k::Kernel, ::Type{<:ColVecs{T, CuMatrix{T}}}; dim_in=2, kwargs...
) where {T<:Real}
    return test_interface(
        k,
        _to_cuda_gpu(ColVecs(randn(rng, T, dim_in, 11))),
        _to_cuda_gpu(ColVecs(randn(rng, T, dim_in, 11))),
        _to_cuda_gpu(ColVecs(randn(rng, T, dim_in, 13)));
        kwargs...,
    )
end
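
A hypothetical invocation of this new method, to show the intended entry point (SEKernel is just an example kernel; StableRNGs follows the recommendation in the docstring below):

using CUDA, StableRNGs
using KernelFunctions
using KernelFunctions.TestUtils: test_interface

# Run the generic interface checks on Float32 data stored on the GPU.
test_interface(StableRNG(123), SEKernel(), ColVecs{Float32, CuMatrix{Float32}})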

function test_interface(
    rng::AbstractRNG, k::Kernel, ::Type{<:RowVecs{T}}; dim_in=2, kwargs...
) where {T<:Real}

@@ -176,27 +190,108 @@ function test_interface(k::Kernel, T::Type{<:Real}=Float64; kwargs...)
    return test_interface(Random.GLOBAL_RNG, k, T; kwargs...)
end

const FloatType = Union{Float32, Float64}

Review comment: Why not AbstractFloat?
""" | ||
example_inputs(rng::AbstractRNG, type) | ||
|
||
Return a tuple of 4 inputs of type `type`. See `methods(example_inputs)` for information | ||
around supported types. It is recommended that you utilise `StableRNGs.jl` for `rng` here | ||
to ensure consistency across Julia versions. | ||
""" | ||
function example_inputs(rng::AbstractRNG, ::Type{Vector{Float64}}) | ||
return map(n -> randn(rng, Float64, n), (1, 2, 3, 4)) | ||
function example_inputs(rng::AbstractRNG, ::Type{Vector{T}}) where {T<:FloatType} | ||
return map(n -> randn(rng, T, n), (1, 2, 3, 4)) | ||
end | ||
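
For example, with the widened signature one can now generate Float32 as well as Float64 inputs (seed arbitrary):

using StableRNGs

# Four Float32 vectors of lengths 1, 2, 3, and 4.
x1, x2, x3, x4 = example_inputs(StableRNG(1), Vector{Float32})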

function example_inputs(
-    rng::AbstractRNG, ::Type{ColVecs{Float64,Matrix{Float64}}}; dim::Int=2
-)
-    return map(n -> ColVecs(randn(rng, dim, n)), (1, 2, 3, 4))
+    rng::AbstractRNG, ::Type{ColVecs{T,Matrix{T}}}; dim::Int=2
+) where {T<:FloatType}
+    return map(n -> ColVecs(randn(rng, T, dim, n)), (1, 2, 3, 4))
end

function example_inputs(
-    rng::AbstractRNG, ::Type{RowVecs{Float64,Matrix{Float64}}}; dim::Int=2
-)
-    return map(n -> RowVecs(randn(rng, n, dim)), (1, 2, 3, 4))
+    rng::AbstractRNG, ::Type{RowVecs{T,Matrix{T}}}; dim::Int=2
+) where {T<:FloatType}
+    return map(n -> RowVecs(randn(rng, T, n, dim)), (1, 2, 3, 4))
end

function example_inputs(rng::AbstractRNG, ::Type{CuVector{T}}) where {T<:FloatType}
    return map(_to_cuda_gpu, example_inputs(rng, Vector{T}))
end

function example_inputs(
    rng::AbstractRNG, ::Type{ColVecs{T, CuMatrix{T}}}; dim::Int=2
) where {T<:FloatType}
    return map(_to_cuda_gpu, example_inputs(rng, ColVecs{T, Matrix{T}}; dim=dim))
end

function example_inputs(
    rng::AbstractRNG, ::Type{RowVecs{T, CuMatrix{T}}}; dim::Int=2
) where {T<:FloatType}
    return map(_to_cuda_gpu, example_inputs(rng, RowVecs{T, Matrix{T}}; dim=dim))
end

function test_gpu_against_cpu(
    k::Kernel, T::Type, x1::AbstractVector, x2::AbstractVector; atol=1e-3
)
    @assert length(x1) != length(x2)

    k_cpu = _to_cpu(k)
    x1_cpu = _to_cpu(x1)
    x2_cpu = _to_cpu(x2)
    let
        K_cpu = kernelmatrix(k_cpu, x1_cpu)
        K_gpu = kernelmatrix(k, x1)
        @test size(K_cpu) == size(K_gpu)
        @test eltype(K_cpu) == T
        @test eltype(K_cpu) == eltype(K_gpu)
        @test isapprox(K_cpu, _to_cpu(K_gpu); atol=atol)
    end
    let
        K_cpu = kernelmatrix(k_cpu, x1_cpu, x2_cpu)
        K_gpu = kernelmatrix(k, x1, x2)
        @test size(K_cpu) == size(K_gpu)
        @test eltype(K_cpu) == T
        @test eltype(K_cpu) == eltype(K_gpu)
        @test isapprox(K_cpu, _to_cpu(K_gpu); atol=atol)
    end
    let
        K_cpu = kernelmatrix_diag(k_cpu, x1_cpu)
        K_gpu = kernelmatrix_diag(k, x1)
        @test size(K_cpu) == size(K_gpu)
        @test eltype(K_cpu) == T
        @test eltype(K_cpu) == eltype(K_gpu)
        @test isapprox(K_cpu, _to_cpu(K_gpu); atol=atol)
    end
    let
        K_cpu = kernelmatrix_diag(k_cpu, x1_cpu, x1_cpu)
        K_gpu = kernelmatrix_diag(k, x1, x1)
        @test size(K_cpu) == size(K_gpu)
        @test eltype(K_cpu) == T
        @test eltype(K_cpu) == eltype(K_gpu)
        @test isapprox(K_cpu, _to_cpu(K_gpu); atol=atol)
    end
end

function test_gpu_against_cpu(rng::AbstractRNG, T::Type, k::Kernel, data_type::Type)
    _, x1, x2, _ = example_inputs(rng, data_type)
    test_gpu_against_cpu(k, T, x1, x2)
end
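
A hypothetical call of the convenience method; note that it takes T before k, while the method above takes k first:

using CUDA, StableRNGs
using KernelFunctions
using KernelFunctions.TestUtils: test_gpu_against_cpu

# Check GPU kernel matrices for SEKernel against their CPU counterparts.
test_gpu_against_cpu(StableRNG(1), Float32, SEKernel(), ColVecs{Float32, CuMatrix{Float32}})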

_to_cpu(x::CuArray{<:Real}) = Array(x)
_to_cpu(x::ColVecs{<:Real, <:CuMatrix}) = ColVecs(_to_cpu(x.X))

Review comment: Can't we just overload [...]?

Review comment: Could we use/define

    Adapt.adapt_structure(to, x::ColVecs{<:Real}) = ColVecs(adapt(to, x.X))
    Adapt.adapt_structure(to, x::RowVecs{<:Real}) = RowVecs(adapt(to, x.X))

and then one could use e.g. [...]
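
The tail of that comment was cut off; a plausible completion of the example, based on how Adapt.jl is typically used (my reconstruction, not the reviewer's words):

using Adapt, CUDA, KernelFunctions

# Assuming the adapt_structure definitions from the comment above:
x = ColVecs(randn(Float32, 2, 5))
x_gpu = adapt(CuArray, x)   # ColVecs backed by a CuMatrix
x_cpu = adapt(Array, x_gpu) # back to a Matrix-backed ColVecs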

_to_cpu(x::RowVecs{<:Real, <:CuMatrix}) = RowVecs(_to_cpu(x.X))
_to_cpu(x::FillArrays.AbstractFill{<:Real}) = x

_to_cpu(k::Kernel) = k
_to_cpu(k::TransformedKernel) = TransformedKernel(_to_cpu(k.kernel), _to_cpu(k.transform))
_to_cpu(t::Transform) = t
_to_cpu(t::LinearTransform) = LinearTransform(_to_cpu(t.A))
_to_cpu(t::ChainTransform) = ChainTransform(map(_to_cpu, t.transforms))

_to_cuda_gpu(x::Array{<:Real}) = CuArray(x)
_to_cuda_gpu(x::ColVecs{T, Matrix{T}} where {T<:Real}) = ColVecs(_to_cuda_gpu(x.X))
_to_cuda_gpu(x::RowVecs{T, Matrix{T}} where {T<:Real}) = RowVecs(_to_cuda_gpu(x.X))
_to_cuda_gpu(x::FillArrays.AbstractFill{<:Real}) = x

end # module

File: src/basekernels/fbm.jl

@@ -48,9 +48,19 @@ _mod(x::AbstractVector{<:AbstractVector{<:Real}}) = sum.(abs2, x)
_mod(x::ColVecs) = vec(sum(abs2, x.X; dims=1))
_mod(x::RowVecs) = vec(sum(abs2, x.X; dims=2))

_ones(el_type, x::AbstractVector, sz...) = ones(el_type, sz...)
function _ones(
    el_type,
    x::Union{CuVector{<:Real}, ColVecs{<:Real, <:CuMatrix}, RowVecs{<:Real, <:CuMatrix}},
    sz...,
)
    return CUDA.ones(el_type, sz...)
end

Review comment (on the new _ones methods): Could we use [...]?
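
The truncated suggestion was presumably about FillArrays (already a dependency of TestUtils above): a lazy Ones needs no allocation on either device. A sketch; whether this keeps the AD performance hack intact and broadcasts well under CUDA is untested here:

using FillArrays

modx = randn(Float32, 5)
# Lazy 1×5 all-ones matrix: no dense allocation, device-agnostic, and the
# product still materialises the wide matrix that the perf hack relies on.
modx_wide = modx * Ones{eltype(modx)}(1, length(modx))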

function kernelmatrix(κ::FBMKernel, x::AbstractVector)
    modx = _mod(x)
-    modx_wide = modx * ones(eltype(modx), 1, length(modx)) # ad perf hack -- is unit tested
+    _v = _ones(eltype(modx), x, 1, length(modx))
+    modx_wide = modx * _v # ad perf hack -- is unit tested
    modxx = pairwise(SqEuclidean(), x)
    return _fbm.(modx_wide, modx_wide', modxx, only(κ.h))
end

@@ -64,8 +74,10 @@

function kernelmatrix(κ::FBMKernel, x::AbstractVector, y::AbstractVector)
    modxy = pairwise(SqEuclidean(), x, y)
-    modx_wide = _mod(x) * ones(eltype(modxy), 1, length(y)) # ad perf hack -- is unit tested
-    mody_wide = _mod(y) * ones(eltype(modxy), 1, length(x)) # ad perf hack -- is unit tested
+    _vx = _ones(eltype(modxy), x, 1, length(y))
+    _vy = _ones(eltype(modxy), y, 1, length(x))
+    modx_wide = _mod(x) * _vx # ad perf hack -- is unit tested
+    mody_wide = _mod(y) * _vy # ad perf hack -- is unit tested
    return _fbm.(modx_wide, mody_wide', modxy, only(κ.h))
end

File: src/basekernels/matern.jl

@@ -77,7 +77,7 @@

Matern32Kernel(; metric=Euclidean()) = Matern32Kernel(metric)

-kappa(::Matern32Kernel, d::Real) = (1 + sqrt(3) * d) * exp(-sqrt(3) * d)
+kappa(::Matern32Kernel, d::T) where {T<:Real} = (1 + T(sqrt(3)) * d) * exp(-T(sqrt(3)) * d)

Review comment: Can't T(sqrt(3)), or at least sqrt(3), be precalculated?

Review comment: One could use [...] and load [...]
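
The truncated suggestion most plausibly refers to IrrationalConstants.jl, which exports a sqrt3 constant; a reconstruction of what the suggested change may have looked like:

using IrrationalConstants: sqrt3

# `sqrt3` is an `Irrational`, so it promotes to the type of `d` on use:
# no runtime sqrt, and Float32 inputs stay Float32.
kappa(::Matern32Kernel, d::Real) = (1 + sqrt3 * d) * exp(-sqrt3 * d)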

metric(k::Matern32Kernel) = k.metric

@@ -108,7 +108,7 @@

Matern52Kernel(; metric=Euclidean()) = Matern52Kernel(metric)

-kappa(::Matern52Kernel, d::Real) = (1 + sqrt(5) * d + 5 * d^2 / 3) * exp(-sqrt(5) * d)
+kappa(::Matern52Kernel, d::T) where {T<:Real} = (1 + T(sqrt(5)) * d + 5 * d^2 / 3) * exp(-T(sqrt(5)) * d)

Review comment: Same here with sqrt(5).

Review comment: Unfortunately, IrrationalConstants doesn't define sqrt5. This would also be a bit safer as it also works if [...]
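
The lost suggestion presumably defined the missing constant by hand. One way, using Base's internal but long-stable @irrational macro; the "safer" remark likely refers to inputs where T(sqrt(5)) would throw, e.g. d::Int raising an InexactError, while an Irrational simply promotes:

# Define a sqrt5 Irrational analogous to IrrationalConstants.sqrt3.
Base.@irrational sqrt5 2.2360679774997896964 sqrt(big(5))

kappa(::Matern52Kernel, d::Real) = (1 + sqrt5 * d + 5 * d^2 / 3) * exp(-sqrt5 * d)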

metric(k::Matern52Kernel) = k.metric

File: src/basekernels/wiener.jl

@@ -62,22 +62,22 @@
function (::WienerKernel{1})(x, y)
    X = sqrt(sum(abs2, x))
    Y = sqrt(sum(abs2, y))
    minXY = min(X, Y)
-    return 1 / 3 * minXY^3 + 1 / 2 * minXY^2 * euclidean(x, y)
+    return minXY^3 / 3 + minXY^2 * euclidean(x, y) / 2
end

Review comment: Nice catches here!
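
Why these rewrites are catches rather than pure style (my reading of the change): 1 / 3 and friends are Float64 constants, so multiplying by them promotes Float32 results to Float64, which matters once Float32 GPU inputs are in play; dividing by the integer literal preserves the element type:

julia> typeof(1 / 3 * 2.0f0)  # old form promotes to Float64
Float64

julia> typeof(2.0f0^3 / 3)    # new form stays Float32
Float32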

function (::WienerKernel{2})(x, y)
    X = sqrt(sum(abs2, x))
    Y = sqrt(sum(abs2, y))
    minXY = min(X, Y)
-    return 1 / 20 * minXY^5 + 1 / 12 * minXY^3 * euclidean(x, y) * (X + Y - 1 / 2 * minXY)
+    return minXY^5 / 20 + minXY^3 * euclidean(x, y) * (X + Y - minXY / 2) / 12
end

function (::WienerKernel{3})(x, y)
    X = sqrt(sum(abs2, x))
    Y = sqrt(sum(abs2, y))
    minXY = min(X, Y)
-    return 1 / 252 * minXY^7 +
-        1 / 720 * minXY^4 * euclidean(x, y) * (5 * max(X, Y)^2 + 2 * X * Y + 3 * minXY^2)
+    return minXY^7 / 252 +
+        minXY^4 * euclidean(x, y) * (5 * max(X, Y)^2 + 2 * X * Y + 3 * minXY^2) / 720
end

function Base.show(io::IO, ::WienerKernel{I}) where {I}

Review comment: Noice!