Periodic kernel #61
Alternatively to the above, the approach below is somewhat simpler, as it directly constructs the approximated kernel as a `KernelFunctions.SimpleKernel` subtype:

```julia
struct ApproxPeriodicKernel{J,T} <: KernelFunctions.SimpleKernel
    l::T # length scale as defined in Arno's text book
end

# assuming length(k.r) == 1
ApproxPeriodicKernel(k::PeriodicKernel{T}; degree_of_approx=3) where {T} =
    ApproxPeriodicKernel{degree_of_approx,T}(2 * only(k.r))
ApproxPeriodicKernel(; r=1/2, degree_of_approx=3) =
    ApproxPeriodicKernel(PeriodicKernel(; r=[r]); degree_of_approx=degree_of_approx)

# k = ApproxPeriodicKernel()

function to_sde(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2 * (J + 1)
    w0 = 2π
    F = kron(Diagonal(0:J), [0 -w0; w0 0])
    H = kron(ones(1, J + 1), [1 0])
    Fs = SMatrix{D,D,T}(F)
    q = T(0)
    Hs = SVector{D,T}(H) # SMatrix{1,D,T}(H)
    return Fs, q, Hs
end

function stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2 * (J + 1)
    q = [j > 0 ? 2 * besseli(j, k.l^-2) / exp(k.l^-2) : besseli(0, k.l^-2) / exp(k.l^-2) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D,T}(zeros(T, D))
    Ps = SMatrix{D,D,T}(P)
    return Gaussian(m, Ps)
end
```
Sorry for the slow response. I think this second approach is probably better for now -- I'm not sure how useful the oscillator kernel is likely to be outside of the context of approximating this stuff.

Regarding the Zygote + StaticArrays problems -- this is something that needs fixing. I'll have a hack at a general solution this evening.

Actually, looks like it'll have to be tomorrow unfortunately.

no worries. I completely agree with the second approach

@andreaskoher how's it going with this? Is there anything blocking it that I could assist with?
thanks for the reminder :)

```julia
const x = RegularSpacing(0., 0.01, 1000)
const y = rand(to_sde(GP(ApproxPeriodicKernel()))(x))

function objective(θ)
    k = θ * ApproxPeriodicKernel()
    f = to_sde(GP(k), SArrayStorage(Float64))
    -logpdf(f(x), y)
end

Zygote.gradient(objective, 1.)
```

The gradient call fails. It references the list comprehension in
I can fix the problem by building the coefficient vector iteratively instead of with a comprehension:

```julia
function TemporalGPs.stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2 * (J + 1)
    q = [besseli(0, k.l^-2) / exp(k.l^-2)]
    for j in 1:J
        q = vcat(q, [2 * besseli(j, k.l^-2) / exp(k.l^-2)])
    end
    # q = [j > 0 ? 2 * besseli(j, k.l^-2) / exp(k.l^-2) : besseli(0, k.l^-2) / exp(k.l^-2) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D,T}(zeros(T, D))
    Ps = SMatrix{D,D,T}(P)
    return TemporalGPs.Gaussian(m, Ps)
end
```

It looks somewhat inefficient to me, but I am also no expert on good coding style. In any case, with this fix I can correctly infer the parameters:

```julia
# Declare model parameters using `ParameterHandling.jl` types.
Random.seed!(1234)
params = (
    s = positive(10.),
    l = positive(1/20),
    v = positive(.01),
)

function build_gp(ps)
    k = ps.s * ApproxPeriodicKernel() ∘ ScaleTransform(ps.l)
    return to_sde(GP(k), SArrayStorage(Float64))
end

# Generate some synthetic data from the prior.
x = RegularSpacing(0.0, 0.01, 10_000)
y = rand(build_gp(value(params))(x, value(params.v)))

# Construct mapping between structured and Vector representations of the parameters.
flat_initial_params, unflatten = flatten(params)

# Specify an objective function for Optim to minimise in terms of x and y.
function objective(flat_params)
    params = value(unflatten(flat_params))
    f = build_gp(params)
    return -logpdf(f(x, params.v), y)
end

# Optimise using Optim.
training_results = Optim.optimize(
    objective,
    θ -> only(Zygote.gradient(objective, θ)),
    flat_initial_params + rand(length(flat_initial_params)), # Add some noise to make learning non-trivial
    Optim.LBFGS(
        alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
        linesearch = Optim.LineSearches.BackTracking(),
    ),
    Optim.Options(show_trace = true);
    inplace=false,
)

# Extract the final values of the parameters.
final_params = value(unflatten(training_results.minimizer))

# Compare the data to the inferred posterior.
f = build_gp(final_params)
fx = posterior(f(x, final_params.v), y)
plot(x, y)
plot!(x, fx(x, final_params.v))
```

The problem remains that I still cannot infer the length scale. As a better workaround than using the fix above, one could implement the analytically derived adjoints, as in Arno's implementation in GPStuff. What do you think?
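As a quick numerical sanity check on the stationary coefficients discussed above (a cross-check in Python, not part of the original Julia code): since $e^z = I_0(z) + 2\sum_{j\ge 1} I_j(z)$, the coefficients should sum to 1 as the truncation order $J$ grows. A hedged sketch using SciPy's exponentially scaled Bessel function `ive`:

```python
import numpy as np
from scipy.special import ive  # ive(j, z) == iv(j, z) * exp(-z) for real z > 0

def stationary_coeffs(l, J):
    """Per-harmonic stationary variances q_j^2 of the approximate periodic kernel."""
    z = l ** -2
    # q_0^2 = I_0(z) e^{-z};  q_j^2 = 2 I_j(z) e^{-z} for j >= 1
    return np.array([(2.0 if j > 0 else 1.0) * ive(j, z) for j in range(J + 1)])

q = stationary_coeffs(1.0, 15)
print(q.sum())  # approaches 1 as J grows
```

This also makes the truncation error easy to inspect: `1 - q.sum()` is exactly the stationary variance lost by cutting the series at `J`.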
Ahh okay -- this sounds like the kind of problem that I'd expect. We should be getting a fix for this soon.

I think this would be an excellent idea for now! Are you familiar with doing this using

edit: also, sorry for the slow response time!
no rush :) I added the following analytical derivatives:

```julia
using ChainRulesCore

q0(l) = besseli(0, l^-2) / exp(l^-2)
δq0_δl(l) = l^-3 / exp(l^-2) * (
    2 * besseli(0, l^-2) -
    2 * besseli(1, l^-2)
)
@scalar_rule(q0(x), δq0_δl(x))

qj(l, j) = 2 * exp(-l^-2) * besseli(j, 1/l^2)
δqj_δl(l, j) =
    4besseli(j, 1/l^2) / exp(l^-2) / l^3 -
    2(
        besseli(j-1, 1/l^2) +
        besseli(j+1, 1/l^2)
    ) / exp(l^-2) / l^3
@scalar_rule(qj(x, j), (δqj_δl(x, j), DoesNotExist()))
```

and checked the gradients against ForwardDiff and FiniteDifferences. At this level everything seems fine; however, the following gradient with respect to the length scale fails:

```julia
const x = RegularSpacing(0., 0.01, 1000)
const y = rand(to_sde(GP(ApproxPeriodicKernel()))(x))

function objective(θ)
    k = ApproxPeriodicKernel(r=θ)
    f = to_sde(GP(k), SArrayStorage(Float64))
    -logpdf(f(x), y)
end

Zygote.gradient(objective, 1.)
```

As before, it references the same list comprehension in

```julia
function TemporalGPs.stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2 * (J + 1)
    q = [j > 0 ? qj(k.l, j) : q0(k.l) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D,T}(zeros(T, D))
    Ps = SMatrix{D,D,T}(P)
    return TemporalGPs.Gaussian(m, Ps)
end
```

I will try and dig into the issue these days.
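For what it's worth, the analytical derivative of `q0` can be cross-checked outside Julia as well. Here is a hedged Python sketch (function names mirror the Julia snippet above but are otherwise hypothetical; the check itself is just SciPy plus a central finite difference):

```python
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind

def q0(l):
    # q0(l) = I_0(l^-2) / exp(l^-2)
    return iv(0, l ** -2) * np.exp(-(l ** -2))

def dq0_dl(l):
    # Analytical derivative via the chain rule with z = l^-2, dz/dl = -2 l^-3,
    # and d/dz [I_0(z) e^-z] = (I_1(z) - I_0(z)) e^-z:
    z = l ** -2
    return 2.0 * l ** -3 * np.exp(-z) * (iv(0, z) - iv(1, z))

l, h = 1.3, 1e-6
fd = (q0(l + h) - q0(l - h)) / (2 * h)  # central finite difference
print(abs(dq0_dl(l) - fd))
```

The same pattern (swap in `qj` and its derivative) covers the higher harmonics.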
Hey there, I was wondering if there is any progress on this issue (and #64). I tried using the above approach with current versions of all involved libraries (changing `DoesNotExist()` to `NoTangent()`), but am running into
Hiya. Unfortunately I've not had time to work on this in the recent past, and I'm not entirely sure when I'm going to have the time in the near future.

I am planning to work on this issue once #90 is finished
Hi @willtebbutt,

From the ToDo list #59 I have tried to go ahead with the periodic kernel as proposed by Solin & Särkkä (2014). The periodic kernel is approximated by a Taylor series, and I imagine there are multiple ways to implement the idea. One is to implement the approximated periodic kernel directly as a `SimpleKernel`, but here I chose to build it up from a sum of elementary oscillators (I am absolutely not sure which approach is more efficient, though). Also, I assumed that the kernel takes only a one-dimensional input, as done in the paper, but probably this can be extended. Finally, I also had to modify

```julia
lgssm_components(k::KernelSum ...)
```

in order to allow for more than two kernel components.

A simple usage example:

However, I ran into an issue with Zygote again when calculating the gradient:
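The "sum of elementary oscillators" construction can be illustrated outside Julia too. The following Python sketch (names `J`, `w0`, `F`, `H` mirror the Julia snippets in this thread; this is an illustration, not TemporalGPs' actual code) builds the block-diagonal transition generator and checks that its eigenfrequencies are the harmonics of the base frequency:

```python
import numpy as np

J = 3            # degree of approximation: harmonics 0..J
w0 = 2 * np.pi   # base angular frequency of the periodic kernel

# Each harmonic j contributes a 2x2 rotation generator with rate j * w0.
F = np.kron(np.diag(np.arange(J + 1)), np.array([[0.0, -w0], [w0, 0.0]]))
# The observation picks out the first component of every oscillator block.
H = np.kron(np.ones((1, J + 1)), np.array([[1.0, 0.0]]))

# Eigenvalues of F are +/- i * j * w0: pure oscillations at each harmonic.
freqs = np.sort(np.abs(np.linalg.eigvals(F).imag))
print(freqs)
```

Summing the output of the oscillator blocks via `H` then reproduces the truncated cosine series of the periodic kernel.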