You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm using pystan version 2.9.0. I have a model based on a truncated incomplete gamma distribution.
When running optimizing(), the results depend very heavily on which seed I choose, and it seems to have something to do with the boundary values. That is, for one seed (121010), the optimization terminates after 2 iterations with all parameters at their boundaries (there are four parameters). I was firstly getting "Convergence detected: gradient norm is below tolerance", then when I set tol_grad=0, I got "relative gradient magnitude is below tolerance" and then when I set tol_rel_grad=0 I got "absolute change in objective function was below tolerance". I fear setting anything more to zero will mean there's nothing to terminate the iterations.
When I set the algorithm to "Newton" it works (doesn't seem to hit the boundaries). When setting to "BFGS" it has the same outcome as LBFGS.
When using seed of 1234 (without any of those options specified), the result comes out nicely.
I've verified that the lp__ values are correct at several of the iterations (modulo constant).
The LBFGS method seems to work every time when using the function from scipy.
I can list my stan code:
functions {
/**
* gammainc() is the upper incomplete gamma function (not regularized)
* @param real a, shape parameter
* @param real x, position > 0
* @returns the non-regularized incomplete gamma
* NOTES: uses a recursion relation to calculate values for negative a.
*/
real gammainc(real a, real x){
int n;
real ap1;
real ssum;
if(a>=0) return gamma_q(a,x) * tgamma(a);
ap1 <- a+1;
//Get floor(-a)
n<-0;
while(n<-a){
n <- n+1;
}
//Get summed part
{
vector[n] sums;
for(i in 0:n-1) sums[i+1] <- pow(x,i)/tgamma(ap1+i);
ssum <- sum(sums);
}
return tgamma(a)*(gamma_q(a+n,x)-pow(x,a)*exp(-x)*ssum);
}
/**
* truncated_logGGD_log gives the log PDF of a variate whose exponential has a
* lower-truncated generalised gamma distribution.
* @param vector y, variate (should be log10(m/Hs) in the usual TGGD)
* @param real ymin, truncation in y-space.
* @param real alpha, power-law slope
* @param real beta, cut-off parameter
*/
real truncated_logGGD_log(vector y, real ymin, real h, real alpha, real beta, real lnA){
vector[num_elements(y)] x;
real xmin;
real z;
z <- (alpha+1)/beta;
x <- exp(log10()*y*beta);
xmin <- exp(log10()*ymin*beta);
return sum(lnA + log(beta) + log(log10()) + log10()*h + log10()*y*(alpha+1) - x) - exp(lnA)*pow(10,h)*gammainc(z,xmin);
}
}
data {
int<lower=0> N; // number of halos
vector<lower=0>[N] log_m; // measured halo masses
real<lower=0> V; // Volume of the survey
// CONTROLS FOR PARAMETER BOUNDS
real<lower=0> hs_min; // Lower bound of logHs
real<lower=0,upper=20> hs_max; // Upper bound of logHs
real<lower=-2,upper=0> alpha_min; // Lower bound of alpha
real<lower=-2,upper=0> alpha_max; // Upper bound of alpha
real<lower=0> beta_min; // Lower bound of beta
real<lower=0> beta_max; // Upper bound of beta
real lnA_min; // lower bound of lnA
real lnA_max; // upper bound of lnA
}
transformed data {
real<lower=0> log_mmin;
log_mmin <- min(log_m);
}
parameters {
real<lower=hs_min,upper=hs_max> logHs; // Characteristic halo mass
real<lower=alpha_min,upper=alpha_max> alpha; // Power-law slope
real<lower=beta_min,upper=beta_max> beta; // Cut-off parameter
real<lower=lnA_min,upper=lnA_max> lnA; // Normalisation
}
transformed parameters {
real raw_lnA;
raw_lnA <- lnA + log(V);
}
model {
vector[N] y;
real ymin;
y <- log_m-logHs;
ymin <- log_mmin-logHs;
y ~ truncated_logGGD(ymin, logHs, alpha, beta, raw_lnA);
}
from a distribution with Expected Values:
'logHs': 14.294849118059398,
'alpha': -1.868803779752156,
'beta': 0.6976327799172357,
'lnA': -42.58691124225525
I set the bounds as
logHs -- (13.5,15.5)
alpha -- (-1.99,-1.7)
beta -- (0.4,1.2)
lnA -- (-48,-37).
and V is 45794539.9853329.
I was previously setting the bounds to be wider on each parameter, but contracting the support has not solved the problem.
I'm hoping I've provided enough information. Unfortunately I haven't tested if this happens for simpler problems -- so I don't know if it is problem-specific, or a more general problem within Stan. Since it doesn't always happen for my own problem, I would be hesitant to try to randomly find it in a simpler problem.
The text was updated successfully, but these errors were encountered:
Comment by bob-carpenter Thursday Jul 07, 2016 at 01:24 GMT
Thanks for reporting. It doesn't look like anyone has responded. Did the comparison with the other L-BFGS implementation use the same transformations implied by the constrained parameters?
I marked this a feature and a bug, at least until someone does some more investigating.
Comment by steven-murray Thursday Jul 07, 2016 at 01:45 GMT
So for the other L-BFGS implementation, I believe there is no transformation performed. In that case, it seems to evaluate first p0 which you give it, and then explicitly evaluates the edges, then works its way in (sorry I'm not too well acquainted with the exact procedure).
I would love to be more helpful in this (constructing a more minimal WE and doing a proper comparison), but I haven't really got the time at the moment, as is frustratingly often the case. I'm more than happy to trial things for you on my end though.
Comment by bob-carpenter Thursday Jul 07, 2016 at 01:58 GMT
You can try removing the bounds from the Stan program and
initializing at a value in support and see if that works
better. Maybe not if the problem was boundary values.
It'd be interesting to see if the Stan model would be
optimizable by the other program or if it's truly a problem
with our L-BFGS implementation not being robust enough.
So for the other L-BFGS implementation, I believe there is no transformation performed. In that case, it seems to evaluate first p0 which you give it, and then explicitly evaluates the edges, then works its way in (sorry I'm not too well acquainted with the exact procedure).
I would love to be more helpful in this (constructing a more minimal WE and doing a proper comparison), but I haven't really got the time at the moment, as is frustratingly often the case. I'm more than happy to trial things for you on my end though.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
Issue by steven-murray
Friday Jun 17, 2016 at 05:06 GMT
Originally opened as stan-dev/stan#1928
Hi
I'm using pystan version 2.9.0. I have a model based on a truncated incomplete gamma distribution.
When running optimizing(), the results depend very heavily on which seed I choose, and it seems to have something to do with the boundary values. That is, for one seed (121010), the optimization terminates after 2 iterations with all parameters at their boundaries (there are four parameters). I was firstly getting "Convergence detected: gradient norm is below tolerance", then when I set
tol_grad=0
, I got "relative gradient magnitude is below tolerance" and then when I settol_rel_grad=0
I got "absolute change in objective function was below tolerance". I fear setting anything more to zero will mean there's nothing to terminate the iterations.When I set the algorithm to "Newton" it works (doesn't seem to hit the boundaries). When setting to "BFGS" it has the same outcome as LBFGS.
When using seed of 1234 (without any of those options specified), the result comes out nicely.
I've verified that the lp__ values are correct at several of the iterations (modulo constant).
The LBFGS method seems to work every time when using the function from scipy.
I can list my stan code:
I am using data with ~2e5 variates:
samples.txt
from a distribution with Expected Values:
'logHs': 14.294849118059398,
'alpha': -1.868803779752156,
'beta': 0.6976327799172357,
'lnA': -42.58691124225525
I set the bounds as
logHs -- (13.5,15.5)
alpha -- (-1.99,-1.7)
beta -- (0.4,1.2)
lnA -- (-48,-37).
and V is 45794539.9853329.
I was previously setting the bounds to be wider on each parameter, but contracting the support has not solved the problem.
I'm hoping I've provided enough information. Unfortunately I haven't tested if this happens for simpler problems -- so I don't know if it is problem-specific, or a more general problem within Stan. Since it doesn't always happen for my own problem, I would be hesitant to try to randomly find it in a simpler problem.
The text was updated successfully, but these errors were encountered: