Optimization after refactoring to AD in 0.30 - example failing #40
@pavhofman Thanks for the good point!

```rust
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD::from(t))
            .map(|t| t.pow(n[0]))
            .collect()
    )
}
```

But because of the `AD` type, we can optimize it with the function below:

```rust
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD1(t, 0f64))
            .map(|t| t.pow(n[0]))
            .collect()
    )
}
```

BTW, which syntax seems better? If you choose the second one or suggest any other idea, then I'll implement it in the next version.

```rust
// First one - Current Version (`0.30.6`)
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD1(t, 0f64))
            .map(|t| t.pow(n[0]))
            .collect()
    )
}

// Second one - Candidate for a future version
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| t.to_ad1())
            .map(|t| t.pow(n[0]))
            .collect()
    )
}
```
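For context on why the direct `AD1` construction is cheaper: as far as I can tell from the constructors used in this thread (`AD0(…)`, `AD1(…, …)`), peroxide 0.30's `AD` is an enum tagged by derivative order, so a value built with `AD::from(t)` presumably starts at the wrong order and must be promoted when combined with the `AD1` parameters, while `AD1(t, 0f64)` starts at the right order. A sketch of the assumed layout, not copied from the crate source:

```rust
// Sketch (assumption inferred from this thread's constructors,
// not the authoritative definition): an order-tagged dual-number enum.
pub enum AD {
    AD0(f64),           // value only
    AD1(f64, f64),      // value + first derivative
    AD2(f64, f64, f64), // value + first and second derivatives
}
```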
Hi, thank you very much for the very fast fix. I will try it tomorrow.

As for the choices - I would probably prefer the `t.to_ad1()` version, as it does not require any extra constant in the API.

I am a newbie to Rust. Today I tried hard (and always failed) to write a closure which would define the optimized function in one place - to be used both for calculating the measured or precise values and for running the optimization - `t.pow(n[0])` in your example. So far all of my optimization uses (in Octave/Matlab) involved other variables too, which means a closure instead of a function (the same in Octave too). Would it be possible to consider an option or API for using one function/closure both for the `// Noise to image` step and for `fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>>` - just the part inside the iteration? Often the optimized functions are very long, and having them in two places is a bit inconvenient and error-prone.

Thanks a lot for your great effort. With regards, Pavel.
Thanks for the great suggestion. I have also felt the inconvenience of optimization in peroxide. In fact, defining the optimized function in one place can be achieved with a generic function:

```rust
#[macro_use]
extern crate peroxide;
use peroxide::fuga::*;

fn main() {
    // ...
    // Noise to image
    let mut y = x.fmap(|t| pow_fn(t, &vec![2f64])); // <---- use pow_fn
    // ...
    // Optimizer setting
    let mut opt = Optimizer::new(data, pow_optim);
    // ...
}

// One generic function (`Real` = `f64` or `AD`)
fn pow_fn<T: Real>(x: T, n: &Vec<T>) -> T {
    x.pow(n[0])
}

// Boilerplate to use the optimizer
fn pow_optim(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD1(t, 0f64))
            .map(|t| pow_fn(t, &n)) // <---- use pow_fn
            .collect()
    )
}
```

But it still seems uncomfortable because of the boilerplate, so the API should be improved. Maybe the next code seems more comfortable:

```rust
// Candidate
fn main() {
    // ...
    // Optimizer setting
    let mut opt = Optimizer::new(data, pow_fn_optim);
}

#[optimize]
fn pow_fn<T: Real>(x: T, n: &Vec<T>) -> T {
    x.pow(n[0])
}
```

I'll implement the above functionality in the next version (maybe …). Thank you once again for the great suggestions! 😄
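Presumably the proposed `#[optimize]` attribute would generate the wrapper boilerplate automatically. A hypothetical expansion for illustration only - the generated name `pow_fn_optim` is taken from the candidate code above, but the exact output is an assumption, since the macro does not exist yet:

```rust
// Hypothetical expansion of `#[optimize]` on `pow_fn`: the same
// boilerplate written by hand above, generated under a derived name.
fn pow_fn_optim(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD1(t, 0f64))
            .map(|t| pow_fn(t, &n))
            .collect()
    )
}
```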
Hi, thanks a lot for your great support. The updated example works very well now. I am just afraid the move to AD has introduced some extra wrapping in the critical path which affects performance. This is a comparison of your example in 0.29 and 0.30 for the 100 points as in the example (time measured by the measure_time crate, just for the optimization run + printout):

```rust
{
    print_time!("{:?}", "Regression 0.30");
    let p = opt.set_init_param(n_init)
        .set_max_iter(200)
        .set_method(LevenbergMarquardt)
        .optimize();
    p.print();               // Optimized parameter
    opt.get_error().print(); // Optimized RMSE
}
```
And for 10,000 points (`let mut x = seq(0., 99., 0.01f64);`):
Please, is there any way to do something about the performance regression? I am afraid it is a major slowdown, especially for the shorter runs. My application is embedded/soft real-time and every ms counts, hence the choice of Rust :-)

Thank you very much for your suggestion of the generic function. IIUC there are no generic closures in Rust. The problem is that all my optimization functions require additional (fixed, non-optimized) parameters, which I do not know how to pass without using a closure instead of a function.

I am sorry for posting so many questions :-) Thanks a lot for your patience. Best regards, Pavel.
Maybe there are four ways to achieve high performance.

```rust
// Current version
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| pow_temp(t, n[0]))
            .collect()
    )
}

#[inline]
fn pow_temp(x: f64, y: AD) -> AD {
    AD1(x.powf(y.x()), x.powf(y.x()) * x.ln() * y.dx())
}
```

For the next version, you will be able to use the code below:

```rust
// Future version
fn quad(x: &Vec<f64>, n: Vec<AD>) -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| t.pow(n[0])) // <-- Direct computation
            .collect()
    )
}
```
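For reference, the manual `pow_temp` above encodes the chain rule for a power with a constant base: for $f(s) = x^{y(s)}$,

$$f'(s) = x^{y(s)} \ln x \cdot y'(s),$$

which is exactly what `AD1(x.powf(y.x()), x.powf(y.x()) * x.ln() * y.dx())` computes (the value, then the first derivative).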
Thanks for the great benchmark. The results have alarmed me; I will think about performance more seriously.
Thanks! Regarding the Jacobian - would it be possible to enter the Jacobian matrix externally, just like the optimized function, i.e. as a closure? From what I have seen, most engines allow that - e.g. `optimset('dfdp', fDfdp)` in Octave, `gradient` in optimization-engine https://github.com/alphaville/optimization-engine/blob/master/examples/panoc_ex1.rs#L30, etc. Automated calculation as the default, but with an option of a user-provided function/closure.

From my POV the compilation time is not critical, but runtime is. Cross-compiled on a powerful PC, running on an embedded ARM. Other users may have it reversed, but probably not.

I think I found a way to make the quad function a closure and use it both for creating the measured values and for running the optimization. The one-time conversion from AD when generating `y` is no overhead:

```rust
let gain = 5.0;
let quad_closure = |x: &Vec<f64>, n: Vec<AD>| -> Option<Vec<AD>> {
    Some(
        x.clone().into_iter()
            .map(|t| AD1(t, 0f64))
            .map(|t| gain * t.pow(n[0]))
            .collect()
    )
};

// Noise to image
let n = vec![AD0(2.0)];
let mut y: Vec<f64> = quad_closure(&x, n).unwrap().into_iter().map(|t| t.x()).collect();
// ...
let mut opt = Optimizer::new(data, quad_closure);
```
Actually, I am comparing re-compilation times of my example in v0.28.1 and v0.30.6 and do not notice any difference. The crate itself is compiled only once; no problem waiting a bit, IMO.
As for the iteration runs - some engines have an optional parameter (with some tiny default) for a minimum change ratio of the parameters between iterations (e.g. less than 0.1%, or even as a vector with a value for each parameter); if an iteration produces a change in all parameters below their minimum change ratio, it stops. Typically the convergence is much faster than 50 runs - below 20 in your example. Perhaps that could be a way towards a major speedup. Estimating the required number of iterations upfront is difficult, IMO. Sorry for taking so much of your time :-)
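A minimal standalone sketch of such a relative-change stopping rule - not peroxide's API; the function and parameter names here are illustrative:

```rust
/// Stop when every parameter changed by less than its relative
/// tolerance between two successive iterations.
fn converged(prev: &[f64], curr: &[f64], rel_tol: &[f64]) -> bool {
    prev.iter().zip(curr).zip(rel_tol).all(|((&p, &c), &tol)| {
        // Guard against division by zero for parameters near zero.
        let scale = p.abs().max(f64::EPSILON);
        ((c - p) / scale).abs() < tol
    })
}

fn main() {
    let prev = [1.9800, 0.5200];
    let curr = [1.9802, 0.5201];
    let tol = [0.001, 0.001]; // 0.1% per parameter
    println!("converged: {}", converged(&prev, &curr, &tol)); // true
}
```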
I have reread my posts - it sounds like I am just requesting features instead of sending patches. I have only just started with Rust, but I can try adding e.g. the minimum-change option and send a PR for review/rework, if you agree. Please just let me know when your planned changes are done so that I can start with the latest code. Thanks a lot, I very much appreciate your kind help and effort.
Good suggestion! In a future version, a manual gradient function (or closure) can be allowed. (Unfortunately, it's impossible in the current version.)
That is because of the implementation. Until …
Oh, I forgot the implementation of a stop condition. I already implemented that feature in `ode`, which has a similar API. That is a good point!
Don't be sorry; these discussions are also meaningful to me 😄
Just requesting features is also helpful enough for peroxide. Thanks for your great interest & suggestions! 👍
Hi, thanks a lot for the great library.
I tried running the example in optimization.rs in 0.30.6 and it does not optimize, keeping the initial value in the `p` output. Using 0.29.1, its example (i.e. before the refactoring from numbers to AD) works OK.
Thanks a lot,
Pavel.