MCEI + DGP sampler #393
Conversation
trieste/acquisition/function.py
Outdated
tf.debugging.assert_positive(len(dataset))
samples_at_query_points = sampler.sample(dataset.query_points)
I wonder whether it makes sense to have one sample_size parameter for computing eta, where precision might matter more, and one for sampling in mc_ei.
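A minimal sketch of what that split could look like, assuming a sampler factory that takes a sample size (the names make_sampler, eta_sample_size, and mc_sample_size are hypothetical, not trieste API):

```python
import tensorflow as tf

def make_mc_ei(make_sampler, dataset, eta_sample_size: int = 512, mc_sample_size: int = 64):
    # Hypothetical sketch: a larger sample budget where precision matters
    # more (estimating eta), a smaller one inside the acquisition function.
    eta_sampler = make_sampler(eta_sample_size)
    mean = tf.reduce_mean(eta_sampler.sample(dataset.query_points), axis=0)  # [N, 1]
    eta = tf.reduce_min(mean, axis=0)  # current best observation (minimisation)

    mc_sampler = make_sampler(mc_sample_size)

    def mc_ei(at):
        samples = mc_sampler.sample(at)  # [S, N, 1]
        return tf.reduce_mean(tf.maximum(eta - samples, 0.0), axis=0)  # [N, 1]

    return mc_ei
```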
I think this is risky: for small sample sizes (which would be typical), it may be possible to end up with a case where the expected improvement is zero everywhere, due to the different samples being used for eta and for the improvement. @henrymoss do you have any thoughts?
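A toy illustration of that failure mode (numbers invented): if eta happens to be estimated optimistically from its own small sample set, the improvement can vanish at every candidate point.

```python
import tensorflow as tf

eta = tf.constant([0.1])                            # min estimated from separate eta samples
samples = tf.constant([[[0.3]], [[0.5]], [[0.4]]])  # [S=3, N=1, 1] independent MC samples
improvement = tf.maximum(eta - samples, 0.0)        # zero for every sample
print(tf.reduce_mean(improvement, axis=0))          # [[0.]] -> flat, gradient-free acquisition
```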
trieste/acquisition/function.py
Outdated
eta = tf.reduce_min(mean, axis=0)
def mc_ei(at: TensorType) -> TensorType:
For consistency with the rest of the code in function.py, should we move this function outside of the class?
I've tried this now; if you could have a look, that would be great!
left a lot of comments about this...
trieste/acquisition/function.py
Outdated
return tf.reduce_mean(improvement, axis=0)  # [N, 1]
return mc_ei
Looking at batch_monte_carlo_expected_improvement, it does define the update function to avoid retracing; it seems it should be straightforward to emulate this here - @uri-granta what do you think?
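For reference, a sketch of the pattern being suggested: keep the compiled function fixed and push new state (here eta) into a tf.Variable via an update method, so an optimiser that wraps the acquisition in a tf.function does not retrace it. The class name and body are illustrative, not the actual batch_monte_carlo_expected_improvement code.

```python
import tensorflow as tf

class mc_ei_sketch:
    def __init__(self, sampler, eta: tf.Tensor):
        self._sampler = sampler
        self._eta = tf.Variable(eta)  # mutable state, safe to update in place

    def update(self, eta: tf.Tensor) -> None:
        # Assigning to the Variable changes the value without rebuilding
        # (and hence without retracing) the traced __call__ below.
        self._eta.assign(eta)

    @tf.function
    def __call__(self, at: tf.Tensor) -> tf.Tensor:
        samples = self._sampler.sample(at)  # [S, N, 1]
        return tf.reduce_mean(tf.maximum(self._eta - samples, 0.0), axis=0)  # [N, 1]
```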
I'm not sure this is really possible, since we want a different sampler each time we acquire a new point, but I'm not really an expert on retracing. Any thoughts?
I think @henrymoss and @uri-granta managed to find a way around this with the new design (the self._initialized variable and reset_sampler method in ReparametrizationSampler).
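A sketch of that design as I understand it (the attribute and method names follow the comment above; the body is assumed, not copied from trieste):

```python
import tensorflow as tf

class reparametrization_sampler_sketch:
    def __init__(self, sample_size: int, model):
        self._model = model
        self._sample_size = sample_size
        self._initialized = tf.Variable(False)  # base samples not yet drawn
        self._eps = tf.Variable(
            tf.zeros([sample_size, 1], dtype=tf.float64), shape=[sample_size, None]
        )

    def reset_sampler(self) -> None:
        # Mark the base samples stale; sample() will redraw them in place,
        # so a tf.function closing over this object need not be retraced.
        self._initialized.assign(False)

    def sample(self, at: tf.Tensor, jitter: float = 1e-6) -> tf.Tensor:
        mean, var = self._model.predict(at)  # [N, L] each
        if not self._initialized:
            self._eps.assign(
                tf.random.normal([self._sample_size, tf.shape(mean)[-1]], dtype=tf.float64)
            )
            self._initialized.assign(True)
        return mean + tf.sqrt(var + jitter) * self._eps[:, None, :]  # [S, N, L]
```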
My impression there is that it relies on the model not changing between updates, whereas here the reparam sampler (which takes the place of the model) does change, but I'm not entirely sure about this. It would be good to have some input on the retracing side of things.
But I'll have another look!
sampler = function._sampler  # type: ignore
sampler.reset_sampler()
samples_at_query_points = sampler.sample(dataset.query_points, jitter=self._jitter)
mean = tf.reduce_mean(samples_at_query_points, axis=0)
Do you know why, in BatchMonteCarloExpectedImprovement, the predict call is used for computing the mean?
I think BatchMonteCarloExpectedImprovement assumes that the mean can be computed analytically. Here it would probably be dangerous to do that, since for e.g. a DGP the predict call will return a sample, which wouldn't even necessarily be smooth.
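In code, the distinction discussed above (a sketch; model and sampler are assumed to wrap the same posterior):

```python
import tensorflow as tf

def monte_carlo_mean(sampler, query_points: tf.Tensor) -> tf.Tensor:
    # For a GP, model.predict(query_points) would give the exact mean.
    # For a DGP, predict may itself be sample-based and non-smooth, so the
    # PR averages reparametrization samples instead:
    samples = sampler.sample(query_points)  # [S, N, 1]
    return tf.reduce_mean(samples, axis=0)  # [N, 1]
```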
Co-authored-by: Hrvoje Stojic <hrvoje.stojic@protonmail.com>
some minor comments, but otherwise looks ready to go
Co-authored-by: Hrvoje Stojic <hrvoje.stojic@protonmail.com>
…into sebastian/mcei
No description provided.