Make (Log)NoisyExpectedImprovement create a correct fantasy model with non-default SingleTaskGP #2414
Conversation
Thanks so much for reporting this issue and for doing so much of the work towards fixing it! I'm not sure the handling of transforms is right -- see comments. And if you don't have time to get this PR past the finish line yourself, just let us know.
Hi, thanks for reviewing my pull request. I don't know what you mean by "see comments" -- I don't see any comments. I think I won't have the time or ability to finish this any time in the near future.

Ah, I'm sorry, I failed to submit my comments yesterday. No worries if you don't have time to work on this; I just wanted to check on what your intentions are. I think we'd want to test combining transforms with (Log)NEI by comparing against the behavior without transforms. For example, if we evaluate LogNEI on a …
I fixed the problems now and all the tests pass, so I think it is ready to be committed to main.

@esantorella There are still conflicts. I'm not sure how to merge them correctly, though (either through GitHub or locally on the command line). Any help would be appreciated!
test/acquisition/test_analytic.py (Outdated)

```python
# Same as the default Matern kernel,
# botorch.models.utils.gpytorch_modules.get_matern_kernel_with_gamma_prior,
# except RBFKernel is used instead of MaternKernel.
# For some reason, RBF gives numerical problems but Matern does not.
```
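For context, this is roughly what the comment describes -- BoTorch's default Matérn module with `RBFKernel` swapped in (a sketch; the priors are copied from `get_matern_kernel_with_gamma_prior`, and `ard_num_dims=2` is illustrative):

```python
from gpytorch.kernels import RBFKernel, ScaleKernel
from gpytorch.priors.torch_priors import GammaPrior

# Same priors as BoTorch's default Matern module, but with an RBF base kernel.
covar_module = ScaleKernel(
    RBFKernel(ard_num_dims=2, lengthscale_prior=GammaPrior(3.0, 6.0)),
    outputscale_prior=GammaPrior(2.0, 0.15),
)
```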
Which numerical problems did you run into? The RBF kernel is generally smoother than the Matérn kernel, meaning its eigenspectrum decays faster, which implies that the associated covariance matrices are more likely to be numerically low-rank than for the Matérn. Evaluating the kernel on fewer points, or alternatively decreasing its lengthscale, should make this type of numerical issue go away.
You mean increasing the lengthscale, right? But in any case, I think that would be too much of a pain. Firstly, `NEI_NOISE` is fixed at 10 values, so it's not straightforward to decrease the number of points. And secondly, the lengthscale really matters for making the tests work; I assume whoever wrote this test knew what they were doing, and I don't want to arbitrarily fudge the lengthscale too much, because I saw that doing so can make the tests fail. Why not just keep it how it is?
How about this, if we really don't want those numerical warnings: test Matern on both float32 and float64, but only test RBF on float64?
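One way to express that split, sketched in pytest style (BoTorch's own suite uses unittest; the test body and names here are placeholders):

```python
import pytest
import torch

@pytest.mark.parametrize(
    "kernel_name, dtype",
    [
        ("matern", torch.float32),
        ("matern", torch.float64),
        ("rbf", torch.float64),  # skip RBF in single precision
    ],
)
def test_noisy_ei_kernels(kernel_name, dtype):
    # Build the model with the given kernel and dtype, then evaluate (Log)NEI.
    ...
```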
> You mean increasing the lengthscale right?
Increasing the lengthscale will tend to exacerbate the numerical ill conditioning of the RBF kernel matrix, decreasing the lengthscale will help it. You can think of the lengthscale as controlling the potential complexity of the function, and functions with larger lengthscales are smoother, i.e. less complex, than functions with shorter lengthscales, which can exhibit many more variations. This in turn shows up as an increase in the numerical rank for more complex functions, which helps the conditioning.
In the most extreme case, a lengthscale of zero would imply that all points are independent, i.e. the covariance would be diagonal, and in the case of the unscaled RBF kernel, it'd be the identity, which has perfect conditioning.
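To illustrate the point numerically (a standalone numpy sketch, not part of the test code), the condition number of an unscaled RBF Gram matrix grows rapidly with the lengthscale:

```python
import numpy as np

def rbf_gram(x, lengthscale):
    # Unscaled RBF kernel matrix on a 1-d grid of points.
    sq_dists = (x[:, None] - x[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

x = np.linspace(0.0, 1.0, 20)
for ls in (0.05, 0.2, 1.0):
    print(f"lengthscale={ls}: cond={np.linalg.cond(rbf_gram(x, ls)):.2e}")
```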
> but only test RBF on float64
Sure!
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff            @@
##             main    #2414   +/-   ##
========================================
  Coverage   99.98%   99.98%
========================================
  Files         189      189
  Lines       16685    16691       +6
========================================
+ Hits        16683    16689       +6
  Misses          2        2
========================================
```

☔ View full report in Codecov by Sentry.
@SebastianAment has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This looks good to me, thanks! I just imported the PR, which will run some additional integration tests. Two remaining items to get this in:

@SebastianAment This PR used to be in draft mode, but I don't see anything on this page about it being in draft mode; it looks to me like it's not in draft mode anymore.

Yes, same here; maybe it was changed automatically by the import.
```python
# Could pass in the outcome_transform and input_transform here,
# however that would make them be applied in SingleTaskGP.__init__, which is
# unnecessary. So we will instead set them afterwards.
fantasy_model = SingleTaskGP(
```
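A hedged sketch of the pattern that comment describes -- construct the model without transforms so `SingleTaskGP.__init__` does not apply them, then attach the originals afterwards (`fantasy_X` and `fantasy_Y` are placeholder names):

```python
from botorch.models import SingleTaskGP

# Build the fantasy model on already-transformed data, then copy over the
# original model's transforms so future calls apply them consistently.
fantasy_model = SingleTaskGP(train_X=fantasy_X, train_Y=fantasy_Y)
if hasattr(model, "input_transform"):
    fantasy_model.input_transform = model.input_transform
if hasattr(model, "outcome_transform"):
    fantasy_model.outcome_transform = model.outcome_transform
```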
This change would allow for supporting more model types. However, @saitcakmak had a good question: why do we need `_get_noiseless_fantasy_model` at all? Can we use the `fantasize` method on the model instead? I'm a bit afraid of the change I'm suggesting, since this instantiation logic won't be right for every model.
```diff
- fantasy_model = SingleTaskGP(
+ fantasy_model = cls(model)(
```
Yeah, it'd be a great simplification (and removal of duplicate logic) if we could simply use `model.fantasize(...)` rather than defining a custom `_get_noiseless_fantasy_model`.
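For reference, a sketch of what that alternative might look like -- `fantasize` is the model's own method, though the sampler choice and `X_observed` here are placeholders:

```python
import torch
from botorch.sampling.normal import SobolQMCNormalSampler

# Draw fantasy observations at the observed inputs via the model itself.
sampler = SobolQMCNormalSampler(sample_shape=torch.Size([16]))
fantasy_model = model.fantasize(X_observed, sampler=sampler)
```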
I think you mean to write `model.__class__(...)` or `type(model)(...)`; `cls(model)` is not a thing in Python.
But anyway, how about just not making that change, since the model is currently assumed to be a `SingleTaskGP` anyway? If we can find a way to use `fantasize` for a wider variety of models later, then we can just do that.
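A minimal sketch of the distinction: `type(obj)` (equivalently `obj.__class__`) retrieves an instance's class and can be called to construct a new one, whereas `cls` exists only as the bound parameter of a classmethod:

```python
class Model:
    @classmethod
    def make(cls):
        # cls is bound to the class only inside a classmethod.
        return cls()

m = Model()
assert type(type(m)()) is Model      # re-instantiate via type(...)
assert type(m.__class__()) is Model  # equivalent spelling
assert type(Model.make()) is Model   # cls works here, not outside
```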
Thanks! This looks good to me. I checked the acquisition values with and without transforms on the model here and see that the transforms are being handled correctly -- you can't see all the (Log)NEI lines here because they're all on top of each other. And they're all reasonably close to qLogNEI, which hasn't changed. One thing I don't quite understand is why fantasization needs to happen this way instead of using the model's `fantasize` method.

[plot: (Log)NEI and qLogNEI acquisition values with and without transforms]
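The comparison described above amounts to evaluating the acquisition function on each model variant, e.g. (a sketch; `model`, `train_X`, and `test_X` are placeholders):

```python
from botorch.acquisition.analytic import LogNoisyExpectedImprovement

# Evaluate LogNEI at candidate points; repeat with and without transforms
# on the model and compare the resulting values.
log_nei = LogNoisyExpectedImprovement(model=model, X_observed=train_X)
values = log_nei(test_X.unsqueeze(-2))  # analytic acqfs expect a q=1 dim
```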
botorch/acquisition/analytic.py (Outdated)

```diff
@@ -608,7 +607,7 @@ class LogNoisyExpectedImprovement(AnalyticAcquisitionFunction):

     def __init__(
         self,
-        model: GPyTorchModel,
+        model: SingleTaskGP,
```
Although it's a very ugly hack, I'd suggest leaving this as `GPyTorchModel` and raising an `UnsupportedError` when the model is something other than a `SingleTaskGP`, because changing the annotation will create a lot of downstream typecheck errors.
OK yeah, that's what's currently done in `_get_noiseless_fantasy_model`. Which is, like you say, hacky, since it says that the type is `GPyTorchModel`.
This makes me think of another thing: `SingleTaskGP` can be multi-outcome, but in `LogNoisyExpectedImprovement` and `NoisyExpectedImprovement` it is assumed to be single-outcome, and there is no explicit check for this.
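For concreteness, such a check could look like the following sketch (`num_outputs` is the BoTorch model property for the number of outcomes):

```python
from botorch.exceptions.errors import UnsupportedError

# Explicitly reject multi-output models rather than failing downstream.
if model.num_outputs != 1:
    raise UnsupportedError(
        "Only single-output models are supported by (Log)NoisyExpectedImprovement."
    )
```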
But then again, the other classes in `analytic.py`, and probably most of the BoTorch code, don't explicitly check that the models are single-outcome, even where the docstring says they should be. I guess some coding styles don't always do explicit checks in the code, so it's not like this is a necessity...
…odel parameter type clearer in description of model parameter in addition to the current note; added explicit check to make sure that the model is single-outcome
Co-authored-by: Elizabeth Santorella <elizabeth.santorella@gmail.com>
@SebastianAment has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@SebastianAment merged this pull request in 25506ab.
Motivation
In `botorch/acquisition/analytic.py`, `LogNoisyExpectedImprovement` and `NoisyExpectedImprovement` use the function `_get_noiseless_fantasy_model` to repeatedly sample from a fantasy model. But `_get_noiseless_fantasy_model` only works for the default GP (i.e., with the default Matern kernel) and with no input or outcome transforms. I think it would make sense if this code were written to work with any kind of `SingleTaskGP`, not just the default one with no input and outcome transforms.
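For example, a non-default `SingleTaskGP` of the kind this PR targets might look like this (a sketch using BoTorch's public API; data shapes and values are illustrative):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.models.transforms import Normalize, Standardize
from gpytorch.kernels import RBFKernel, ScaleKernel

train_X = 5.0 * torch.rand(10, 2, dtype=torch.float64)  # inputs not in [0, 1]
train_Y = train_X.sum(dim=-1, keepdim=True)
model = SingleTaskGP(
    train_X,
    train_Y,
    covar_module=ScaleKernel(RBFKernel(ard_num_dims=2)),  # RBF, not Matern
    input_transform=Normalize(d=2),
    outcome_transform=Standardize(m=1),
)
```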
Have you read the Contributing Guidelines on pull requests?
Yes
Test Plan
Since the code is now meant to work even with input or outcome transforms, or a different covar_module or mean_module, I updated the test code to try all of these, as well as different input bounds, to make sure the input transform is working correctly. However, the tests now fail, specifically when either the input data range is not [0, 1] or when the kernel is the RBF kernel (not Matern). I believe the tests failing under RBF is simply because certain constants used in the code are only valid for particular GP settings. However, the failure when the input range is not [0, 1] might point to a slight problem with the code -- or might not; I'm not completely sure.
I also added a line that makes sure the `state_dict()` is the same between the original model and the fantasy model.
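That check amounts to something like the following sketch (`model` and `fantasy_model` are the original and fantasy models):

```python
import torch

# Every parameter and buffer in the fantasy model should match the original.
for key, value in model.state_dict().items():
    assert torch.equal(value, fantasy_model.state_dict()[key]), key
```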
Related PRs
I described this in issue #2412 and was told that it is OK if not all the tests pass.