Add Posterior Standard Deviation acquisition function #2060
Conversation
Hi @pjpollot! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Thanks for the PR! Are there any results on / benchmarks for this acquisition function, or suggestions on how best to use it? I want to make sure that everything we expose in the core package has been evaluated rigorously, and that users can have (or at least develop, e.g. from reading papers) an understanding of if and when to use a particular acquisition function. Also, it looks like a bunch of auto-formatting changes snuck into the commit?
@Balandat
No, nothing I could present as of now to promote the use of this acquisition function, but I hope to be able to show you some promising results once my research on the subject is done.
Yes, I just ran the `ufmt` formatting command.
I see. I'd like to see some of these results / pointers before merging this in.
Hmm interesting, maybe there was a recent `ufmt` update? Which version are you using?
Codecov Report
```diff
@@           Coverage Diff            @@
##              main    #2060   +/-  ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          179       179
  Lines        15851     15859    +8
=========================================
+ Hits         15851     15859    +8
```
@Balandat
No problem, I will finish my research first and get back to you with some results!
The ufmt version I am using is 2.3.0.
Pierre-Jean, thanks for the PR!

No need to use your own data. You can just run it on a couple of the synthetic benchmark functions like Hartmann6 already in BoTorch! It's much simpler and more replicable this way anyway. In general I'd highly recommend such a validation pipeline for any new active learning or BO procedures before using them with real data.

E
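For illustration, evaluating one of BoTorch's built-in synthetic functions takes only a few lines; a minimal sketch using `Hartmann` (whose domain is the unit hypercube), with illustrative variable names:

```python
import torch
from botorch.test_functions import Hartmann

# Hartmann6 is defined on [0, 1]^6; sample a few points in its domain.
func = Hartmann(dim=6)
X = torch.rand(5, 6)
print(func(X))  # function values at the 5 sampled points
```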
Main script:

```python
import json
from warnings import filterwarnings

import torch
from botorch.acquisition import PosteriorStandardDeviation
from botorch.fit import fit_gpytorch_model
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf
from botorch.test_functions import Ackley, StyblinskiTang
from botorch.test_functions.synthetic import SyntheticTestFunction
from botorch.utils.transforms import unnormalize
from gpytorch.kernels import MaternKernel, ScaleKernel
from gpytorch.mlls import ExactMarginalLogLikelihood
from torch import Tensor


class Benchmark1D:
    """Wraps a 1D synthetic test function with sampling helpers."""

    def __init__(self, synthetic_function: SyntheticTestFunction) -> None:
        assert synthetic_function.dim == 1, "Benchmark1D is only for 1D functions"
        self._func = synthetic_function
        self._name = synthetic_function.__class__.__name__.lower() + "1d"

    @property
    def name(self) -> str:
        return self._name

    @property
    def bounds(self) -> Tensor:
        return self._func.bounds

    def __call__(self, x: Tensor) -> Tensor:
        return self._func(x).unsqueeze(-1)

    def sample(self, n: int) -> tuple[Tensor, Tensor]:
        z = torch.rand(n, 1)
        x = unnormalize(z, self.bounds)
        return x, self(x)

    def linspace(self, n: int) -> tuple[Tensor, Tensor]:
        z = torch.linspace(0, 1, n).unsqueeze(-1)
        x = unnormalize(z, self.bounds)
        return x, self(x)


def pure_exploration(model: SingleTaskGP, bounds: Tensor) -> Tensor:
    # Query the point where the posterior standard deviation is largest.
    acq_func = PosteriorStandardDeviation(model)
    return optimize_acqf(
        acq_function=acq_func,
        bounds=bounds,
        q=1,
        num_restarts=10,
        raw_samples=512,
    )[0]


def build_train_gp_model(x: Tensor, y: Tensor) -> SingleTaskGP:
    model = SingleTaskGP(
        train_X=x,
        train_Y=y,
        covar_module=ScaleKernel(MaternKernel(nu=1.5)),
    )
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    fit_gpytorch_model(mll)  # fit_gpytorch_mll in newer BoTorch versions
    return model


def to_list(t: Tensor) -> list:
    return t.detach().squeeze().tolist()


def compute_rmse(y_true: Tensor, y_pred: Tensor) -> float:
    # Root mean squared error of the posterior mean on the test grid.
    return (y_true.squeeze() - y_pred.squeeze()).pow(2).mean().sqrt().item()


N_INIT = 2
MAX_SIZE = 100
SYNTHETIC_FUNCTIONS = [
    Ackley(dim=1),
    StyblinskiTang(dim=1, negate=True),
]

if __name__ == "__main__":
    filterwarnings("ignore")
    res = {}
    for function in SYNTHETIC_FUNCTIONS:
        torch.manual_seed(0)
        benchmark = Benchmark1D(function)
        print(benchmark.name)
        x, y = benchmark.sample(N_INIT)
        x_test, y_test = benchmark.linspace(1000)
        res[benchmark.name] = {
            "test_x": to_list(x_test),
            "test_y": to_list(y_test),
            "loop": [],
        }
        while x.shape[0] < MAX_SIZE:
            if x.shape[0] % 10 == 0:
                print(x.shape[0], "/", MAX_SIZE)
            model = build_train_gp_model(x, y)
            x_next = pure_exploration(model, benchmark.bounds)
            y_next = benchmark(x_next)
            # The model is left in eval mode after fitting, so it can be
            # evaluated on the test grid directly; no gradients are needed.
            with torch.no_grad():
                posterior = model(x_test)
                mean = posterior.mean
                rmse = compute_rmse(y_test, mean)
                std = posterior.stddev
                bottom, top = posterior.confidence_region()
            res[benchmark.name]["loop"].append(
                {
                    "x": to_list(x),
                    "y": to_list(y),
                    "x_next": to_list(x_next),
                    "mean": to_list(mean),
                    "rmse": rmse,
                    "std": to_list(std),
                    "bottom": to_list(bottom),
                    "top": to_list(top),
                }
            )
            x = torch.cat([x, x_next])
            y = torch.cat([y, y_next])
    with open("results.json", "w") as f:
        json.dump(res, f, indent=2)
```
Visualization script:

```python
import json
import os

import matplotlib.pyplot as plt

if __name__ == "__main__":
    plt.style.use("ggplot")
    with open("results.json", "r") as f:
        res = json.load(f)
    for benchmark_name, benchmark_res in res.items():
        print(benchmark_name)
        folder = f"graphs/{benchmark_name}"
        os.makedirs(folder, exist_ok=True)
        for i, loop in enumerate(benchmark_res["loop"]):
            fig, (ax1, ax2) = plt.subplots(
                nrows=2, figsize=(5, 10), sharex=True, constrained_layout=True
            )
            fig.suptitle(benchmark_name)
            ax1.plot(
                benchmark_res["test_x"], benchmark_res["test_y"], "--", label="true"
            )
            ax1.plot(benchmark_res["test_x"], loop["mean"], label="predicted")
            # Shade the confidence region behind the predictions.
            ax1.fill_between(
                benchmark_res["test_x"],
                loop["bottom"],
                loop["top"],
                alpha=0.5,
                label="confidence region",
            )
            ax1.scatter(loop["x"], loop["y"], label="training data")
            ax1.axvline(loop["x_next"], color="black", label="next query point")
            ax2.plot(benchmark_res["test_x"], loop["std"])
            ax2.axvline(loop["x_next"], color="black")
            ax1.set_title(f"{len(loop['x'])} training points")
            ax1.legend(loc="lower left")
            ax1.set_ylabel("y")
            ax2.set_ylabel("std")
            ax2.set_xlabel("x")
            fig.savefig(os.path.join(folder, f"{i+1}.png"))
            plt.close(fig)
```
This is very related to an extensive previous discussion on maximum variance acquisition functions for active learning (#1366).
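For context, the Monte Carlo alternative from that discussion (also mentioned in the PR description) can be set up as follows; a minimal self-contained sketch with a toy model, where the data and names are illustrative rather than from the PR:

```python
import torch
from botorch.acquisition.active_learning import qNegIntegratedPosteriorVariance
from botorch.models import SingleTaskGP

# Toy model on a normalized 1D domain.
train_X = torch.rand(10, 1, dtype=torch.double)
train_Y = torch.sin(6 * train_X)
model = SingleTaskGP(train_X, train_Y)

# qNIPV integrates the posterior variance over a set of MC points, so each
# evaluation is heavier than reading off the pointwise standard deviation.
mc_points = torch.rand(128, 1, dtype=torch.double)
qnipv = qNegIntegratedPosteriorVariance(model=model, mc_points=mc_points)
print(qnipv(torch.rand(1, 1, 1, dtype=torch.double)))  # value at one candidate
```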
This is a very simple acquisition function, implemented in a way that requires minimal maintenance going forward. Given that it was requested more than once, I think we should land this. @pjpollot The tutorial failure should be resolved if you rebase the changes. There are just some lint failures that need to be addressed before we can merge this in. It might be easiest to revert the changes to the files that were not affected by the original implementation.
@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Thanks so much for putting this in! I'm happy to approve this once the formatting changes to irrelevant files are reverted.
(force-pushed from 92f6ea5 to 68f352f)
@eytan @esantorella @saitcakmak
Thanks!
(force-pushed from 4b6933b to 62ddbdb)
@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@saitcakmak merged this pull request in fc6fdba.
Review comment on the new docstring example:

    Example:
        >>> model = SingleTaskGP(train_X, train_Y)
        >>> PSTD = PosteriorMean(model)
This should be `PosteriorStandardDeviation`, right? I'll put in a quick fix.
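For reference, the corrected example would presumably read as follows; the final call line is added here for illustration, assuming tensors `train_X`, `train_Y`, and `test_X`:

```python
>>> model = SingleTaskGP(train_X, train_Y)
>>> PSTD = PosteriorStandardDeviation(model)
>>> std = PSTD(test_X)
```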
…er documentation (#2071)

Summary:

Motivation
* #2060 introduced a typo in a docstring where `PosteriorMean` was used instead of `PosteriorStandardDeviation`.
* The `SingleTaskGP` and "Getting Started" documentation use single-precision data and non-standardized outcome data, which is not recommended usage and raises warnings. I changed the documentation to use an outcome transform and double-precision data.

Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)? Yes

Pull Request resolved: #2071
Test Plan: N/A
Related PRs: #2060
Reviewed By: Balandat
Differential Revision: D50647222
Pulled By: esantorella
fbshipit-source-id: 65c730a35012ee1687a24b06f710963ce64db218
Motivation
I am a machine learning engineer actively researching Bayesian optimization solutions to apply in my company's products. Lately, I've been trying to find the right balance between exploitation and exploration by incorporating pure exploration into a sequential batch Bayesian optimization algorithm.
`qNegIntegratedPosteriorVariance` was a suitable choice for this purpose, but it proved a bit slower compared to the Posterior Standard Deviation acquisition function that I'm introducing in this pull request. The acquisition function simply returns the posterior standard deviation of the Gaussian process model, so the time complexity remains relatively low compared to a Monte Carlo acquisition function.
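The core of such an acquisition function is tiny. A minimal sketch of the idea, not necessarily the exact code in this PR, assuming the `_mean_and_sigma` helper shared by BoTorch's analytic acquisition functions:

```python
from torch import Tensor
from botorch.acquisition.analytic import AnalyticAcquisitionFunction
from botorch.utils.transforms import t_batch_mode_transform


class PosteriorStandardDeviationSketch(AnalyticAcquisitionFunction):
    """The acquisition value at X is the posterior standard deviation at X."""

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: Tensor) -> Tensor:
        _, sigma = self._mean_and_sigma(X)  # posterior mean and std at X
        return sigma
```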
Have you read the Contributing Guidelines on pull requests?
Yes.
Test Plan
The `PosteriorStandardDeviation` class I implemented is extremely similar to `PosteriorMean`, so the implementation and the unit tests also look almost the same. I just made sure to define a variance for `MockPosterior` in order to run the unit tests effectively.

Since I have some doubts about the batch unit testing, in particular whether it verifies that my solution only applies for `q=1`, I would be glad to hear your thoughts on my current implementation!

PS: I guess there is no problem with the integration of the new acquisition function in the documentation! ⬇️