Add Posterior Standard Deviation acquisition function #2060


Merged

Conversation

pjpollot
Contributor

@pjpollot pjpollot commented Oct 19, 2023

Motivation

I am a machine learning engineer actively researching Bayesian optimization solutions to apply in my company's products. Lately, I've been trying to find the right balance between exploitation and exploration by incorporating pure exploration into a sequential batch Bayesian optimization algorithm. qNegIntegratedPosteriorVariance was a suitable choice for this purpose, but it proved a bit slower when compared to the Posterior Standard Deviation acquisition function that I'm introducing in this pull request.

The acquisition function simply returns the posterior standard deviation of the Gaussian process model, so the time complexity remains relatively low when compared to a Monte Carlo acquisition function.
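For readers unfamiliar with the quantity being maximized, it can be sketched outside BoTorch with an exact GP posterior. This is a minimal illustration under assumed choices (an RBF kernel with unit prior variance), not the PR's implementation:

```python
import torch

def rbf_kernel(a: torch.Tensor, b: torch.Tensor, lengthscale: float = 1.0) -> torch.Tensor:
    # Squared-exponential kernel matrix between two point sets (n x d, m x d).
    d2 = (a.unsqueeze(1) - b.unsqueeze(0)).pow(2).sum(-1)
    return torch.exp(-0.5 * d2 / lengthscale**2)

def posterior_std(train_x: torch.Tensor, test_x: torch.Tensor, noise: float = 1e-4) -> torch.Tensor:
    # Exact GP posterior standard deviation. Note it depends only on the input
    # locations, not on the observed targets, which is why a pure-exploration
    # acquisition built on it ignores the outcome values entirely.
    K = rbf_kernel(train_x, train_x) + noise * torch.eye(train_x.shape[0])
    Ks = rbf_kernel(test_x, train_x)
    v = torch.linalg.solve(K, Ks.T)
    var = 1.0 - (Ks * v.T).sum(-1)  # prior variance (1.0) minus explained variance
    return var.clamp_min(0.0).sqrt()
```

Uncertainty collapses near observed points and recovers far from them, which is exactly the signal a pure-exploration step maximizes.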

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

The PosteriorStandardDeviation class I implemented is extremely similar to PosteriorMean, so the implementation and the unit tests also look almost the same. I just made sure to define a variance for MockPosterior so that the unit tests run effectively.

Since I still have some doubts about the batch unit tests (in particular, whether they verify that my solution only applies for q=1), I would be glad to hear your thoughts on my current implementation!

PS: The new acquisition function seems to integrate into the documentation without problems! ⬇️

doc screenshot
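On the q=1 question raised in the test plan: one way a unit test can pin down single-candidate behavior is an explicit shape guard. The helper below is hypothetical (a sketch of the check, not BoTorch's actual t-batch machinery):

```python
import torch

def assert_q_is_one(X: torch.Tensor) -> torch.Tensor:
    # Analytic acquisition functions evaluate `batch_shape x q x d` inputs
    # with q == 1; fail loudly on larger q instead of silently reducing.
    if X.dim() < 3 or X.shape[-2] != 1:
        raise ValueError(f"expected q=1 input of shape `batch x 1 x d`, got {tuple(X.shape)}")
    return X.squeeze(-2)  # -> batch_shape x d
```

A unit test can then assert both the happy path and that q > 1 raises.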

Related PRs

(If this PR adds or changes functionality, please take some time to update the docs at https://github.com/pytorch/botorch, and link to your PR here.)

@facebook-github-bot
Contributor

Hi @pjpollot!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Oct 19, 2023

@Balandat
Contributor

Thanks for the PR - are there any results on / benchmarks for this acquisition function, or suggestions for how best to use it? I want to make sure that everything we expose in the core package has been evaluated rigorously and that users can have (or at least develop, e.g. from reading papers) a good sense of if and when to use a particular acquisition function.

Also, looks like there are a bunch of auto-formatting changes that snuck into the commit?

@pjpollot
Contributor Author

pjpollot commented Oct 20, 2023

@Balandat
Thank you for your quick reply!

are there any results on / benchmarks for this acquisition function or suggestions how to best use it?

No, nothing I can present as of now to promote the use of this acquisition function, but I hope to show you some promising results once my research on the subject is done.

Also, looks like there are a bunch of auto-formatting changes that snuck into the commit?

Yes, I just ran the command ufmt format . as suggested in CONTRIBUTING.md, but surprisingly several files other than the ones I modified were changed as well.

@Balandat
Contributor

No, nothing I can present as of now to promote the use of this acquisition function, but I hope to show you some promising results once my research on the subject is done.

I see. I'd like to see some of these results / pointers before merging this in.

Yes, I just ran the command ufmt format . as suggested in CONTRIBUTING.md, but surprisingly several files other than the ones I modified were changed as well.

Hmm interesting, maybe there was a recent ufmt version bump. What ufmt version is installed in your environment?

@codecov

codecov bot commented Oct 20, 2023

Codecov Report

Merging #2060 (eb7cf56) into main (58da970) will not change coverage.
The diff coverage is 100.00%.

❗ Current head eb7cf56 differs from pull request most recent head 96b4bf1. Consider uploading reports for the commit 96b4bf1 to get more accurate results

@@            Coverage Diff            @@
##              main     #2060   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          179       179           
  Lines        15851     15859    +8     
=========================================
+ Hits         15851     15859    +8     
Files                              Coverage Δ
botorch/acquisition/__init__.py    100.00% <ø> (ø)
botorch/acquisition/analytic.py    100.00% <100.00%> (ø)


@pjpollot
Contributor Author

@Balandat
No problem, I will finish my research first and get back to you with some results!
The ufmt version I am using is 2.3.0.

@eytan
Contributor

eytan commented Oct 20, 2023 via email

@pjpollot
Contributor Author

pjpollot commented Oct 21, 2023

Thank you very much for your feedback.

Below are the scripts and graphs for Ackley 1D and (negated) Styblinski-Tang 1D to visualize the behavior of the PosteriorStandardDeviation acquisition function.

ackley1d

styblinskitang1d

@pjpollot
Contributor Author

pjpollot commented Oct 21, 2023

Main script:

import json

from warnings import filterwarnings

import torch

from botorch.acquisition import PosteriorStandardDeviation
from botorch.fit import fit_gpytorch_model
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf
from botorch.test_functions import Ackley, StyblinskiTang
from botorch.test_functions.synthetic import SyntheticTestFunction
from botorch.utils.transforms import unnormalize
from gpytorch.kernels import MaternKernel, ScaleKernel

from gpytorch.mlls import ExactMarginalLogLikelihood
from torch import Tensor


class Benchmark1D:
    def __init__(self, synthetic_function: SyntheticTestFunction) -> None:
        assert synthetic_function.dim == 1, "Benchmark1D is only for 1D functions"
        self._func = synthetic_function
        self._name = synthetic_function.__class__.__name__.lower() + "1d"

    @property
    def name(self) -> str:
        return self._name

    @property
    def bounds(self) -> Tensor:
        return self._func.bounds

    def __call__(self, x: Tensor) -> Tensor:
        return self._func(x).unsqueeze(-1)

    def sample(self, n: int) -> tuple[Tensor, Tensor]:
        z = torch.rand(n, 1)
        x = unnormalize(z, self.bounds)
        return x, self(x)

    def linspace(self, n: int) -> tuple[Tensor, Tensor]:
        z = torch.linspace(0, 1, n).unsqueeze(-1)
        x = unnormalize(z, self.bounds)
        return x, self(x)


def pure_exploration(model: SingleTaskGP, bounds: Tensor) -> Tensor:
    acq_func = PosteriorStandardDeviation(model)
    return optimize_acqf(
        acq_function=acq_func,
        bounds=bounds,
        q=1,
        num_restarts=10,
        raw_samples=512,
    )[0]


def build_train_gp_model(x: Tensor, y: Tensor) -> SingleTaskGP:
    model = SingleTaskGP(
        train_X=x,
        train_Y=y,
        covar_module=ScaleKernel(MaternKernel(nu=1.5)),
    )
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    fit_gpytorch_model(mll)
    return model

def to_list(t: Tensor) -> list:
    return t.detach().squeeze().tolist()


def compute_rmse(y_true: Tensor, y_pred: Tensor) -> Tensor:
    # Root-mean-squared error between the true values and the posterior mean.
    return torch.sqrt(torch.mean((y_true.squeeze() - y_pred.squeeze()) ** 2))


N_INIT = 2
MAX_SIZE = 100
SYNTHETIC_FUNCTIONS = [
    Ackley(dim=1),
    StyblinskiTang(dim=1, negate=True),
]

if __name__ == "__main__":
    filterwarnings("ignore")
    res = {}
    for function in SYNTHETIC_FUNCTIONS:
        torch.manual_seed(0)
        benchmark = Benchmark1D(function)
        print(benchmark.name)
        x, y = benchmark.sample(N_INIT)
        x_test, y_test = benchmark.linspace(1000)
        res[benchmark.name] = {
            "test_x": to_list(x_test),
            "test_y": to_list(y_test),
            "loop": [],
        }
        while x.shape[0] < MAX_SIZE:
            if x.shape[0] % 10 == 0:
                print(x.shape[0], "/", MAX_SIZE)
            model = build_train_gp_model(x, y)
            x_next = pure_exploration(model, benchmark.bounds)
            y_next = benchmark(x_next)
            posterior = model(x_test)
            mean = posterior.mean
            rmse = compute_rmse(y_test, mean)
            std = posterior.stddev
            bottom, top = posterior.confidence_region()
            res[benchmark.name]["loop"].append(
                {
                    "x": to_list(x),
                    "y": to_list(y),
                    "x_next": to_list(x_next),
                    "mean": to_list(mean),
                    "std": to_list(std),
                    "bottom": to_list(bottom),
                    "top": to_list(top),
                }
            )
            x = torch.cat([x, x_next])
            y = torch.cat([y, y_next])

        with open("results.json", "w") as f:
            json.dump(res, f, indent=2)

@pjpollot
Contributor Author

Visualization script:

import json
import os

import matplotlib.pyplot as plt
import numpy as np

if __name__ == "__main__":
    plt.style.use("ggplot")
    with open("results.json", "r") as f:
        res = json.load(f)
    for benchmark_name, benchmark_res in res.items():
        print(benchmark_name)
        folder = f"graphs/{benchmark_name}"
        os.makedirs(folder, exist_ok=True)
        for i, loop in enumerate(benchmark_res["loop"]):
            fig, (ax1, ax2) = plt.subplots(
                nrows=2, figsize=(5, 10), sharex=True, constrained_layout=True
            )
            fig.suptitle(benchmark_name)
            ax1.plot(
                benchmark_res["test_x"], benchmark_res["test_y"], "--", label="true"
            )
            ax1.plot(benchmark_res["test_x"], loop["mean"], label="predicted")
            ax1.fill_between(
                benchmark_res["test_x"],
                loop["bottom"],
                loop["top"],
                alpha=0.5,
            )
            ax1.scatter(loop["x"], loop["y"], label="training data")
            ax1.axvline(loop["x_next"], color="black", label="next query point")
            ax1.fill_between(
                benchmark_res["test_x"],
                loop["bottom"],
                loop["top"],
                alpha=0.5,
                label="confidence region",
            )
            ax2.plot(benchmark_res["test_x"], loop["std"])
            ax2.axvline(loop["x_next"], color="black")
            ax1.set_title(f"{len(loop['x'])} training points")
            ax1.legend(loc="lower left")
            ax1.set_ylabel("y")
            ax2.set_ylabel("std")
            ax2.set_xlabel("x")
            fig.savefig(os.path.join(folder, f"{i+1}.png"))
            plt.close(fig)

@esantorella
Member

This is very related to an extensive previous discussion on maximum variance acquisition functions for active learning (#1366 ). A (q)MaximumVariance acquisition function was proposed there. The discussion was a bit long and hard to summarize, but I think there was some tension between wanting to test that maximum variance works well before incorporating it vs. wanting to have it available regardless since it's commonly used as a baseline in active learning applications. I propose consolidating discussion on #1366 .

@saitcakmak
Contributor

This is a very simple acquisition function that is implemented in a way that requires minimal maintenance going forward. Given that it was requested more than once, I think we should land this.

@pjpollot The tutorial failure should be resolved if you rebase the changes. There are just some lint failures that need to be addressed before we can merge this in. It might be easiest to revert the changes to the files that were not affected by the original implementation.

@facebook-github-bot
Contributor

@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Member

@esantorella esantorella left a comment


Thanks so much for putting this in! I'm happy to approve this once the formatting changes to irrelevant files are reverted.

@pjpollot pjpollot force-pushed the add_posterior_std_acquisition_func branch from 92f6ea5 to 68f352f on October 24, 2023 at 07:33
@pjpollot
Contributor Author

@eytan @esantorella @saitcakmak
Thank you everyone for your valuable feedback!
I made all the requested changes, so may I ask you for another quick review of my code?
Thank you in advance.

Member

@esantorella esantorella left a comment


Thanks!

@pjpollot pjpollot force-pushed the add_posterior_std_acquisition_func branch from 4b6933b to 62ddbdb on October 25, 2023 at 01:00
@facebook-github-bot
Contributor

@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@saitcakmak merged this pull request in fc6fdba.

@pjpollot pjpollot deleted the add_posterior_std_acquisition_func branch October 25, 2023 07:40

Example:
>>> model = SingleTaskGP(train_X, train_Y)
>>> PSTD = PosteriorMean(model)

This should be PosteriorStandardDeviation right? I'll put in a quick fix.

facebook-github-bot pushed a commit that referenced this pull request Oct 25, 2023
…er documentation (#2071)

Summary:
## Motivation

* #2060 introduced a typo in a docstring where `PosteriorMean` was used instead of `PosteriorStandardDeviation`
* The `SingleTaskGP` and "Getting Started" documentation use single-precision data and non-standardized outcome data, which is not recommended usage and raises warnings. I changed the documentation to use an outcome transform and double-precision data.

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: #2071

Test Plan:
N/A

## Related PRs

#2060

Reviewed By: Balandat

Differential Revision: D50647222

Pulled By: esantorella

fbshipit-source-id: 65c730a35012ee1687a24b06f710963ce64db218
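The follow-up's recommendation (standardized outcomes, double-precision data) can be sketched in plain torch. The `standardize` helper below is a simplified, hypothetical stand-in for an outcome transform, not BoTorch's API:

```python
import torch

torch.set_default_dtype(torch.float64)  # double precision, as the docs now recommend

def standardize(Y: torch.Tensor) -> torch.Tensor:
    # Center and rescale each outcome column to zero mean and unit variance,
    # mirroring what an outcome transform applies before GP fitting.
    mean = Y.mean(dim=0, keepdim=True)
    std = Y.std(dim=0, keepdim=True).clamp_min(1e-9)
    return (Y - mean) / std
```

Standardized outcomes keep the GP's default priors in a sensible range and avoid the warnings mentioned in the summary.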