Hi Emukit team,
First, thank you for your work on this package - it's a joy to use.
I'm writing with a question about some curious behavior I've observed when using the Bayesian optimization control loop. When I call the `IModel.set_data(X, Y)` method to replace the model data and then call `OuterLoop.get_next_points(results)`, the model's data is reset to what it was before the `set_data()` call, with one extra row holding the contents of the `results` object.
I expected that, after the `OuterLoop.get_next_points(results)` call, the model data would consist of the `X` passed to `set_data` concatenated with the contents of `results`.
Here's a minimal example that reproduces the behavior:
import numpy as np
from GPy.models import GPRegression
from GPy.kern import Matern52
from emukit.bayesian_optimization.acquisitions import ExpectedImprovement
from emukit.bayesian_optimization.loops import BayesianOptimizationLoop
from emukit.core import (
    ParameterSpace,
    DiscreteParameter,
)
from emukit.core.loop import UserFunctionWrapper
from emukit.model_wrappers import GPyModelWrapper
# Initial observations
X = np.array([[1,1,2],[2,1,2],[1,1,1]])
Y = np.array([[1],[2],[3]])
# Surrogate optimization components
kernel = Matern52(
    input_dim=X.shape[1],
)
model_gpy = GPRegression(
    X=X,
    Y=Y,
    kernel=kernel,
    normalizer=True,
)
model_emukit = GPyModelWrapper(
    gpy_model=model_gpy,
)
parameters = [DiscreteParameter(f'param_{i}', range(10)) for i in range(X.shape[1])]
parameter_space = ParameterSpace(parameters)
acquisition_criterion = ExpectedImprovement(model = model_emukit)
f = lambda x_row: np.array([[sum(sum(x_row))]])
f_wrapped = UserFunctionWrapper(f)
control_loop = BayesianOptimizationLoop(
    model=model_emukit,
    space=parameter_space,
    acquisition=acquisition_criterion,
)
# Just make sure that the data is actually represented in the model
assert model_emukit.model.X.shape[0] == X.shape[0]
# Try to set the data using other matrices
X2 = np.array([[3,3,3],[3,4,3]])
Y2 = np.array([[4],[5]])
model_emukit.set_data(X=X2, Y=Y2)
# The data is 'set' after running set_data()
assert model_emukit.model.X.shape[0] == X2.shape[0]
# Provide a result for some arbitrarily suggested point
X_arbitrary_suggestion = np.array([[1,2,5]])
results = f_wrapped(X_arbitrary_suggestion)
X_next = control_loop.get_next_points(
    results=results,
)
# As a side effect of control_loop.get_next_points(), the model data is reset.
assert model_emukit.model.X.shape[0] == X.shape[0] + 1
for model_x_row, initial_x_row in zip(model_emukit.model.X, X):
    assert all(model_x_row == initial_x_row)
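
My guess, and it is only a guess, is that the loop keeps its own record of the observations, taken from the model when the loop is constructed, and writes that record back into the model on every update. The sketch below uses two hypothetical stand-in classes (`ToyModel` and `ToyLoop`, plain NumPy, not Emukit code) to show how that kind of bookkeeping would produce the same row counts as the assertions in the example.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for a model wrapper exposing X, Y and set_data."""

    def __init__(self, X, Y):
        self.X, self.Y = X, Y

    def set_data(self, X, Y):
        self.X, self.Y = X, Y


class ToyLoop:
    """Hypothetical loop that snapshots the model data once, at construction."""

    def __init__(self, model):
        self.model = model
        # The loop's own record of observations, copied from the model here.
        self.state_X = model.X.copy()
        self.state_Y = model.Y.copy()

    def get_next_points(self, new_X, new_Y):
        # Only mimics the data bookkeeping; no acquisition is optimized.
        # Append the new results to the loop's record ...
        self.state_X = np.vstack([self.state_X, new_X])
        self.state_Y = np.vstack([self.state_Y, new_Y])
        # ... then overwrite the model data with that record.
        self.model.set_data(self.state_X, self.state_Y)


X = np.array([[1, 1, 2], [2, 1, 2], [1, 1, 1]])
Y = np.array([[1], [2], [3]])
model = ToyModel(X, Y)
loop = ToyLoop(model)

# Direct call on the model; the loop's record is not touched.
model.set_data(np.array([[3, 3, 3], [3, 4, 3]]), np.array([[4], [5]]))

# The loop appends one result and pushes its own record into the model.
loop.get_next_points(np.array([[1, 2, 5]]), np.array([[8]]))
rows_after = model.X.shape[0]
```

If Emukit works along these lines, data passed to `set_data()` directly would always be overwritten on the next loop update, because the loop never sees it.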