Avoid retracing acquisition functions #271
Conversation
tf.random.normal([self._sample_size, tf.shape(mean)[-1]], dtype=tf.float64)
)  # [S, L]
What's this for? Just for the TensorFlow retracing?
Yes. Compiling the graph means the variable mean only knows the statically determined shape (which may have None in it). You have to use tf.shape to get the dynamic shape at each execution.
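For illustration only (this sketch is not from the PR, and the input signature is made up): inside a compiled function the static shape may contain None, while tf.shape always yields concrete values at run time.

import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, None], tf.float64)])
def sample(mean: tf.Tensor) -> tf.Tensor:
    # At trace time mean.shape is (None, None), so it cannot be used to build a
    # new tensor; tf.shape(mean) is evaluated at run time and is always concrete.
    return tf.random.normal([5, tf.shape(mean)[-1]], dtype=tf.float64)  # [S, L]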
makes sense and sounds familiar
trieste/models/model_interfaces.py
Outdated
@@ -357,7 +368,7 @@ def model(self) -> GPR | SGPR:
        return self._model

    def update(self, dataset: Dataset) -> None:
        x, y = self.model.data
        x, y = map(lambda var: var.value(), self.model.data)
What happens if you keep using these as variables?
What do you mean?
Does .value() mean that the variable just becomes a constant?
I think that value() is needed to get the shape a few lines further down, and also for the type checking on the Dataset initialisation to pass. Either way, it's fine to be explicit here.
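As an aside, a standalone illustration (the shapes here are made up, not taken from the PR): value() returns a plain tensor snapshot of the variable's current contents, which has a fully concrete shape even when the variable itself was created with an unknown leading dimension.

import tensorflow as tf

x_var = tf.Variable(tf.zeros([3, 2], dtype=tf.float64), shape=[None, 2])
x = x_var.value()   # a Tensor snapshot of the current contents
print(x_var.shape)  # (None, 2) -- the variable's (partially unknown) shape
print(x.shape)      # (3, 2)    -- the concrete shape of the current value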
try:
    if track_state:
        models = copy.deepcopy(models)
        acquisition_state = copy.deepcopy(acquisition_state)
        models_copy = copy.deepcopy(models)
I assume this change is necessary because the deepcopied compiled models no longer work properly. I presume the Record saves the copy of the model so it can be queried later? Does that version of the model work properly?
Yes, but I'll add a test to show this.
assert len(history) == 4
assert len(history) == 3
Why has this changed?
As discussed: this is because we now store the model copy in the history rather than the original model. This test uses a model that can only be copied 3 times, meaning we now only store it three times.
@@ -499,6 +511,32 @@ def evaluate_loss_of_model_parameters() -> tf.Tensor:
        multiple_assign(self.model, current_best_parameters)


class NumDataPropertyMixin:
    """Mixin class for exposing num_data as a property, stored in a tf.Variable. This is to work
What does Mixin mean?
A mixin is "a class that contains methods for use by other classes without having to be the parent class of those other classes". It's useful here as both wrappers want the same behaviour.
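Purely as an illustration of the pattern (this is not the PR's actual implementation, and the variable handling is simplified):

import tensorflow as tf

class NumDataPropertyMixin:
    # Illustrative only: supplies a num_data property backed by a tf.Variable
    # to any class that lists it as an extra base class.

    @property
    def num_data(self):
        return self._num_data

    @num_data.setter
    def num_data(self, value) -> None:
        if not hasattr(self, "_num_data"):
            # the first assignment creates the variable...
            self._num_data = tf.Variable(value, trainable=False)
        else:
            # ...later assignments update it in place, so compiled graphs that
            # read it don't need to be retraced
            self._num_data.assign(value)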
class SVGPWrapper(SVGP, NumDataPropertyMixin):
    """A wrapper around GPFlow's SVGP class that stores num_data in a tf.Variable and exposes
    it as a property."""
Do we really need this (and the VGP one below)? It just complicates things. Why not always assume it's a standard SVGP coming into the SparseVariational class?
Even if we only supported passing in standard SVGPs into SparseVariational, we'd still need this, as we need to turn those standard SVGPs into SVGPs that store num_data in a Variable and expose it as a property.
I see
trieste/acquisition/function.py
Outdated
    optimization. Improvement is with respect to the current "best" observation ``eta``, where an
    improvement moves towards the objective function's minimum, and the expectation is calculated
    with respect to the ``model`` posterior. For model posterior :math:`f`, this is
class expected_improvement:
I do wonder if we should define a base class for these acquisition function bits?
We need documentation saying that these acquisition functions need an update and a __call__, and what these bits are for.
I'll add an ABC (though note these acquisition functions don't need an update: whether they have one, and what it looks like, is specific to the implementations).
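For context, a rough sketch of the pattern being discussed (not the PR's exact code: the model.predict interface and the details of the expected improvement formula here are assumptions). The state lives in a tf.Variable so that update can change it without retracing the compiled __call__:

import tensorflow as tf
import tensorflow_probability as tfp

class expected_improvement:
    def __init__(self, model, eta: tf.Tensor):
        self._model = model
        self._eta = tf.Variable(eta)  # current best observation

    def update(self, eta: tf.Tensor) -> None:
        self._eta.assign(eta)  # only the variable's value changes: no retrace

    @tf.function
    def __call__(self, x: tf.Tensor) -> tf.Tensor:
        mean, variance = self._model.predict(x)
        normal = tfp.distributions.Normal(mean, tf.sqrt(variance))
        return (self._eta - mean) * normal.cdf(self._eta) + variance * normal.prob(self._eta)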
return acquisition
How come we don't have a new lower_confidence_bound class (it's still a function like before)? Just because it doesn't have anything to update doesn't mean we don't want to stop recompiling it?
Because I haven't updated all the acquisition functions yet. See the PR description.
    :return: The updated acquisition function.
    """
    tf.debugging.assert_positive(len(dataset))
    tf.debugging.Assert(None not in [self._base_acquisition_function], [])
What does this line do?
It checks that the base acquisition function has already been constructed by a previous call to prepare_acquisition_function. The funny syntax is to make TensorFlow happy (and is copied from similar checks elsewhere in this file).
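For illustration (a standalone example rather than the PR's code): tf.debugging.Assert raises InvalidArgumentError when its condition is false, and its second argument is the list of tensors to include in the error message.

import tensorflow as tf

base_fn = lambda x: x  # stands in for self._base_acquisition_function
tf.debugging.Assert(None not in [base_fn], [])  # passes: the function has been built
# tf.debugging.Assert(None not in [None], [])   # would raise InvalidArgumentError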
    # if possible, just update the penalization function variables
    self._penalization.update(pending_points, self._lipschitz_constant, self._eta)
    return self._penalized_acquisition
else:
I don't see a situation where a penalization function isn't updatable! Maybe we can just force them to be updatable.
It's still more hassle to write updateable functions than non-updateable ones. And this bit of code is still required for the first time we generate the penalization function, so I don't think we gain anything by removing support for non-updateable functions.
# check that acquisition functions defined as classes aren't being retraced unnecessarily
if isinstance(acquisition_rule, EfficientGlobalOptimization):
    acquisition_function = acquisition_rule._acquisition_function
    if isinstance(acquisition_function, AcquisitionFunctionClass):
        assert acquisition_function.__call__._get_tracing_count() == 3  # type: ignore
Lovely!
This PR avoids retracing acquisition functions by updating them rather than generating them afresh each optimization loop, and compiling them with tf.function. For simplicity, this is made optional and backwards compatible: users can choose whether to implement the update methods or not (at the cost of performance if they don't).
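A tiny standalone demonstration of the idea (assuming a recent TF 2.x, which provides experimental_get_tracing_count): because the compiled function reads its state from a tf.Variable, updating that state in place reuses the existing graph.

import tensorflow as tf

eta = tf.Variable(0.0)

@tf.function
def acquisition(x: tf.Tensor) -> tf.Tensor:
    return x - eta  # the variable is read at run time, not baked into the graph

acquisition(tf.constant(1.0))
eta.assign(5.0)                 # update in place instead of rebuilding
acquisition(tf.constant(2.0))
print(acquisition.experimental_get_tracing_count())  # 1: traced only once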
Note that making the model wrappers compatible with being updated involves adding some tf.Variables that slow down the non-sparse GPR and VGP models. In some cases this slowdown can outweigh the speedup from not recompiling the acquisition function. If this becomes a real issue in the future, we can add more architecture to support models that don't allow AF updates.
TIP: when reviewing this, add ?w=1 to the URL to ignore whitespace changes.
Still left to do (in a separate PR)