# Machine-learning online optimization of labscript suite controlled experiments
analysislib-mloop implements machine-learning online optimization of labscript suite controlled experiments using M-LOOP.
## Dependencies

- lyse 2.5.0
- runmanager 2.5.0+ (remote-bugfix)
- labscript_utils 2.12.4
- zprocess 2.13.2
- M-LOOP 2.2.0+
- tomli 2.0.1 (for Python versions older than 3.11, where `tomllib` is not included and `tomli` must be installed instead)
## Installation

The following assumes you have a working installation of the labscript suite and M-LOOP. Please see the installation documentation of these projects if you don't.

Clone this repository in your labscript suite analysislib directory. By default, this is `~/labscript-suite/userlib/analysislib` (`~` is `%USERPROFILE%` on Windows).
## Usage

The following assumes you already have an experiment controlled by the labscript suite.
1. Specify the server and port of runmanager in your labconfig, i.e. ensure you have the following entries if their values differ from these defaults:

   ```ini
   [servers]
   runmanager = localhost

   [ports]
   runmanager = 42523
   ```
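   Before starting an optimization, you can check that runmanager's remote server is reachable from Python. A minimal sketch, assuming `get_globals` is exposed by `runmanager.remote` alongside the `set_globals` and `engage` calls used below:

   ```python
   # Quick connectivity check for runmanager's remote-control server.
   # Assumes runmanager is running with the labconfig entries above.
   from runmanager.remote import get_globals

   print(get_globals())  # prints the current dict of runmanager globals
   ```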
2. Configure optimization settings in `mloop_config.toml`. TOML is a high-level configuration file format; see the TOML documentation for a description. The repository contains a fully functional demo `mloop_config_example.toml` that optimizes the MOT and cMOT laser cooling stages of the RbChip experiment at NIST Gaithersburg. At a bare minimum, you should modify the following:
   ```toml
   [ANALYSIS]
   cost_key = ["fake_result", "y"]
   maximize = true
   groups = ["MOT"]

   [MLOOP_PARAMS.MOT.x]
   global_name = "x"
   enable = true
   min = -5.0
   max = 5.0
   start = -2
   ```
   - `cost_key`: Column of the lyse dataframe to derive the cost from, specified as a `[routine_name, result_name]` pair. The present cost comes from the most recent value in this column, i.e. `cost = df[cost_key].iloc[-1]`.
   - `maximize`: Whether or not to negate the above value, since M-LOOP will minimize the cost.
   - `groups`: Which group(s) of parameters are active, specified as a list of groups such as `["MOT", "CMOT"]`. This simplifies the optimization of different groups of parameters.
   - `MLOOP_PARAMS`: Dictionary of optimization parameters controlled by M-LOOP. `global_name` defines the global it maps to in runmanager. `enable` allows parameters to be enabled or disabled on a case-by-case basis; it may be omitted and defaults to `true`. `min`, `max`, and `start` correspond to the `min_boundary`, `max_boundary`, and `first_params` lists of the M-LOOP specification.
   You may also specify a more complicated mapping between the parameters controlled by M-LOOP and globals in runmanager:
   ```toml
   [MLOOP_PARAMS.CMOT.y]
   min = -5.0
   max = 5.0
   start = -2

   [MLOOP_PARAMS.CMOT.z]
   min = -5.0
   max = 5.0
   start = -2

   [RUNMANAGER_GLOBALS.CMOT.test_tuple]
   expr = "lambda y, z: (y, z)"
   args = ["y", "z"]
   ```
   - Parameters may be shared between different groups, but both the group and the parameter must be enabled.
   - `y` and `z` are two M-LOOP parameters that don't have a `global_name` defined.
   - Instead, a dictionary entry in `RUNMANAGER_GLOBALS`, targeting the global `test_tuple` in runmanager, is explicitly defined here with the customized mapping `lambda y, z: (y, z)`, which takes `y` and `z` as arguments. At each iteration, the tuple `(y, z)` will be passed to `test_tuple` in runmanager.

   This can be useful if you have organized your runmanager globals into more complicated data structures such as tuples or dictionaries.
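   To make this mapping concrete, here is a minimal sketch (an illustration only, not this repository's actual implementation) of how such an entry could be evaluated at each iteration; the `entry` and `params` names are assumptions of this sketch:

   ```python
   # Hypothetical illustration of applying a RUNMANAGER_GLOBALS entry.
   from runmanager.remote import set_globals

   entry = {"expr": "lambda y, z: (y, z)", "args": ["y", "z"]}
   params = {"y": 1.5, "z": -0.3}  # current M-LOOP parameter values

   func = eval(entry["expr"])  # build the mapping function from its source
   value = func(*(params[name] for name in entry["args"]))
   set_globals({"test_tuple": value})  # test_tuple receives (1.5, -0.3)
   ```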
3. Load the analysis routine that computes the quantity you want to optimize into lyse. This routine should update the `cost_key` column of the lyse dataframe by calling the `save_result` method (or one of its variants) of a `lyse.Run` instance. For the above configuration, this would be `fake_result.py`, containing:
   ```python
   import lyse

   run = lyse.Run(lyse.path)
   # Your single-shot analysis code goes here
   run.save_result('y', your_result)
   ```
4. Load `mloop_multishot.py` as an analysis routine in lyse. Ensure that it runs after the analysis routine that updates `cost_key` (e.g. `fake_result.py` in the above configuration), using the move-routine up/down buttons.
5. Begin automated optimization by doing one of the following:
   - Press the 'Run multishot analysis' button in lyse.
     - This requires that the globals specified in `MLOOP_PARAMS` are active in runmanager, unless you set `mock = true` in `mloop_config.toml`, which bypasses shot compilation and submission, and generates a fake cost based on the current value of the first optimization parameter. Each press of 'Run multishot analysis' will then elicit another M-LOOP iteration. This is useful for testing your M-LOOP installation and the threading/multiprocessing used in this codebase, as it only requires that lyse be running (and permits you to skip creating the template file and performing steps (1) and (3) above).
   - Press the 'Engage' button in runmanager.

   Either of these will begin an M-LOOP optimization, with a new sequence of shots being compiled and submitted to blacs each time a cost value is computed.
6. Pause optimization by pausing the lyse analysis queue or by unchecking (deactivating) `mloop_multishot.py` in lyse.
7. Cancel or restart optimization by removing `mloop_multishot.py`, or by right-clicking on it and selecting 'restart worker process for selected routines'.
## Advanced usage

Uncertainties in the cost can be specified by saving the uncertainty with the name `'u_' + result_name`. For the example in step (3) above, this can be done as follows:
```python
import lyse

run = lyse.Run(lyse.path)
# Your single-shot analysis code goes here
run.save_result('y', your_result)
run.save_result('u_y', u_your_result)
# ... or:
run.save_results_dict({'y': (your_result, u_your_result)}, uncertainties=True)
```
The cost can be the result of multi-shot analysis (requiring more than one shot to evaluate). Suppose you only want to return a cost value after:
- a certain number of shots (repeats or those in a labscript sequence) have completed, and/or
- the uncertainty in some multi-shot analysis result is below some threshold.
In such cases, you would include the following in your multi-shot analysis routine:
```python
import lyse
import numpy as np

df = lyse.data()
# Your analysis on the lyse DataFrame goes here
run = lyse.Run(h5_path=df.filepath.iloc[-1])
run.save_result(name='y', value=your_result if your_condition else np.nan)
```
... and set `ignore_bad = true` in the `[ANALYSIS]` section of `mloop_config.toml`. Shots with `your_condition = False` will not cause the cost to be updated, thus postponing the next iteration of optimization. An example of such a multi-shot routine can be found in `fake_result_multishot.py`.
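For instance, a condition of the kind described above (enough shots, small enough uncertainty) might be computed from the lyse dataframe like this. A sketch only: the shot count, threshold, and the use of the demo's `('fake_result', 'y')` column are assumptions, not part of this repository:

```python
import lyse
import numpy as np

df = lyse.data()
# Restrict to shots from the current sequence and average the
# single-shot results; column names follow the demo configuration.
subdf = df[df['sequence_index'] == df['sequence_index'].iloc[-1]]
values = subdf[('fake_result', 'y')]
your_result = values.mean()
u_your_result = values.sem()  # standard error of the mean
# Hypothetical acceptance condition: at least 5 shots, uncertainty below 0.1.
your_condition = (len(values) >= 5) and (u_your_result < 0.1)

run = lyse.Run(h5_path=df.filepath.iloc[-1])
run.save_result(name='y', value=your_result if your_condition else np.nan)
```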
Since cost evaluation can be based on one or more shots from one or more sequences, additional information is required to analyze a single M-LOOP optimization session in lyse. Per-shot cost evaluation (e.g. of a single-shot analysis result) results in a single-shot sequence per M-LOOP iteration. For multi-shot cost evaluation, a single M-LOOP iteration might correspond to a single multi-shot sequence, repeated execution of the same shot (same `sequence_index` and `run number`, different `run repeat`), or something else. To keep track of this, we intend to add details of the optimization session to the sequence attributes (written to each shot file). For the time being, you can keep track of the `mloop_session` and `mloop_iteration` by creating globals with these names in any active group in runmanager. They will be updated during each optimization, and reset to `None` following the completion of an M-LOOP session. This then permits you to analyze shots from a particular optimization session as follows:
```python
import lyse

df = lyse.data()
gb = df.groupby('mloop_session')
mloop_session = list(gb.groups.keys())[-1]
subdf = gb.get_group(mloop_session)
```
There's an example of this in `plot_mloop_results.py`.
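For example, to plot the cost-defining quantity against iteration number for the session selected above (a sketch; the `('fake_result', 'y')` column and the `mloop_iteration` global follow the demo configuration described earlier):

```python
import matplotlib.pyplot as plt

# Plot the demo cost quantity against M-LOOP iteration for the
# optimization session selected in the snippet above.
plt.plot(subdf['mloop_iteration'], subdf['fake_result', 'y'], 'o-')
plt.xlabel('M-LOOP iteration')
plt.ylabel('y')
plt.show()
```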
M-LOOP itself has visualisation functions which can be run on the log/archive files it creates.
The `mloop_multishot.py` script can be loaded as a single-shot analysis routine if `cost_key` derives from another single-shot routine, so long as it runs after that routine.
Despite the name, `mloop_multishot.py` can be used for other kinds of automated optimization and feed-forward. You can run any function in the optimization thread (see below), so long as it conforms to the following specification:
- Calls `lyse.routine_storage.queue.get()` iteratively.
- Uses the `cost_dict` returned to modify global variables (which ones and how is up to you) using `runmanager.remote.set_globals()`.
- Calls `runmanager.remote.engage()` when a new shot or sequence of shots is required to get the next cost (optional).
Feed-forward stabilization (e.g. of some drifting quantity) could be readily achieved using a single-iteration optimization session, replacing `main` of `mloop_interface.py` with, for example:
```python
import lyse
from runmanager.remote import set_globals, engage


def main():
    # cost_dict['cost'] is likely some error signal you are trying to zero
    cost_dict = lyse.routine_storage.queue.get()
    # Your code goes here that determines the next value of a stabilization parameter
    set_globals({'some_global': new_value})
    return
```
If an alternative optimization library requires something other than `cost_dict` (with keys `cost`, `uncer`, `bad`), you can modify `cost_analysis` accordingly.
## How it works

We use `lyse.routine_storage` to store:

- a long-lived thread (`threading.Thread`) to run the main method of `mloop_interface.py` within `mloop_multishot.py`,
- a queue (`queue.Queue`) for `mloop_multishot.py`/`mloop_interface.py` to put/get the latest M-LOOP cost dictionary, and
- (when `mock = true`) a variable `x` for `mloop_interface.py`/`mloop_multishot.py` to set/get, for spoofing a `cost_key` that changes with the current value of the (first) M-LOOP optimization parameter.
Each time the `mloop_multishot.py` routine runs in lyse, we first check to see if there is an active optimization by polling the optimization thread. If it doesn't exist or is not alive, we start a new thread. If there's an optimization underway, we retrieve the latest cost value from the lyse dataframe (see the `cost_analysis` function) and put it in `lyse.routine_storage.queue`.
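A minimal sketch of that check-and-dispatch pattern (an illustration of the logic, not the verbatim contents of `mloop_multishot.py`; the `cost_dict` construction is elided):

```python
import threading
from queue import Queue

import lyse
import mloop_interface  # this repository's interface module

# Create the shared queue once per lyse worker process.
if not hasattr(lyse.routine_storage, 'queue'):
    lyse.routine_storage.queue = Queue()

thread = getattr(lyse.routine_storage, 'optimization', None)
if thread is None or not thread.is_alive():
    # No active optimization: start one in a long-lived thread.
    lyse.routine_storage.optimization = threading.Thread(
        target=mloop_interface.main, daemon=True
    )
    lyse.routine_storage.optimization.start()
else:
    # Optimization underway: report the latest cost to the waiting thread.
    cost_dict = ...  # derived from the lyse dataframe by cost_analysis
    lyse.routine_storage.queue.put(cost_dict)
```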
The `LoopInterface` subclass (of `mloop.interfaces.Interface`) has a method `get_next_cost_dict`, which:

- requests the next experiment shot(s) be compiled and run using `runmanager.remote.set_globals()` and `runmanager.remote.engage()`, and
- waits for the next cost using a blocking call to `lyse.routine_storage.queue.get()`.
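Schematically, such an interface might look like the following. This is a sketch, not the repository's exact code; the single parameter named `x` follows the demo configuration above, and `params_dict['params']` is how M-LOOP passes the current parameter values to an interface:

```python
import lyse
import mloop.interfaces
import runmanager.remote


class LoopInterface(mloop.interfaces.Interface):
    def get_next_cost_dict(self, params_dict):
        # Map the M-LOOP parameter values onto runmanager globals; the
        # global name 'x' here follows the demo configuration above.
        x, = params_dict['params']
        runmanager.remote.set_globals({'x': float(x)})
        # Request that a new sequence of shots be compiled and run.
        runmanager.remote.engage()
        # Block until mloop_multishot.py reports the next cost.
        return lyse.routine_storage.queue.get()
```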
The main method of `mloop_interface.py` follows that of the M-LOOP Python controlled experiment tutorial:

- Instantiate `LoopInterface`, an M-LOOP optimizer interface.
- Get the current configuration.
- Create an `mloop.controllers.Controller` instance for the optimizer interface, using the above configuration.
- Run the `optimize` method of this controller.
- Return a dictionary of `best_params`, `best_cost`, `best_uncer`, and `best_index`.
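In outline, those steps could read as follows. A sketch only, reusing the `LoopInterface` sketched above: the hard-coded `config` dictionary stands in for reading `mloop_config.toml`, and the option values are placeholders:

```python
import mloop.controllers


def main():
    # Instantiate the optimizer interface (sketched above).
    interface = LoopInterface()
    # Placeholder for the current configuration; these M-LOOP controller
    # options would normally be derived from mloop_config.toml.
    config = {'controller_type': 'gaussian_process', 'max_num_runs': 10,
              'num_params': 1, 'min_boundary': [-5.0], 'max_boundary': [5.0]}
    controller = mloop.controllers.create_controller(interface, **config)
    # Run the optimization to completion.
    controller.optimize()
    return {
        'best_params': controller.best_params,
        'best_cost': controller.best_cost,
        'best_uncer': controller.best_uncer,
        'best_index': controller.best_index,
    }
```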
Shots are compiled by programmatically interacting with the runmanager GUI. The current values of the optimization parameters used by M-LOOP are reflected in runmanager, and when a given optimization is complete, the best parameters are entered into runmanager programmatically.
## History

The original design and implementation occurred during the summer of 2017/2018 by Josh Morris, Ethan Payne, Lincoln Turner, and me, with assistance from Chris Billington and Phil Starkey. In that incarnation, the M-LOOP interface and experiment interface were run as standalone processes in a shell, with communication between these two actors and the analysis interface being done over a ZMQ socket. Experiment scripts were compiled against an otherwise empty 'template' shot file of globals, which was modified in place at each M-LOOP iteration. This required careful execution of the scripts in the right order, and for the M-LOOP interface to be restarted after each optimization, and was a bit clunky/flaky.
In 2019 we improved this original implementation using a single lyse analysis routine (the skeleton of which was written by Phil Starkey), and remote control of the runmanager GUI. This required the following enhancements and bugfixes to the labscript suite, which Chris Billington (mostly) and I undertook:
- lyse PR #61: Fix for #48: Make analysis_subprocess.py multiprocessing-friendly
- lyse PR #62: Terminate subprocesses at shutdown
- runmanager PR #37: Basic remote control of runmanager
- runmanager PR #39: Bugfix of above
- labscript_utils PR #78 (basic remote control of runmanager): Import pywin32 at module-level rather than lazily
- labscript PR #81 (basic remote control of runmanager): Include all package dirs in ModuleWatcher whitelist
M-LOOP was written by Michael Hush and is maintained by M-LOOP contributors.
## To do

- Validation and error checks (#1).
- Sequence attributes that record the optimization details.
- Generalize this implementation to other algorithmic optimization libraries.
## Contributing

If you are an existing labscript suite user, please test this out on your experiment! Report bugs, request new functionality, and submit pull requests using the issue tracker for this project.
If you'd like to implement machine-learning online optimization on your shot-based, hardware-timed experiment, please consider deploying the labscript suite and M-LOOP (or another machine learning library, by adapting this extension).