Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion
Our work studies the synthesis of neural controllers with better stability guarantees for continuous-time systems. We propose a novel two-stage training framework to jointly synthesize the controller and Lyapunov function for continuous-time systems. By leveraging a Zubov‑inspired region of attraction characterization to directly estimate stability boundaries, we propose a novel training data sampling strategy and a domain updating mechanism that significantly reduces the conservatism in training. Moreover, we extend the state-of-the-art neural network verifiers alpha-beta-CROWN with the capability of efficiently handling the Jacobian of many previous unsupported operators and propose a novel verification scheme that avoid expensive bisection. We show that our training can yield a region of attraction with volume
The environment can be set up with the following commands.
conda env create -f environment.yaml --name zubov
conda activate zubovThe path can be set up with export PYTHONPATH="${PYTHONPATH}:$(pwd)"
The ROA estimation stage is implemented in train.py. An example training the Van Der Pol system will be
python src/train.py --system "van" --save_dir './van/pretrain/' \
--mode 'learnable' --learning_rate 0.0001 --max_iter 3000 --batch_size 128 --candidate_size 640 \
--lyapunov_thershold 0.99 --log_interval 10 --update_interval 100 --save_interval 200 --traj_dt 0.01 --traj_steps 8000For more information regarding the arguments, please refer to train.py. More examples on higher dimensional systems can be found in run.sh.
The CEGIS refinement stage is implemented in finetune.py. An example for the Van Der Pol system model that just gone through the ROA estimation stage with the above command shall be
python src/finetune.py --system "van" --box 6 9.6 \
--load_dir './van/pretrain/van_3000.pth' \
--save_dir './van/finetune/van.pth' --batch_size 4096 --lyap_lr 1e-4 \
--controller_lr 1e-5 --max_iter 2000 --lower_threshold 0.01 --upper_threshold 0.99 --finetune_epochs 10 \
--pgd_steps 300 --num_points 60000 --consecutive_success 100 --pgd_step_size 0.1 --bdry_scale 1The lower_threshold and upper_threshold corresponds to finetune.py.
All dynamical systems currently supported are implemented in dynamics.py, and most training configs can be found in training_config.py. To add a new system, simply add the new system implementation and an corresponding entry in the config file.
We provided a series of checkpoints for all the systems, which can be found under models. All models except those for 3D Quadrotor has been finetuned with the CEGIS refinement.
We use a state-of-the-art neural network verifier alpha-beta-CROWN to formally verify the Lyapunov condition of our trained controllers and Lyapunov functions. To verify the Lyapunov condition within a levelset bounded by
bash src/verification/run_verify.sh <system_name> <c1> <c2> <path_to_model>For example, to verify the first model trained on the Van Der Pol system, one can run:
bash src/verification/run_verify.sh van 0.0106 0.989 src/models/van/finetune/seed_0.pthA vnnlib specification file will be generated in src/verification/van, and the alpha-beta-CROWN verifier will be executed. The verification result will be appear in the output as:
############# Summary #############
Final verified acc: 100.0% (total 1 examples)
Problem instances count: 1 , total verified (safe/unsat): 1 , total falsified (unsafe/sat): 0 , timeout: 0
indicating that the property has been successfully verified.
Note:
- For the systems discussed in our paper, the corresponding box ranges are predefined in
run_verify.shunder theRADIUS_MAPdictionary. If you add new systems, you will also need to specify their box ranges in that script. - The values of
c1andc2are estimated based on training results. However, as mentioned in the paper, even if no prior values forc1andc2are available (e.g., setting them to 0 and 1 repestively), our verification algorithm can adaptively refine these thresholds and identify the optimal ones. - Our verification configurations has been tested on a GPU with 32GB memory. If you are using a GPU with less memory, consider reducing the batch size of verification by modifying the
batch_sizeparameter in the configuration files to avoid out-of-memory errors.
The final resulting controller/Lyapunov function can be checked and visualized with draw.py, which provides scripts for trajectory based verification and a PGD attack based verification. An example for the same Van Der Pol system will be
python src/draw.py --system "van" --box 6 9.6 \
--load_dir './van/finetune/van.pth' \
--save_dir './van' --plot_idx 0 1 --c1 0.0101 --c2 0.99 \
--compute_volume --dt 0.001 --simulate_steps 50000 --attackIn this command, --attack indicates whether a PGD based checking will be conducted. The --dt and --simulate_steps indicate the hyperparameter used for trajectory based verification. Moreover, if --compute_volume is used, the ROA volume will be computed with a sampling based approach.