Surrogate Model Learning

Learning surrogate modeling from scratch — starting with the basics and working toward real engineering applications.

The idea behind all of this: engineering and environmental simulations are expensive. A single CFD run or 3D hydrodynamic flood scenario can take hours to days at operational scale. Surrogate models let you run a carefully chosen set of those simulations, fit a cheap mathematical approximation, and then use that approximation for everything else — optimization, uncertainty analysis, large-scale scenario exploration. This repo is me figuring out how to build those approximations properly.

What's in here

Project 1 : GP Surrogate for the Branin Function

notebooks/01_gp_surrogate_branin.ipynb notebooks/README_project1.md

The Branin function is a standard 2D benchmark that looks like a hilly landscape. I used it as a stand-in for an expensive simulation, sampled it at 20 space-filling points chosen by Latin Hypercube Sampling, trained a Gaussian Process (GP) on those results, and asked it to predict everywhere else.
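
A minimal sketch of that workflow, not the notebook's actual code. It assumes the usual Branin domain (x1 in [-5, 10], x2 in [0, 15]), uses SciPy's `qmc.LatinHypercube` for the sampling, and picks an anisotropic RBF kernel; the seed, kernel, and test grid are illustrative choices, so the score it prints will not necessarily match the result reported below.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.metrics import r2_score


def branin(x):
    """Branin function with its standard constants; x has shape (n, 2)."""
    x1, x2 = x[:, 0], x[:, 1]
    b, c = 5.1 / (4 * np.pi**2), 5.0 / np.pi
    t = 1.0 / (8 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1 - t) * np.cos(x1) + 10.0


lower, upper = np.array([-5.0, 0.0]), np.array([10.0, 15.0])

# 20 Latin Hypercube samples in the unit square, scaled to the Branin domain.
unit = qmc.LatinHypercube(d=2, seed=0).random(n=20)
X_train = qmc.scale(unit, lower, upper)
y_train = branin(X_train)  # stand-in for 20 expensive simulation runs

# Anisotropic RBF kernel (assumed); normalize_y helps with Branin's wide output range.
kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
gp = GaussianProcessRegressor(
    kernel=kernel, normalize_y=True, n_restarts_optimizer=5, random_state=0
)
gp.fit(X_train, y_train)

# Predict on a dense grid the GP has never seen, with per-point uncertainty.
g1, g2 = np.meshgrid(np.linspace(-5, 10, 50), np.linspace(0, 15, 50))
X_test = np.column_stack([g1.ravel(), g2.ravel()])
y_pred, y_std = gp.predict(X_test, return_std=True)
print(f"R^2 on the test grid: {r2_score(branin(X_test), y_pred):.4f}")
```

`y_std` is the array behind an uncertainty map: reshaped to the grid, it can be plotted as a contour next to the prediction.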

Result: R² = 0.9553 from 20 training points

The most interesting part wasn't the accuracy number but the uncertainty map. The GP correctly flagged the corners of the design space as its least certain predictions, exactly where there was no training data. That's the GP telling you where to run your next simulation.
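
One simple way to act on that signal is to put the next run where the predictive standard deviation is largest. The sketch below is a toy illustration of that rule, not the notebook's code: the test function, the training points clustered in the middle of a unit square, and the fixed kernel (`optimizer=None`, chosen to keep the example deterministic) are all assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

rng = np.random.default_rng(0)

# Hypothetical setup: 15 training points clustered in the middle of the unit
# square, so the corners are deliberately left without data.
X_train = 0.3 + 0.4 * rng.random((15, 2))
y_train = np.sin(3.0 * X_train[:, 0]) + np.cos(2.0 * X_train[:, 1])

# Kernel hyperparameters are held fixed to keep the sketch simple and repeatable.
gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(length_scale=0.2),
    alpha=1e-8,
    optimizer=None,
    normalize_y=True,
)
gp.fit(X_train, y_train)

# Candidate locations: a regular grid over the whole design space.
g1, g2 = np.meshgrid(np.linspace(0, 1, 21), np.linspace(0, 1, 21))
candidates = np.column_stack([g1.ravel(), g2.ravel()])
_, std = gp.predict(candidates, return_std=True)

# The next simulation goes where the GP is least sure of itself.
x_next = candidates[np.argmax(std)]
print("next point to simulate:", x_next)
```

Picking the maximum-uncertainty point is the simplest possible rule; it ignores the predicted value entirely, which is fine for improving global accuracy but not for optimization.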

Project 2 : Cantilever Beam Deflection Surrogate

notebooks/02_beam_deflection_surrogate.ipynb notebooks/README_project2.md

First real engineering application. A cantilever beam's tip deflection depends on four variables: applied force, beam length, Young's modulus, and second moment of area. I treated the analytical formula as an expensive FEA solver and built a GP surrogate to replace it.
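
For reference, a sketch of the function being replaced. It assumes the standard Euler-Bernoulli result for a point load at the free end, deflection = F L³ / (3 E I); the README does not spell out the formula, and the numbers in the usage line are illustrative only.

```python
def tip_deflection(F, L, E, I):
    """Tip deflection [m] of an end-loaded cantilever: F * L^3 / (3 * E * I).

    F: force [N], L: length [m], E: Young's modulus [Pa],
    I: second moment of area [m^4].
    """
    return F * L**3 / (3.0 * E * I)


# Illustrative values (assumptions): 1 kN load on a 2 m steel beam.
delta = tip_deflection(F=1000.0, L=2.0, E=200e9, I=8e-6)
print(f"tip deflection = {delta * 1000:.3f} mm")
```

Because the function also works elementwise on NumPy arrays, it can label a whole batch of sampled designs in one call.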

First attempt with 30 samples gave R² = 0.64 and 19% average error, which was not good enough. The error plot showed the surrogate struggling hardest at small deflection values, a classic sign of sparse coverage in a 4D space.
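
The two metrics react very differently to errors at small values. The sketch below assumes "average error" means mean absolute percentage error (MAPE, the metric reported later in this README) and uses made-up numbers: the same absolute error at every point barely dents R² but dominates MAPE wherever the true value is small.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))


# Made-up deflections spanning three orders of magnitude.
y_true = np.array([0.001, 0.01, 0.1, 1.0])
y_pred = y_true + 0.005  # identical absolute error at every point

per_point = 100.0 * np.abs((y_true - y_pred) / y_true)
print("percentage error per point:", per_point)
print(f"R^2 = {r2(y_true, y_pred):.5f}, MAPE = {mape(y_true, y_pred):.1f}%")
```

This is why a surrogate can look acceptable on R² while its percentage error is driven almost entirely by the small-deflection cases.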

Two fixes: bumped samples to 80, and log-transformed the inputs that span orders of magnitude (E and I). That second fix turned out to matter more than the first.
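
A sketch of what the log-transform step can look like. The input ranges, the choice to sample E and I in log10 space, the standardization step, and the kernel are all assumptions for illustration, not the notebook's settings. If the analytical formula is the usual F L³ / (3 E I), the response depends on E and I only through their product in the denominator, so feeding the GP log10(E) and log10(I) puts those inputs on a scale where equal steps mean equal ratios.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.preprocessing import StandardScaler

# Hypothetical ranges: F [N], L [m], log10(E [Pa]), log10(I [m^4]).
lower = np.array([100.0, 0.5, 10.0, -6.0])
upper = np.array([1000.0, 2.0, 11.3, -4.0])

# 80 Latin Hypercube samples; E and I are drawn in log10 space.
design = qmc.scale(qmc.LatinHypercube(d=4, seed=0).random(n=80), lower, upper)
F, L = design[:, 0], design[:, 1]
E, I = 10.0 ** design[:, 2], 10.0 ** design[:, 3]
X_raw = np.column_stack([F, L, E, I])
y = F * L**3 / (3.0 * E * I)  # stand-in for the expensive solver


def to_features(X):
    """Replace E and I (columns 2 and 3) with their base-10 logarithms."""
    X = np.asarray(X, dtype=float).copy()
    X[:, 2:4] = np.log10(X[:, 2:4])
    return X


# Standardize the transformed inputs so one length scale per input is reasonable.
scaler = StandardScaler().fit(to_features(X_raw))
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(4))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gp.fit(scaler.transform(to_features(X_raw)), y)

# A new design must pass through the same transform before prediction.
x_new = np.array([[500.0, 1.0, 7e10, 1e-5]])
mean, std = gp.predict(scaler.transform(to_features(x_new)), return_std=True)
print(f"predicted deflection: {mean[0]:.3e} m (std {std[0]:.1e})")
```

The GP definition is the same as it would be for raw inputs; only `to_features` changes what the model sees.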

Result: R² = 1.0 | MAPE = 0.26% from 80 training points

The lesson here was that feature engineering mattered more than model choice. The GP itself didn't change at all; the only change was how the data was fed to it.

Project 3 : Surrogate Method Comparison

notebooks/03_surrogate_comparison.ipynb notebooks/README_project3.md

Same dataset, three surrogate methods head to head on the Rosenbrock function, a nonlinear benchmark with a curved valley that's easy to find but hard to follow.

Method	R²	MAPE	Train Time
RSM	0.247	1763%	0.004s
GP	1.000	0.55%	0.146s
RBF	0.895	187%	0.002s

RSM failed completely because a degree-2 polynomial can't represent a curved valley. RBF got the shape roughly right but struggled at the edges. GP scored a perfect R² to three decimals, which on noise-free data is a sign the test is too easy, so the next step is to add some noise.
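
A sketch of how such a head-to-head comparison can be set up, not the notebook's code. The domain [-2, 2]², the 50 training points, the held-out test set, and SciPy's `RBFInterpolator` with its default thin-plate-spline kernel are assumptions; the README does not state which settings produced the table, so the printed scores will differ from it. Training time is left out for brevity.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def rosenbrock(x):
    """2D Rosenbrock function; global minimum of 0 at (1, 1)."""
    return (1.0 - x[:, 0]) ** 2 + 100.0 * (x[:, 1] - x[:, 0] ** 2) ** 2


lower, upper = [-2.0, -2.0], [2.0, 2.0]
X_train = qmc.scale(qmc.LatinHypercube(d=2, seed=0).random(n=50), lower, upper)
X_test = qmc.scale(qmc.LatinHypercube(d=2, seed=1).random(n=500), lower, upper)
y_train, y_test = rosenbrock(X_train), rosenbrock(X_test)

# RSM: ordinary least squares on a full degree-2 polynomial basis.
rsm = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
rsm.fit(X_train, y_train)

# GP: anisotropic RBF kernel with a few optimizer restarts.
gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0]),
    normalize_y=True,
    n_restarts_optimizer=5,
    random_state=0,
)
gp.fit(X_train, y_train)

# RBF: exact interpolation through the training points (no smoothing).
rbf = RBFInterpolator(X_train, y_train)

scores = {
    "RSM": r2_score(y_test, rsm.predict(X_test)),
    "GP": r2_score(y_test, gp.predict(X_test)),
    "RBF": r2_score(y_test, rbf(X_test)),
}
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

All three models see exactly the same training and test points, which is what makes the scores comparable.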

The takeaway: for nonlinear problems with limited data, GP is worth the extra training time. RSM only makes sense when you have strong reason to believe the response is nearly quadratic. RBF sits in the middle: fast and decent, but with no uncertainty estimates.

The practical implication: for high-stakes scenario analysis — flood inundation mapping, climate risk propagation, any setting where each simulation run is expensive and data is limited — GP's built-in uncertainty estimates make it the right default.

Where this is heading

The methods developed here — GP surrogates, multifidelity approximation, uncertainty-aware sampling — are foundational to a larger question: can neural operator learning (DeepONet) generalize these ideas to PDE-governed field problems, where the goal is to learn the solution operator itself rather than a single scalar response? That work lives in a separate repository: NURBS_BEM_EMSolver, where I built a mesh-free electromagnetic field solver from scratch and trained a DeepONet on 500+ parametric geometries to learn the input-to-field operator directly from data.

Tools

Python · NumPy · scikit-learn · pyDOE2 · SciPy · Matplotlib · Jupyter

How to run

pip install -r Requirements.txt
jupyter notebook

Notebooks

Notebook	Topic	Key Result
`01_gp_surrogate_branin.ipynb`	GP surrogate, 2D benchmark	R² = 0.9553
`02_beam_deflection_surrogate.ipynb`	Beam deflection, 4D engineering problem	R² = 1.0
`03_surrogate_comparison.ipynb`	RSM vs GP vs RBF comparison	GP wins
