-
Notifications
You must be signed in to change notification settings - Fork 0
/
discussion.tex
82 lines (70 loc) · 10.8 KB
/
discussion.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
%! TEX root = main.tex
This case study raises significant concerns about the ability of the PINN method to predict flows with instabilities, specifically vortex shedding.
In the real world, vortex shedding is triggered by natural perturbations.
In traditional numerical simulations, however, the shedding is triggered by various numerical noises, including rounding and truncation errors.
These numerical noises mimic natural perturbations.
Therefore, a steady solution could be physically valid for cylinder flow at $Re = 200$ in a perfect world with no numerical noise.
As PINNs are also subject to numerical noise, we expected to observe vortex shedding in the simulations, but the results show that instead the data-free unsteady PINN converged to a steady-state solution.
Even the data-driven PINN reverted back to a steady-state solution beyond the timeframe that was fed with PetIBM's data.
It is unlikely that the steady-state behavior has to do with perturbations.
In traditional numerical simulations, it is sometimes challenging to induce vortex shedding, particularly in symmetrical computational domains.
However, we can still trigger shedding by incorporating non-uniform initial conditions, which serve as perturbations to the steady state solution.
In the data-driven PINN, the training data from PetIBM can be considered as such non-uniform initial conditions.
The vortex shedding already exists in the training data, yet it did not continue beyond the period of data input, indicating that the perturbation is not the primary factor responsible for the steady-state behavior.
This suggests that PINNs have a different reason for their inability to generate vortex shedding compared to traditional CFD solvers.
Other results in the literature that show the two-dimensional cylinder wake \cite{jin_nsfnets_2021} in fact are using high-fidelity DNS data to provide boundary and initial data for the PINN model.
The failure to capture vortex shedding in the data-free mode of PINN was confirmed in recent work by Rohrhofer et al. \cite{rohrhofer_fixedpoints_2023}.
The steady-state behavior of the PINN solutions may be attributed to spectral bias.
Rahaman et al. \cite{rahaman_spectral_2019} showed that neural networks exhibit spectral bias, meaning they tend to prioritize learning low-frequency patterns in the training data.
For cylinder flow, the lowest frequency corresponds to Strouhal number $St=0$.
The data-free unsteady PINN may be prioritizing learning the mode at $St=0$ (i.e., the steady mode) from the Navier-Stokes equations.
The same may apply to the data-driven PINN beyond the timeframe with training data from PetIBM, resulting in a rapid restoration to the non-oscillating solution.
Even within the timeframe with the PetIBM training data, the data-driven PINN may prioritize learning the $St=0$ mode in PetIBM's data.
Although the vortex shedding in PetIBM's data forces the PINN to learn higher-frequency modes to some extent, the shedding modes are generally more difficult to learn due to the spectral bias.
This claim is supported by the history of the drag and lift coefficients of the data-driven PINN (the red dashed line in figure \ref{fig:cylinder-re200-drag-lift}), which was still unable to predict the peak values in $t \in \left[125, 140\right]$, despite extensive training.
The suspicion of spectral bias prompted us to conduct spectral analysis by obtaining Koopman modes, presented in section \ref{sec:cylinder-re200-koopman}.
The Koopman analysis results are consistent with the existence of spectral bias: the data-driven PINN is not able to learn discrete frequencies well, even when trained with PetIBM's data that contain modes with discrete frequencies.
The Koopman analysis on the data-driven PINN's prediction reveals many additional frequencies that do not exist in the training data from PetIBM, and many damped modes that have a damping effect and reduce or prohibit oscillation.
These damped modes may be the cause of the solution restoring to a steady-state flow beyond the timeframe with PetIBM's data.
From a numerical-method perspective, the Koopman analysis shows that the PINN methods in our work are dissipative and dispersive.
The Q-criterion result (figure \ref{fig:cylinder-re200-pinn-qcriterion}) also demonstrates dissipative behavior, which inhibits oscillation and instabilities.
Dispersion can also contribute to the reduction of oscillation strength.
However, it is unclear whether dispersion and dissipation are intrinsic numerical properties or whether we did not train the PINNs sufficiently, even though the aggregated loss had converged (figure \ref{fig:cylinder-re200-pinn-loss}).
Unfortunately, limited computing resources prevented us from continuing the training---already taking orders of magnitude longer than the traditional CFD solver.
More theoretical work may be necessary to study the intrinsic numerical properties of PINNs beyond computational experiments.
Another point worth discussing is the generalizability of data-driven PINNs.
Our case study demonstrates that data-driven PINNs may not perform well when predicting data they have not seen during training, as illustrated by the unphysical predictions generated for $t = 10$ and $t = 50$ in figures \ref{fig:cylinder-re200-pinn-contours-u}, \ref{fig:cylinder-re200-pinn-contours-v}, \ref{fig:cylinder-re200-pinn-contours-p}, and \ref{fig:cylinder-re200-pinn-contours-omega_z}.
While data-driven PINNs are believed to have the advantage of performing extrapolation in a meaningful way by leveraging existing data and physical laws, our results suggest that this ``extrapolation'' capability may be limited.
In data-driven approaches, the training data typically consists of observation data (e.g., experimental or simulation data) and pure spatial-temporal points.
The ``extrapolation'' capability is therefore constrained to the coordinates seen during training, rather than arbitrary coordinates beyond the observation data.
For example, in our case study, $t \in [0, 125]$ corresponds to spatial-temporal points that were never seen during training, $t \in [125, 140]$ contains observation data, and $t \in [140, 200]$ corresponds to spatial-temporal points seen during training but without observation data.
The PINN method's prediction for $t \in [125, 140]$ is considered interpolation.
Even if we accept the steady-state solution as physically valid, then the data-driven PINN can only extrapolate for $t \in [140, 200]$, and fails to extrapolate for $t \in [0, 125]$.
This limitation means that the PINN method can only extrapolate on coordinates it has seen during training.
If the steady-state solution is deemed unacceptable, then the data-driven PINN lacks extrapolation capability altogether and is limited to interpolation.
This raises the interesting research question of how data-driven PINNs compare to traditional deep learning approaches (i.e., those not using PDEs for losses), particularly in terms of performance and accuracy benefits.
It is worth noting that Cai et al. \cite{cai_physics-informed_2021} argue that data-driven PINNs are useful in scenarios where only sparse observation data are available, such as when an experiment only measures flow properties at a few locations, or when a simulation only saves transient data at a coarse-grid level in space and time.
In such cases, data-driven PINNs may outperform traditional deep learning approaches, which typically require more data for training.
However, as we discussed in our previous work \cite{chuang_experience_2022}, using PDEs as loss functions is computationally expensive, increasing the overall computational graph exponentially.
Thus, even in the context of interpolation problems under sparse observation data, the research question of how much additional accuracy can be gained at what cost in computational expense remains an open and interesting question.
Other works have brought up concerns about the limitations of PINN methods in certain scenarios, like flows with shocks \cite{fuks_limitations_2020} and flows with fast variations \cite{krishnapriyan_failure_2021}.
These researchers suggested that the optimization process on the complex landscape of the loss function may be the cause of the failure.
And other works have also highlighted the performance penalty of PINNs compared to traditional numerical methods \cite{grossmann_pinnvsfem_2023}.
In comparison with finite element methods, PINNs were found to be orders of magnitude slower in terms of wall-clock time.
We also observed a similar performance penalty in our case study, where the PINN method took orders of magnitude longer to train than the traditional CFD solver.
We purposely used a very old GPU (NVIDIA Tesla K40) with PetIBM, running on our lab-assembled workstation, while the PINN method was run on a modern GPU (NVIDIA Tesla A100) on a high-performance computing cluster.
However, we did not conduct a thorough performance comparison.
It is unclear what a ``fair'' performance comparison would look like, as the factors affecting runtime are so different between the two methods.
An interesting third option was proposed recently, where the discretized form of the differential equations is used in the loss function, rather than the differential equations themselves \cite{karnakov_odil_2022}.
This approach foregoes the neural-network representation altogether, as the unknowns are the solution values themselves on a discretization grid.
It shares with PINNs the features of solving a gradient-based optimization problem, taking advantae of automatic differentiation, and being easily implemented in a few lines of code thanks to modern ML frameworks.
But it does not suffer from the performance penalty of PINNs, showing an advantage of several orders of magnitude in terms of wall-clock time.
Given that this approach uses a completely different loss function, it supports the claims of other researchers that the loss-function landscape is the source of problems for PINNs.
In this work, we analized data-free PINNs' failure in predicting vortex shedding using dynamic mode analysis, which is a technique based on the physical and mathematical natures of the problem.
Another direction to analyze data-free PINNs' failure is by an ablation study on data-driven PINNs, which is a machine-learning approach.
An ablation study is a technique to investigate the functionality and importance of different components in a machine-learning model by removing components one by one from a working system.
This being said, an ablation study on the interpolation part of the data-driven PINNs can potentially hint us on how vortex shedding is generated in PINNs.
Such an ablation study, nevertheless, will take non-trivial time and resources to conduct, and is left for future work.
Note that an ablation study requires a system that is working to some degree.
It is therefor not proper to apply ablation study with the data-free PINNs and the extrapolation region of the data-driven PINNs as they have not been shown to work yet.
% vim:ft=tex