%% Journal of Open Research Software Latex template -- Created By Stephen
%% Bonner and John Brennan, Durham University, UK.
%% see http://openresearchsoftware.metajnl.com
% \documentclass{../jors}
% {\bf Software paper for submission to the Journal of Open Research Software} \\
%
% \rule{\textwidth}{1pt}
\section{Overview}
%
% \vspace{0.5cm}
%
% \subsection{Title}
%
% FluidSim: modular, object-oriented Python package for high-performance CFD
% simulations
%
% \subsection{Paper Authors}
%
% 1. MOHANAN, Ashwin Vishnu$^a$ \\
% 2. BONAMY, Cyrille$^b$ \\
% 3. CALPE LINARES, Miguel$^b$ \\
% 4. AUGIER, Pierre$^b$
%
% \smallskip
%
% $^a$ Linn\'e Flow Centre, Department of Mechanics, KTH, 10044 Stockholm, Sweden.\\
% $^b$ Univ. Grenoble Alpes, CNRS, Grenoble INP\footnote{Institute of Engineering
% Univ. Grenoble Alpes}, LEGI, 38000 Grenoble, France.
%
% \subsection{Paper Author Roles and Affiliations}
% 1. Ph.D. student, Linn\'e Flow Centre, KTH Royal Institute of Technology,
% Sweden; \\
% 2. Research Engineer, LEGI, Universit\'e Grenoble Alpes, CNRS, France; \\
% 3. Ph.D. student, LEGI, Universit\'e Grenoble Alpes, CNRS, France; \\
% 4. Researcher, LEGI, Universit\'e Grenoble Alpes, CNRS, France
%
% \subsection{Abstract}
%
%\textcolor{blue}{A short (ca. 100 word) summary of the software being
%described: what problem the software addresses, how it was implemented and
%architected, where it is stored, and its reuse potential.}
% The Python package \fluidpack{sim} is introduced in this article as an extensible
% framework for Computational Fluid Mechanics (CFD) solvers.
% %
% It is developed as a part of \href{https://fluiddyn.readthedocs.io}{FluidDyn}
% project \citep{fluiddyn}, an effort to promote open-source and open-science
% collaboration within fluid mechanics community and intended for both educational
% as well as research purposes.
% %
% Solvers in \fluidpack{sim} are scalable, High-Performance Computing (HPC) codes
% which are powered under the hood by the rich, scientific Python ecosystem and the
% Application Programming Interfaces (API) provided by \fluidpack{dyn} and
% \fluidpack{fft} packages \citep{fluidfft}.
% %
% The present article describes the design aspects of \fluidpack{sim}, which
% includes use of Python as the main language; focus on the ease of use, reuse
% and maintenance of the code without compromising performance.
% %
% The implementation details including optimization methods, modular organization
% of features and object-oriented approach of using classes to implement solvers
% are also briefly explained.
% %
% Currently, \fluidpack{sim} includes solvers for a variety of physical problems
% using different numerical methods (including finite-difference methods).
% %
% However, this metapaper shall dwell only on the implementation and performance
% of its pseudo-spectral solvers, in particular the two- and three-dimensional
% Navier-Stokes solvers.
% %
% We investigate the performance and scalability of \fluidpack{sim} in a
% state of the art HPC cluster.
% %
% Three similar pseudo-spectral CFD codes based on Python (Dedalus, SpectralDNS) and
% Fortran (NS3D) are presented and qualitatively and quantitatively compared to
% \fluidpack{sim}.
% %
% The source code is hosted at Bitbucket as a Mercurial repository
% \href{https://bitbucket.org/fluiddyn/fluidsim}{bitbucket.org/fluiddyn/fluidsim}
% and the documentation generated using Sphinx can be read online at
% \href{https://fluidsim.readthedocs.io}{fluidsim.readthedocs.io}.
%
% \subsection{Keywords}
%
% % Python;\ CFD;\ HPC;\ MPI;\ modular;\ object-oriented;\ tested;\ documented;\
% % open-source
% 0D; 1D; 2D; 3D; pseudo-spectral; finite difference; computational fluid
% dynamics; sequential; MPI; GPU
\subsection{Introduction}
% \textcolor{blue}{An overview of the software, how it was produced, and the
% research for which it has been used, including references to relevant research
% articles. A short comparison with software which implements similar
% functionality should be included in this section. }
Designed as a specialized package of the
\href{https://fluiddyn.readthedocs.io}{FluidDyn} project for computational fluid
mechanics (CFD), \fluidpack{sim} is a comprehensive solution to address the needs
of a fluid mechanics student and researcher alike --- by providing scalable high
performance solvers, on-the-fly postprocessing, and plotting functionalities under
one umbrella. In the open-science paradigm, scientists will be both users and
developers of the tools at the same time.
%
An advantage of \fluidpack{sim} is that most users just have to read and
write Python code. \fluidpack{sim} ensures that all critical modules, classes and
functions are well documented --- both as inline comments and as standalone
documentation, complete with examples and tutorials. For these reasons
\fluidpack{sim} can become a true collaborative code and has the potential to
replace some in-house pseudo-spectral codes written in more conventional
languages.
\paragraphbf{Balance between runtime efficiency and cost of development}
In today's world where clusters are moving from petascale to exascale
performance, computing power is plentiful. In most research scenarios, it becomes
apparent that \emph{man-hours are more expensive than computing time}. In other
words, the cost of development outweighs the cost of computing time. Therefore,
developers should be willing to make small sacrifices in efficiency, to improve
development time, code maintainability and clarity in general. In exceptional
cases when computing time becomes more expensive, one can still use
existing optimized and performance-tuned compiled libraries implemented in
native languages as the core, and rely on high level programming languages to
extend them for specific applications\footnote{See
\citet{ramachandran_pysph_2016} for a similar discussion}.
For the above reasons, the majority of \fluidpack{sim}'s code-base, in terms of
lines of code, is written using pure Python syntax. However, this is done
without compromising performance, by making use of libraries such as \Numpy,
and optimized compilers such as \pack{Cython} and \pack{Pythran}.
\Numpy functions and data types are sufficient for applications such as
initialization and postprocessing operations, since these functions are used
sparingly. Computationally intensive tasks such as time-stepping and linear
algebra operators which are used in every single iteration must be offloaded to
compiled extensions.
%
This optimization strategy can be considered as the computational equivalent of
the \href{https://en.wikipedia.org/wiki/Pareto_principle}{Pareto principle}, also
known as the 80/20 rule\footnote{See \citet{behnel_cython2011},
\href{https://wiki.haskell.org/Why_Haskell_matters}{%
wiki.haskell.org/Why\_Haskell\_matters}}.
%
The goal is to optimize such that ``80 percent of the runtime is spent in
20 percent of the code''~\citep[]{meyers2012effective}.
%
Here, \pack{Cython} \citep{behnel_cython2011} and \pack{Pythran}
\citep{guelton2018pythran} compilers come in handy.
%
% These two compilers are presented in the companion paper on \fluidpack{dyn}
% \citep{fluiddyn} and
An example of how we use \pack{Pythran} to reach performance similar to Fortran
while writing only Python code is described in the companion paper on
\fluidpack{fft} \citep{fluidfft}.
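As a minimal illustration of this approach (the function below is a sketch and
is not taken from the \fluidpack{sim} source), a numerical kernel can be
written with plain \Numpy\ syntax and exported through a \pack{Pythran}
annotation placed in a comment; the same file remains valid pure Python when
\pack{Pythran} is not used:
\begin{minted}[fontsize=\footnotesize]{python}
import numpy as np

# pythran export compute_mean_energy(float64[][], float64[][])


def compute_mean_energy(ux, uy):
    """Mean kinetic energy density from 2D velocity arrays (plain NumPy)."""
    return 0.5 * np.mean(ux**2 + uy**2)
\end{minted}
When compiled with \pack{Pythran}, such a function runs as a native extension
while remaining importable, readable Python code.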
% There are some key differences between these packages.
% \pack{Cython}\~\cite{behnel_cython2011} is a very generic and mature library
% based on Pyrex language --- which is Python with some additional syntax. It has
% the capability to interface between other C or C++ code by generating shared
% libraries which can be imported by Python. \pack{Cython} can also generate such
% libraries from scratch and even has object-oriented syntax if required. Some of
% the drawbacks of \pack{Cython} include use of Pyrex which can feel verbose and
% esoteric to a Python developer. While it supports \Numpy\ arrays as inputs, it
% requires the developer to explicitly write loops using indices, which can be
% verbose and less generic than array notations.
% On the other hand, \pack{Pythran}\~\cite{guelton_pythran2013,
% guelton2018pythran} is a relatively recent development, specializing in
% scientific applications. \pack{Pythran} can export simple Python functions into
% extensions with the help of type annotations written in Python comments. When
% \pack{Pythran} is not being used these functions can work like plain Python
% functions. \pack{Pythran} can also recognize \Numpy functions and optimizes
% code while building extensions internally. However, \pack{Pythran} does not
% support classes and custom data types as function parameters.
The result of using such an approach is studied in the forthcoming sections by
measuring the performance of \fluidpack{sim}.
%
We will demonstrate that a very large percentage of the elapsed time is spent in
the execution of optimized compiled functions and thus that the ``Python cost'' is
negligible.
\paragraphbf{Target audiences}
\fluidpack{sim} is designed to cater to the needs of three kinds of audience.
\begin{itemize}
\item \emph{Users}, who run simulations with already available solvers. To do
this, one needs to have very basic skills in Python scripting.
\item \emph{Advanced users}, who may extend \fluidpack{sim} by developing
customized solvers for new problems by taking advantage of built-in operators and
time-stepping classes. In practice, one can easily implement such solvers by
changing a few methods in base solver classes. To do this, one needs to have
fairly good skills in Python and especially object-oriented programming.
\item \emph{Developers}, who develop the base classes, in particular, the
operators and time-stepping classes. One may also sometimes need to write compiled
extensions to improve runtime performance. To do this, desirable traits include
good knowledge in Python, \Numpy, \pack{Cython} and \pack{Pythran}.
\end{itemize}
This metapaper is intended as a short introduction to \fluidpack{sim} and its
implementation, written mainly from a user's perspective. Nevertheless, we also discuss how
\fluidpack{sim} can be customized and extended with minimal effort to promote code
reuse.
%
A more comprehensive and hands-on look at how to use \fluidpack{sim} can be found
in the tutorials\footnote{See
\href{https://fluidsim.readthedocs.io/en/latest/tutorials.html}{fluidsim.readthedocs.io/en/latest/tutorials.html}},
both from a user's and a developer's perspective.
%
In the latter half of the paper, we shall also inspect the performance of
\fluidpack{sim} in large computing clusters and compare \fluidpack{sim} with three
different pseudo-spectral CFD codes.
\subsection{Implementation and architecture}
% \textcolor{blue}{How the software was implemented, with details of the
% architecture where relevant. Use of relevant diagrams is appropriate. Please
% also describe any variants and associated implementation differences.}
New features were added to the package over the years whenever demanded by
research interests, thus making the code very user-centric and
function-oriented. This aspect of code development is termed YAGNI, one of
the principles of the agile software development method, which emphasizes not
spending a lot of time developing a functionality, because most likely
\emph{you aren't gonna need it}.
Almost all functionalities in \fluidpack{sim} are implemented as classes, and
their methods are designed to be modular and generic. This means that the user
is free to use inheritance to modify certain parts to suit one's needs, while
avoiding the risk of breaking functioning code.
\paragraphbf{Package organization}
\fluidpack{sim} is meant to serve as a framework for numerical solvers using
different methods. For the present version of \fluidpack{sim} there is support for
finite difference and pseudo-spectral methods. An example of a finite difference
solver is \codeinline{fluidsim.solvers.ad1d} which solves the 1D advection
equation. There are also solvers which do not rely on most of the base classes,
such as \codeinline{fluidsim.base.basilisk} which implements a 2D adaptive meshing
solver based on the CFD code \href{http://basilisk.fr/}{Basilisk}. The collection
of solvers using pseudo-spectral methods is more feature-rich in comparison.
The code is organized into the following sub-packages:
\begin{itemize}
\item \codeinline{fluidsim.base}: contains all base classes and a solver for the
trivial equation $\partial_t \mathbf{\hat{u}} = 0 $.
\item \codeinline{fluidsim.operators}: specialized linear algebra and numerical
method operators (e.g., divergence, curl, variable transformations, dealiasing).
\item \codeinline{fluidsim.solvers}: solvers and postprocessing modules for
problems such as 1D advection, 2D and 3D Navier-Stokes equations (incompressible
and under the Boussinesq approximation, with and without density stratification
and system rotation), one-layer shallow water and F\"oppl-von K\'arm\'an
equations.
\item \codeinline{fluidsim.util}: utilities to load and modify an existing
simulation, to test, and to benchmark a solver.
\end{itemize}
Subpackages \codeinline{base} and \codeinline{operators} form the backbone of this
package, and are not meant to be used by the user explicitly.
%
In practice, one can make an entirely new solver for a new problem using this
framework by writing one or two importable files containing three classes (a
minimal skeleton is sketched after the list):
\begin{itemize}
\item an \codeinline{InfoSolver} class\footnote{Inheriting from the base class
\codeinline{fluidsim.base.solvers.info\_base.InfoSolverBase}.}, containing the
information on which classes will be used for the different tasks in the solver
(time stepping, state, operators, output, etc.).
\item a \codeinline{Simulation} class\footnote{Inheriting from the base class
\codeinline{fluidsim.base.solvers.base.SimulBase}.} describing the equations to be
solved.
\item a \codeinline{State} class\footnote{Inheriting from the base class
\codeinline{fluidsim.base.state.StateBase}.} defining all physical variables and
their spectral counterparts being solved (for example: $u_x$ and $u_y$) and
methods to compute one variable from another.
\end{itemize}
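A hypothetical skeleton of such a file is sketched below; the class names,
docstrings and wiring are illustrative, and only the base classes are the ones
cited in the footnotes above:
\begin{minted}[fontsize=\footnotesize]{python}
# Hypothetical skeleton of a new solver file (names are illustrative).
from fluidsim.base.solvers.info_base import InfoSolverBase
from fluidsim.base.solvers.base import SimulBase
from fluidsim.base.state import StateBase


class InfoSolverMyEquation(InfoSolverBase):
    """Declare which classes are used for time stepping, state, operators,
    output, etc."""


class StateMyEquation(StateBase):
    """Define the physical and spectral variables and how to compute one
    variable from another."""


class Simul(SimulBase):
    """Describe the equations to be solved."""

    InfoSolver = InfoSolverMyEquation
\end{minted}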
We now turn our attention to the simulation object which illustrates how to
access the code from the user's perspective.
\paragraphbf{The simulation object}
The simulation object is instantiated with necessary parameters just before
starting the time stepping. A simple 2D Navier-Stokes simulation can be
launched using the following Python script:
\begin{minted}[fontsize=\footnotesize]{python}
from fluidsim.solvers.ns2d.solver import Simul
params = Simul.create_default_params()
# Modify parameters as needed
sim = Simul(params)
sim.time_stepping.start()
\end{minted}
The script first invokes the \codeinline{create\_default\_params}
\href{https://docs.python.org/3/library/functions.html#classmethod}{classmethod}
which returns a \codeinline{Parameters} object, typically named
\codeinline{params}, containing all default parameters.
%
Modifications to the simulation parameters are made after this step, to meet the
user's needs. The simulation object, typically named \codeinline{sim}, is then
instantiated by passing \codeinline{params} as the only argument, and is ready
to start the iterations.
As demonstrated above, parameters are stored in an object of the class
\codeinline{Parameters}, which uses the
\href{https://fluiddyn.readthedocs.io/en/latest/generated/fluiddyn.util.paramcontainer.html}{\codeinline{ParamsContainer}}
API\footnote{See
\href{https://fluiddyn.readthedocs.io/en/latest/generated/fluiddyn.util.paramcontainer.html}{fluiddyn.readthedocs.io/en/latest/generated/fluiddyn.util.paramcontainer.html}}
of \fluidpack{dyn} package \cite{fluiddyn}.
%
Parameters for all possible modifications to initialization, preprocessing,
forcing, time-stepping, and output of the solvers are incorporated into the
object \codeinline{params} in a hierarchical manner.
%
Once initialized, the ``public'' (not hidden) API does not allow the user to add new
parameters to this object; only modifications are permitted.\footnote{Example
on modifying the parameters for a simple simulation:
\href{https://fluidsim.readthedocs.io/en/latest/examples/running-simul-onlineplot.html}{%
fluidsim.readthedocs.io/en/latest/examples/running\_simul.html}}
This approach is different from that of conventional solvers, which rely on
text-based input files to specify parameters, a less robust approach that can
cause the simulation to crash at runtime due to human error.
%
A similar, but less readable, approach to \codeinline{ParamsContainer} is
adopted by OpenFOAM, which relies on \codeinline{dictionaries} to customize
parameters. The \codeinline{params} object can be printed on the Python or
IPython console and explored interactively using tab-completion, and can be
loaded from and saved into XML and HDF5 files, thus being very versatile.
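For instance, a few parameters of the 2D Navier-Stokes solver can be modified
through plain attribute access (a sketch; the exact parameter names depend on
the solver and can be explored interactively in \codeinline{params} itself):
\begin{minted}[fontsize=\footnotesize]{python}
from fluidsim.solvers.ns2d.solver import Simul

params = Simul.create_default_params()
# hierarchical, attribute-based modification (names depend on the solver)
params.oper.nx = params.oper.ny = 256
params.nu_2 = 1e-4
params.time_stepping.t_end = 10.0
# trying to create a parameter that does not exist raises an error,
# which protects launching scripts against typos
\end{minted}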
Note that the same simulation object is used for the plotting and post-processing
tasks. During or after the execution of a simulation, a simulation object can be
created with the following code (to restart a simulation, one would rather use the
function \codeinline{fluidsim.load\_state\_phys\_file}):
\begin{minted}[fontsize=\footnotesize]{python}
from fluidsim import load_sim_for_plot
# in the directory of the simulation
sim = load_sim_for_plot()
# or with the path of the simulation
# sim = load_sim_for_plot("~/Sim_data/NS2D.strat_240x240_S8x8_2018-04-20_13-45-54")
# to retrieve the value of a parameter
print(f"viscosity = {sim.params.nu_2}")
# to plot the space averaged quantities versus time
sim.output.spatial_means.plot()
# to load the corresponding data
data = sim.output.spatial_means.load()
# for a 2d plot of the variable "b"
sim.output.phys_fields.plot("b", time=2)
# to save a 2d animation
sim.output.phys_fields.animate("b", tmin=1, tmax=5, save_file=True)
\end{minted}
% \begin{figure}[htp]
% \centering
% \includegraphics[width=\linewidth]{Fig/fig_callgraph_inkscape_simple}
% \caption{Simplified callgraph tracing the initialization of a simulation object}\label{fig:callgraph}
% \end{figure}
%
% Fig.~\ref{fig:callgraph} provides some insight into how the base simulation
% class of all pseudo-spectral solvers are intialized,
% i.e. \codeinline{fluidsim.base.solvers.SimulBasePseudoSpectral}. All necessary
% sub-classes are then initialized by the \codeinline{\_\_init\_\_} function of
% the simulation class and customized as specified by the parameters. The objects
% of all sub-classes are attached to the simulation object, as indicated with blue
% labels in Fig.~\ref{fig:callgraph}. The sub-classes are responsible for a
% unique set of functionalities for the solver to work.
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{Fig/fig_simul_obj}
\caption{%
\href{https://www.uml-diagrams.org}{UML diagram} of the simulation
object (\codeinline{sim}) for the solver
\codeinline{fluidsim.solvers.ns3d}.
%
Each block represents an object or instance of a class, and the object name
and the class name are written as headings. The solid arrows show how objects are
associated with each other.
%
Methods and variables of significance to the user are displayed in the body of
each object block.}
\label{fig:simul}
\end{figure}
Fig.~\ref{fig:simul} demonstrates how a user can access different objects and
their associated methods through the simulation object for the solver
\codeinline{fluidsim.solvers.ns3d}. Do note that the object and method names
are similar, if not the same, for other solvers. The purpose of each object is
listed below in the order of instantiation (a short interactive example
follows the list):
\begin{outline}
\1 \codeinline{sim.params}: A copy of the \codeinline{params} object supplied
by the user, which holds all information critical to running the simulation and
generating the output.
%
\1 \codeinline{sim.info\_solver}: Contains all the subclass information,
including the module the class belongs to.
%
\1 \codeinline{sim.info}: A union of the \codeinline{sim.info\_solver} and the
\codeinline{sim.params} objects.
%
\1 \codeinline{sim.oper}: Responsible for generating the grid, and for
pseudo-spectral numerical methods such as FFT, IFFT, dealiasing, divergence,
curl, random arrays, etc.
%
\1 \codeinline{sim.output}: Takes care of all on-the-fly post-processing
outputs and functions to load and plot saved output files. Different objects
are assigned the tasks of loading, plotting, and sometimes computing:
%
\2 \codeinline{sim.output.print\_stdout}: the mean energies, time elapsed and
time-step of the simulation printed as console output.
%
\2 \codeinline{sim.output.phys\_fields}: the state variables in the physical
plane. It relies on \codeinline{sim.state} to load or compute the variables
into arrays.
%
\2 \codeinline{sim.output.spatial\_means}: mean quantities such as energy,
enstrophy, forcing power, dissipation.
%
\2 \codeinline{sim.output.spectra}: energy spectra as line plots (i.e.\ as
functions of the modulus or of a component of the wavenumber).
%
\2 \codeinline{sim.output.spect\_energy\_budg}: spectral energy budget by
calculating the transfer term.
%
\2 \codeinline{sim.output.increments}: structure functions from physical
velocity fields.
%
\1 \codeinline{sim.state}: Defines the names of the physical variables being
solved for and their spectral equivalents, along with all required variable
transformations.
%
Also includes high-level objects, aptly named \codeinline{sim.state.state\_phys}
and \codeinline{sim.state.state\_spect} to hold the arrays.
%
\1 \codeinline{sim.time\_stepping}: Generic numeric time-integration object
which dynamically determines the time-step using the CFL criterion for the
specific solver and advances the state variables using a Runge-Kutta method of
order 2 or 4.
%
\1 \codeinline{sim.init\_fields}: Used only once to initialize all state variables
from a previously generated output file or with simple kinds of flow structures,
for example a dipole vortex, a base flow with a constant value at all gridpoints,
a grid of vortices, narrow-band noise, etc.
%
\1 \codeinline{sim.forcing}: Initialized only when
\codeinline{params.forcing.enable} is set to \codeinline{True}; it computes
the forcing variables, which are added to the right-hand side of the equations
being solved.
%
\1 \codeinline{sim.preprocess}: Adjusts solver parameters such as the magnitude
of initialized fields, viscosity value and forcing rates after all other
subclasses are initialized, and just before the time-integration starts.
\end{outline}
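As a short interactive example based on the objects listed above (for a
simulation created as in the launching script shown earlier):
\begin{minted}[fontsize=\footnotesize]{python}
# sketch of interactive usage of the objects attached to `sim`
print(sim.params.nu_2)           # parameters attached to the simulation
print(sim.info_solver)           # which classes the solver is built from
fields = sim.state.state_phys    # container holding the physical arrays
sim.time_stepping.start()        # advance the state variables in time
sim.output.spatial_means.plot()  # plot an on-the-fly output
\end{minted}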
Such a modular organization of the solver's features has several advantages.
The most obvious one is the ease of maintaining the code base. As opposed to a
monolithic solver, modular codes are well separated and lead to fewer
conflicts while merging changes from other developers. Secondly, with this
implementation, it is possible to extend or replace a particular set of
features by inheriting from or defining a new class.
In static languages, modular codes can be difficult to navigate, and the
connections between objects and classes can be hard to understand. This is
much less of a problem with Python, where one can easily decipher this
information from object attributes or by using IPython's dynamic object
information feature\footnote{See
\href{https://ipython.readthedocs.io/en/stable/interactive/%
reference.html\#dynamic-object-information}{ipython.readthedocs.io}.}.
%
Now, \fluidpack{sim} goes one step further and one can effortlessly print the
\codeinline{sim.info\_solver} object in the Python / IPython console, to get this
information. A truncated example of the output is shown below.
\begin{minted}[fontsize=\footnotesize]{python}
>>> sim.info_solver
<fluidsim.solvers.ns3d.solver.InfoSolverNS3D object at 0x7fb6278263c8>
<solver class_name="Simul" module_name="fluidsim.solvers.ns3d.solver"
short_name="ns3d">
<classes>
<Operators class_name="OperatorsPseudoSpectral3D"
module_name="fluidsim.operators.operators3d"/>
<State class_name="StateNS3D" keys_computable="['rotz']"
keys_linear_eigenmodes="['rot_fft']" keys_phys_needed="['vx', 'vy',
'vz']" keys_state_phys="['vx', 'vy', 'vz']"
keys_state_spect="['vx_fft', 'vy_fft', 'vz_fft']"
module_name="fluidsim.solvers.ns3d.state"/>
<TimeStepping class_name="TimeSteppingPseudoSpectralNS3D"
module_name="fluidsim.solvers.ns3d.time_stepping"/>
<InitFields class_name="InitFieldsNS3D"
module_name="fluidsim.solvers.ns3d.init_fields">
<classes>
<from_file class_name="InitFieldsFromFile"
module_name="fluidsim.base.init_fields"/>
<from_simul class_name="InitFieldsFromSimul"
module_name="fluidsim.base.init_fields"/>
<in_script class_name="InitFieldsInScript"
module_name="fluidsim.base.init_fields"/>
<constant class_name="InitFieldsConstant"
module_name="fluidsim.base.init_fields"/>
<!--truncated output-->
</classes>
</solver>
\end{minted}
Note that while the 3D Navier-Stokes solver relies on some generic \emph{base
classes}, such as \codeinline{OperatorsPseudoSpectral3D} and
\codeinline{TimeSteppingPseudoSpectral}, shared with other solvers, for other
purposes there are \emph{solver-specific classes}. The latter often inherit
from the base classes in \codeinline{fluidsim.base} or from classes available
in other solvers; this is made possible by the use of an object-oriented
approach. It is particularly advantageous to use
\href{https://docs.python.org/3/tutorial/classes.html#inheritance}{%
class inheritance} while extending existing features or creating new solvers,
as sketched in the example below.
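A customized 3D Navier-Stokes solver could, for instance, reuse everything
from \codeinline{fluidsim.solvers.ns3d} and override a single method (a
hypothetical sketch; the overridden method name and its signature are
illustrative):
\begin{minted}[fontsize=\footnotesize]{python}
# Hypothetical extension of the ns3d solver through inheritance; the
# overridden method name and its signature are illustrative.
from fluidsim.solvers.ns3d.solver import Simul as SimulNS3D


class Simul(SimulNS3D):
    """ns3d solver with an extra term added to the tendencies."""

    def tendencies_nonlin(self, state_spect=None):
        tendencies = super().tendencies_nonlin(state_spect)
        # ... modify `tendencies` here (e.g. add a damping or forcing term)
        return tendencies
\end{minted}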
\subsubsection{Performance}
The performance of a code can be carefully measured by three different program
analysis methods: profiling, micro-benchmarking and scalability analysis.
%
\href{https://en.wikipedia.org/wiki/Profiling_(computer_programming)}{Profiling}
traces various function calls and records the cumulative time consumed and the
number of calls for each function. Through profiling, we shed light on what
part of the code consumes the lion's share of the computational time and
observe the impact as the number of processes and MPI communications increases.
%
\href{https://en.wiktionary.org/wiki/microbenchmark}{Micro-benchmarking} is the
process of timing a specific portion of the code and comparing different
implementations. This aspect is addressed in greater detail in the companion
paper on \fluidpack{fft} \citep{fluidfft}.
%
On the other hand,
\href{https://en.wikipedia.org/wiki/Scalability#Performance_tuning_versus_hardware_scalability}{a
scalability study} measures how the speed of the code improves when it is deployed
with multiple CPUs operating in parallel. To do this, the walltime required by
the whole code to complete a certain number of iterations is measured for a
given problem size.
%
Finally, performance can also be studied by comparing different codes on
representative problems. Since such comparisons should not focus only on
performance, we present a comparison study in a separate section.
\begin{table}[h]
\centering
\begin{tabular}{l p{7cm}}
\toprule
Cluster & Beskow (Cray XC40 system with Aries interconnect) \\
CPU & Intel Xeon CPU E5--2695v4, 2.1GHz \\
Operating System & SUSE Linux Enterprise Server 11, Linux Kernel 3.0.101\\
No.\ of cores per node used & 32 \\
Maximum no.\ of nodes used & 32 (2D cases), 256 (3D cases) \\
Compilers & CPython 3.6.5, Intel C++ Compiler (\pack{icpc}) 18.0.0 \\
Python packages & \fluidpack{dyn} 0.2.3, \fluidpack{fft} 0.2.3,
\fluidpack{sim} 0.2.1, \pack{numpy} (OpenBLAS) 1.14.2, \pack{Cython} 0.28.1,
\pack{mpi4py} 3.0.0, \pack{pythran} 0.8.5 \\
\bottomrule
\end{tabular}
\caption{Specifications of the supercomputing cluster and software used for profiling and benchmarking.}
\label{tab:specs}
\end{table}
The profiling and scaling tests were conducted on a supercomputing cluster of
the Swedish National Infrastructure for Computing (SNIC), namely Beskow (PDC,
Stockholm). Relevant details regarding software and hardware which affect
performance are summarised in Table~\ref{tab:specs}. Note that no
hyperthreading was used while carrying out the studies.
%
The code comparisons were made on a smaller machine.
%
Results from the three analyses are systematically studied in the following
sections.
\paragraphbf{Profiling}
It is straightforward to perform profiling with the help of the
\href{https://docs.python.org/3/library/profile.html}{cProfile} module, available
in the Python standard library.
%
For \fluidpack{sim}, this module has been conveniently packaged into a command
line utility, \codeinline{fluidsim-profile}.
%
Here, we have analyzed both 2D and 3D Navier-Stokes solvers in Beskow, and
plotted the results in Fig.~\ref{fig:profile2d} and Fig.~\ref{fig:profile3d}
respectively. Functions which consume less than $2\%$ of the total time are
displayed within a single category, \emph{other}.
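The same kind of measurement can be reproduced directly with the standard
library; the following is a minimal sketch of what the
\codeinline{fluidsim-profile} utility automates:
\begin{minted}[fontsize=\footnotesize]{python}
import cProfile
import pstats

# profile the time stepping of an already instantiated `sim` object
cProfile.runctx("sim.time_stepping.start()", globals(), locals(),
                "profile.pstats")
stats = pstats.Stats("profile.pstats")
stats.sort_stats("cumulative").print_stats(10)  # ten most expensive calls
\end{minted}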
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{tmp/fig_profile2d}
\caption{Profiling analysis of the 2D Navier-Stokes
(\codeinline{fluidsim.solvers.ns2d}) solver using a grid sized $1024\times1024$
(a) in sequential with \codeinline{fft2d.with\_fftw1d} operator and (b) with 8
processes with \codeinline{fft2d.mpi\_with\_fftwmpi2d}
operator.}\label{fig:profile2d}
\end{figure}
In Fig.~\ref{fig:profile2d} both sequential and parallel profiles of the 2D
Navier-Stokes solver show that the majority of the time is spent in inverse and forward
FFT calls (\codeinline{ifft\_as\_arg} and \codeinline{fft\_as\_arg}). For the
sequential case, approximately 0.14\% of the time is spent in pure Python
functions, i.e.\ functions not built using \pack{Cython} and \pack{Pythran}.
%
\pack{Cython} extensions are responsible for interfacing with FFT operators and
also for the time-step algorithm. \pack{Pythran} extensions are used to translate
most of the linear algebra operations into optimized, statically compiled
extensions.
%
We also see that only 1.3\% of the time is not spent in the main six functions
(category \emph{other} in the figure).
%
With 8 processes deployed in parallel, the time spent in pure Python functions
increases to 1.1\% of the total time.
%
These results show that during the optimization process, we have to focus on a
very small number of functions.
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{tmp/fig_profile3d}
\caption{Profiling analysis of the 3D Navier-Stokes
(\codeinline{fluidsim.solvers.ns3d}) solver.
%
Top row: grid sized $128\times128\times128$ solved (a) sequentially using
\codeinline{fft3d.with\_fftw3d} operator and (b) with 8 processes using
\codeinline{fft3d.mpi\_with\_fftwmpi3d} operator.
%
Bottom row: grid sized $512\times512\times512$ using
\codeinline{fft3d.mpi\_with\_fftwmpi3d} operator (c) with 2 processes and
(d) with 128 processes.}
\label{fig:profile3d}
\end{figure}
Fig.~\ref{fig:profile3d} shows that, for the 3D Navier-Stokes
solver, in all cases the majority of the time is attributed to FFT calls. The overall time
spent in pure Python functions ranges from 0.001\% for $512^3$ grid points and 2
processes to 0.46\% for $128^3$ grid points and 8 processes.
%
This percentage tends to increase with the number of processes used since the real
calculation done in compiled extensions takes less time.
%
This percentage is also higher for the coarser resolution for the same reason.
%
However, in all cases the time spent in pure Python remains largely negligible
compared to the time spent in compiled extensions.
\paragraphbf{Scalability}
Scalability can be quantified by the speedup $S$, which is a measure of the time taken
to complete $N$ iterations for different numbers of processes, $n_p$.
%We shall
%refrain from comparing sequential runs in this context, since the operators used
%for the sequential mode differ from the parallel mode, especially the FFT
%class.
Speedup is formally defined here as:
\begin{equation}
S_\alpha(n_p) = \frac
{[\mathrm{Time\ elapsed\ for\ } N \mathrm{\ iterations\ with\ }n_{p,\min}\mathrm{\ processes}]_{\mathrm{fastest}}
\times n_{p,\min}}
{[\mathrm{Time\ elapsed\ for\ } N \mathrm{\ iterations\ with\ } n_p \mathrm{\
processes}]_\alpha}
\label{eq:speedup}
\end{equation}
where $n_{p,\min}$ is the minimum number of processes employed for a specific
array size and hardware, $\alpha$ denotes the FFT class used and ``fastest''
corresponds to the fastest result among various FFT classes.
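In practice, the speedup defined in Eq.~\ref{eq:speedup} is obtained by a
simple post-processing of the measured elapsed times, as in the following
sketch (not part of \fluidpack{sim}):
\begin{minted}[fontsize=\footnotesize]{python}
def compute_speedup(times_alpha, time_fastest_min, n_p_min):
    """Speedup as defined above: `times_alpha` maps n_p to the elapsed time
    for N iterations with the FFT class alpha; `time_fastest_min` is the
    fastest elapsed time among all FFT classes with n_p_min processes."""
    return {n_p: time_fastest_min * n_p_min / t
            for n_p, t in times_alpha.items()}
\end{minted}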
%
In addition to the number of processes, there is another important parameter, which
is the size of the problem; in other words, the number of grid points used to
discretize the problem at hand.
%
In \emph{strong scaling} analysis, we keep the global grid-size fixed and
increase the number of processes.
Ideally, this should yield a speedup which increases linearly with the number of
processes. Realistically, as the number of processes increases, so does the
number of MPI communications, contributing to some latency in the overall time
spent and thus resulting in less than ideal performance.
%
Also, as shown by profiling in the previous section, the majority of the time is
consumed in making forward- and inverse-FFT calls, an inherent bottleneck of
the pseudo-spectral approach. The FFT function calls are the source of most of
the MPI calls during runtime, limiting the parallelism.
\paragraphbf{2D benchmarks}\label{sec:bench2d}
The 2D Navier-Stokes solver (\codeinline{fluidsim.solvers.ns2d}) solving an
initial value problem (with random fields) was chosen as the test case for strong
scaling analysis here. The physical grid was discretized
%
with $1024\times1024$ and $2048\times2048$ points.
%
The fourth-order Runge-Kutta (RK4) method with a constant time-step was used for
time-integration.
%
File input-output and the forcing term were disabled so as to measure the
performance accurately. The test case was then executed for 20 iterations.
%for one or more passes
The time elapsed was measured just before and after the
\codeinline{sim.time\_stepping.start()} function call, which was then utilized
to calculate the average walltime per iteration and speedup.
%
This process is repeated for two different FFT classes provided by
\fluidpack{fft}, namely \codeinline{fft2d.mpi\_with\_fftw1d} and
\codeinline{fft2d.mpi\_with\_fftwmpi2d}.
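A sketch of the benchmarking script is given below; only the grid size is set
explicitly, while the other settings described above (RK4 scheme, constant
time step, number of iterations, disabled output and forcing) are set through
\codeinline{params} in the same way, with solver-dependent parameter names:
\begin{minted}[fontsize=\footnotesize]{python}
from time import perf_counter

from fluidsim.solvers.ns2d.solver import Simul

params = Simul.create_default_params()
params.oper.nx = params.oper.ny = 1024
# RK4, constant time step, 20 iterations, no file output, no forcing:
# set through `params` as well (parameter names omitted here)

sim = Simul(params)
t_start = perf_counter()
sim.time_stepping.start()
print("mean walltime per iteration:", (perf_counter() - t_start) / 20, "s")
\end{minted}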
Although the FFT classes are different for sequential and parallel
runs, for the sake of comparison, we have tabulated the actual
walltime elapsed in Table~\ref{table:seqpar}. Clearly, due to the differences
in the implementations, the \codeinline{fftw1d}-based classes take a
performance hit with parallelization, while for the classes based on
\codeinline{fftw2d} the performance improves.
\begin{table}
\centering
\begin{tabular}{lrlr}
\hline
FFT class & Time ($n_p=1$) & FFT class & Time ($n_p=2$) \\
\hline
\codeinline{fft2d.with\_fftw1d} & 6.63 &
\codeinline{fft2d.mpi\_with\_fftw1d} & 7.63 \\
\codeinline{fft2d.with\_fftw2d} & 5.59 &
\codeinline{fft2d.mpi\_with\_fftwmpi2d} & 3.91 \\
\hline
\end{tabular}
\caption{Elapsed times (in seconds) for twenty time steps of the $1024\times1024$ case
with the 2D Navier-Stokes (\codeinline{fluidsim.solvers.ns2d}) solver.}
\label{table:seqpar}
\end{table}
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{tmp/fig_bench_strong2d}
\caption{Strong scaling benchmarks of the 2D Navier-Stokes
(\codeinline{fluidsim.solvers.ns2d}) solver. The number of cores $n_p$ goes from 2
to $2^{10} = 1024$. Crosses and dots correspond to $1024\times1024$ and
$2048\times2048$ grid points, respectively.}
\label{fig:strong2d}
\end{figure}
In Fig.~\ref{fig:strong2d} we have analyzed the strong scaling speedup $S$ and
walltime per iteration. The fastest result for a particular case is assigned
the value $S=n_p$, as defined earlier in Eq.~\ref{eq:speedup}. The ideal speedup
is indicated with a dotted black line and varies linearly with the number of
processes. We notice that for the $1024\times1024$ case there is a clear
increasing trend in speedup for intra-node computations.
%
Nevertheless, when this test case is solved with more than one node ($n_p > 32$), the
speedup drops abruptly. While it may be argued that the speedup is impacted by
the cost of inter-node MPI communications via network interfaces, that is not
the case here. This is shown by the speedup for the $2048\times2048$ case, which
increases from $n_p = 32$ to $64$, after which it drops again. It is thus
important to remember that a decisive factor in pseudo-spectral simulations is
the choice of the grid size, both global and local (per-process): for certain
shapes the FFT calls can be exceptionally fast or, conversely, slow.
%MPI operations are especially slower in inter-node computation, since nodes
%communicate over network interfaces.
%
From the above results, it may also be inferred that superior performance is
achieved through the use of \codeinline{fft2d.mpi\_with\_fftwmpi2d} as the FFT
method. The \codeinline{fft2d.mpi\_with\_fftw1d} method serves as a fallback
option when either the FFTW library is not compiled with MPI bindings or the
domain decomposition results in zero-sized arrays. The latter is a known issue
with the current version of \fluidpack{fft} and \fluidpack{sim} and requires
further development\footnote{See
\href{https://bitbucket.org/fluiddyn/fluidfft/issues/4}{FluidFFT issue 4}}.
This is the reason why the scalability of
\codeinline{fft2d.mpi\_with\_fftwmpi2d} seems to be restricted
in Fig.~\ref{fig:strong2d}, whereas in reality it can be as scalable as
\codeinline{fft2d.mpi\_with\_fftw1d}.
To the right of Fig.~\ref{fig:strong2d}, the walltime required to
perform a single iteration (in seconds) is found to vary inversely
with the number of processes, $n_p$. The walltime per iteration ranges from
$0.195$ to $0.023$ seconds for the $1024\times1024$ case, and from
$0.128$ to $0.051$ seconds for the $2048\times2048$ case. Thus it is indeed
feasible and scalable to use this particular solver.
\paragraphbf{3D benchmarks}
Using a process similar to that described in the previous section,
%~\ref{sec:bench2d},
the 3D Navier-Stokes solver (\codeinline{fluidsim.solvers.ns3d}) is chosen to
perform the 3D benchmarks.
%
As demonstrated in Fig.~\ref{fig:strong3d_beskow}, two physical global grids
with $128\times128\times128$ and $1024\times1024\times1024$ points are used to
discretize the domain.
%
Other parameters are identical to what was described for the 2D benchmarks.
Through \fluidpack{fft}, this solver has four FFT methods at its disposal:
\begin{itemize}
\item \codeinline{fft3d.mpi\_with\_fftw1d}
\item \codeinline{fft3d.mpi\_with\_fftwmpi3d}
\item \codeinline{fft3d.mpi\_with\_p3dfft}
\item \codeinline{fft3d.mpi\_with\_pfft}
\end{itemize}
The first two methods implement a 1D or \emph{slab} decomposition, i.e.\ the
processes are distributed over one index of a 3D array, while the last two
methods implement a 2D or \emph{pencil} decomposition. For the sake of clarity,
we have restricted this analysis to the fastest FFT method of the two types in
this configuration, namely \codeinline{fft3d.mpi\_with\_fftwmpi3d} and
\codeinline{fft3d.mpi\_with\_p3dfft}. A more comprehensive study of the
performance of these FFT methods can be found in \citet{fluidfft}.
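The FFT method used by a solver is itself selected through the parameters
object, as in the following sketch (the parameter name is assumed; the strings
correspond to the \fluidpack{fft} methods listed above):
\begin{minted}[fontsize=\footnotesize]{python}
from fluidsim.solvers.ns3d.solver import Simul

params = Simul.create_default_params()
# assumed parameter name; the value selects one of the fluidfft methods
params.oper.type_fft = "fft3d.mpi_with_fftwmpi3d"
\end{minted}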
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{tmp/fig_bench_strong3d}
\caption{Strong scaling benchmarks of the 3D Navier-Stokes
(\codeinline{fluidsim.solvers.ns3d}) solver in Beskow. The number of cores $n_p$
goes from $2^5 = 32$ to $2^{13} = 8192$. Crosses and dots correspond to $128^3$
and $1024^3$ grid points, respectively.}
\label{fig:strong3d_beskow}
\end{figure}
In Fig.~\ref{fig:strong3d_beskow} the strong scaling speedup and walltime per
iteration are plotted from the 3D benchmarks in Beskow.
%
The analysis here is limited to single-node and inter-node performance.
%
For both grid-sizes analyzed here, the \codeinline{fft3d.mpi\_with\_fftwmpi3d}
method is the fastest of all methods but is limited in scalability because of the
1D domain decomposition strategy. To utilize a large number of processors, one
requires the 2D decomposition approach. Also, note that for the
$1024\times1024\times1024$ case, a single-node measurement was not possible as
the size of the arrays required to run the solvers exceeds the available
memory. For the same case, a speedup reasonably close to linear variation is
observed with \codeinline{fft3d.mpi\_with\_p3dfft}.
%
It is also shown that the walltime per iteration improved from
%
$0.083$ to $0.027$ seconds for the $128\times128\times128$ case, and from
$31.078$ to $2.175$ seconds for the $1024\times1024\times1024$ case.
\subsection{CFD pseudo-spectral code comparisons}
%TODO-DONE: Compare profiling of Dedalus SpectralDNS, NS3D
% See the file notes_compare_codes.md
% Global comparison, not only performance (which is only one aspect for a CFD
% code)
% First broad description of the three codes
As a general CFD framework, \fluidpack{sim} could be compared to OpenFOAM (a CFD
framework based on finite-volume methods).
%
However, in contrast to OpenFOAM, the current version of \fluidpack{sim} is highly
specialized in pseudo-spectral Fourier methods and it is not adapted for
industrial CFD.
In this subsection, we compare \fluidpack{sim} with three other open-source CFD
pseudo-spectral codes\footnote{For the sake of conciseness, we limit this
comparison to only four codes. We have also found the Julia code
\href{https://github.com/FourierFlows/FourierFlows.jl}{FourierFlows.jl} to
demonstrate interesting performance for 2D sequential runs, but without support
for 3D cases and MPI parallelization.}:
\begin{itemize}
\item \href{http://dedalus-project.org/}{Dedalus} \citep{burns_dedalus} is ``a
flexible framework for spectrally solving differential equations''. It is very
versatile and the user describes the problem to be solved symbolically.
%
This approach is very different from that of \fluidpack{sim}, where the
equations are described with simple \Numpy code. There is no equivalent of the
\fluidpack{sim} concept of a ``solver'', i.e.\ a class corresponding to a set of
equations with specialized outputs (with the corresponding plotting methods). To
run a simulation with Dedalus, one has to describe the problem using mathematical
equations. This can be very convenient because it is very versatile and it is not
necessary to understand how Dedalus works to define a new problem. However, this
approach also has drawbacks:
\begin{itemize}
\item Even for very standard problems, one needs to describe the problem in the
launching script.
\item There is a potentially long initialization phase during which Dedalus
processes the user input and prepares the ``solver''.
\item Even when a user knows how to define a problem symbolically, it is not
simple to understand how the problem is solved by Dedalus and how to interact with
the program with Python.
\item Since solvers are not implemented out-of-the-box in Dedalus,
specialized forcing schemes or outputs are absent. For example, the user has to
implement the computation, saving and plotting of standard outputs like energy
spectra.
\end{itemize}
\item \href{https://github.com/spectralDNS/spectralDNS}{SpectralDNS}
\citep{mortensen_spectraldns2016} is a ``high-performance pseudo-spectral
Navier-Stokes DNS solver for triply periodic domains. The most notable feature of
this solver is that it is written entirely in Python using \Numpy, MPI
for Python (\pack{mpi4py}) and \pack{pyFFTW}.''
Therefore, SpectralDNS is technically very similar to \fluidpack{sim}.
%
Some differences are that SpectralDNS has no object-oriented API, and that the user
has to define output and forcing in the launching script\footnote{See
\href{https://github.com/spectralDNS/spectralDNS/tree/master/demo}{the demo
scripts of SpectralDNS}.}, which are thus usually much longer than for
\fluidpack{sim}.
%
Moreover, the parallel Fourier transforms are done using the Python package
\href{https://github.com/spectralDNS/mpiFFT4py}{\pack{mpiFFT4py}}, which is only
able to use the FFTW library and not other libraries as with \fluidpack{fft}
\citep[][]{fluidfft}.
\item \href{https://bitbucket.org/paugier/ns3d}{NS3D} is a highly efficient
pseudo-spectral Fortran code.
%
It has been written in the laboratory
\href{https://www.ladhyx.polytechnique.fr}{LadHyX} and used for several studies
involving simulations (in 3D and in 2D) of the Navier-Stokes equations under the
Boussinesq approximation with stratification and system
rotation~\citep[][]{DeloncleBillantChomaz2008}. It is specialized, in particular, in
stability studies~\citep[][]{BillantDeloncleChomazOtheguy2010}.
%
NS3D has been highly optimized and it is very efficient for sequential and
parallel simulations (using MPI and OpenMP). However, the parallelization is
limited to 1D decomposition for the FFT~\citep[][]{fluidfft}.
%
Another weakness compared to \fluidpack{sim} is that NS3D uses simple binary files
instead of the HDF5 and NetCDF4 files used by \fluidpack{sim}. Therefore, visualization
programs like Paraview or Visit cannot load NS3D data.
As with many Fortran codes, Bash and Matlab are used for launching and
post-processing, respectively.
%
In terms of user experience, this can be a drawback compared to the coherent
framework \fluidpack{sim} for which the user works only with Python.
\end{itemize}
\paragraph{Description of the test.} We have formulated the benchmark tests
such that all the aforementioned codes solve the Navier-Stokes equations expressed
in the
\begin{itemize}
\item stream function-vorticity ($\psi-\zeta$) formulation for the
bidimensional simulations,
\begin{align*}
\p_t \hat\zeta &= -(\widehat{\mathbf{u} \cdot \nabla \zeta}) - \nu
\kappa^2 \hat \zeta \\
-\kappa^2 \hat \psi &= \hat \zeta \\
\mathbf{\hat u} &= i\mathbf{k} \times \mathbf{\hat e}_{k_z} \hat \psi
\end{align*}
where the hat ($\hat{}$) notation signifies spectral quantities,
$\mathbf{u}$ and $\mathbf{k}$ represent the velocity and wavenumber vectors,
$\nu$ is the viscosity, and $\kappa = |\mathbf{k}|$ (see
\href{https://fluidsim.readthedocs.io/en/latest/generated/fluidsim.solvers.ns2d.solver.html}{\codeinline{fluidsim.solvers.ns2d}
solver documentation}).
\item rotational form for the tridimensional simulations,
\begin{align*}
\p_t \mathbf{\hat u} = P_\perp \widehat{(\mathbf{u} \times \boldsymbol{\zeta})}
- \nu \kappa^2 \mathbf{\hat u}
\end{align*}
where the $P_\perp$ operator projects the vector field onto a divergence-free
plane (see
\href{https://fluidsim.readthedocs.io/en/latest/generated/fluidsim.solvers.ns3d.solver.html}{\codeinline{fluidsim.solvers.ns3d}
solver documentation} and \citet{canuto_algorithms_1988}) and is equivalent
to a pressure correction step.
\end{itemize}
We compare the codes with a very simple and standard task, running a solver for