Parallel GPU Optimization for a Fluid Dynamics Solver

This project optimizes a fluid solver that was originally implemented in sequential C/C++, without parallelization.
The code was restructured to use GPU acceleration and other performance improvements, reducing runtime from several weeks to a few seconds, depending on the simulation size and GPU hardware.

📄 Supporting Report

The PDF file at the root of this repository (WA_P1-7.pdf) contains a detailed report in Portuguese.
The report explains and justifies the entire GPU optimization process, including the modifications made to the original C/C++ code, the performance improvements, and the testing methodology.
It also includes instructions for applying an OpenMP optimization to the original code.
The original code is available in the 3DFluid GitHub repository.

📂 Where is the code?

The main implementation is located in the following source files:

main.cu – entry point of the program
fluid_solver.cu – GPU-optimized fluid solver
EventManager.cpp – event handling utilities

⚙️ Requirements

To build and run the solver you will need:

gcc (GNU Compiler Collection)
nvcc (NVIDIA CUDA Compiler)
An NVIDIA GPU compatible with your chosen CUDA architecture

🚀 How to Build and Run

The src/ directory contains a Makefile.
Run make in that directory to compile; this produces an executable named fluid_sim. By default, the solver is compiled with the following command:

nvcc -O3 -arch=sm_35 -Wno-deprecated-gpu-targets -use_fast_math -maxrregcount=32 main.cu fluid_solver.cu EventManager.cpp -o fluid_sim

⚠️ Note: Adjust the flags to match your GPU model. The provided configuration targets NVIDIA Kepler GPUs (sm_35 corresponds to compute capability 3.5).
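
As a rough illustration of adapting the flags, the sketch below derives the -arch value from a GPU's compute capability and prints the resulting nvcc command without running it. The helper cc_to_arch and the COMPUTE_CAP variable are hypothetical names introduced here, not part of the repository's Makefile; the remaining flags are copied from the default command above.

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "7.5"
# into the matching -arch value ("sm_75") by dropping the dot.
cc_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Assumption: COMPUTE_CAP is set by hand; 3.5 matches the default sm_35.
COMPUTE_CAP="${COMPUTE_CAP:-3.5}"
ARCH="$(cc_to_arch "$COMPUTE_CAP")"

# Print the command instead of running it, so it can be inspected first.
echo "nvcc -O3 -arch=${ARCH} -Wno-deprecated-gpu-targets -use_fast_math -maxrregcount=32 main.cu fluid_solver.cu EventManager.cpp -o fluid_sim"
```

Running the script with COMPUTE_CAP=7.5 prints a command containing -arch=sm_75. Other flags, such as -maxrregcount=32, may also need retuning on a different architecture.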

Run locally

On Linux, you can run:

./fluid_sim

If you want to check execution time:

time ./fluid_sim
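
A single time measurement covers the whole process, including startup and GPU initialization, and can vary from run to run. As a minimal sketch, the hypothetical helper time_runs below (not part of the repository) repeats a command and reports the wall-clock time of each run. It assumes GNU date, whose %N field gives nanoseconds.

```shell
# Run a command several times and print each run's wall-clock time in milliseconds.
# Usage: time_runs <number of runs> <command> [arguments...]
# Assumes GNU date (for the %N nanoseconds field), as found on most Linux systems.
time_runs() {
    runs="$1"
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start="$(date +%s%N)"
        "$@" > /dev/null          # discard the command's standard output
        end="$(date +%s%N)"
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$(( i + 1 ))
    done
}
```

For example, time_runs 3 ./fluid_sim prints one line per run, which makes it easier to spot a slow first run or other outliers.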

Run on SLURM

The Makefile also includes a target to submit jobs via SLURM. Use:

make run

This submits the run.sh script to the scheduler with sbatch.
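
The contents of run.sh are not reproduced in this README. For orientation, a batch script for a single-GPU job might look like the sketch below. The partition name, time limit, and output file name are assumptions that depend on your cluster, and the repository's actual script may differ.

```shell
#!/bin/bash
#SBATCH --job-name=fluid_sim
#SBATCH --partition=gpu              # hypothetical partition name; cluster-specific
#SBATCH --gres=gpu:1                 # request one GPU
#SBATCH --time=00:05:00              # assumed wall-time limit
#SBATCH --output=fluid_sim_%j.out    # %j expands to the job ID

# Run the solver only if it has been built; otherwise report the problem.
BIN=./fluid_sim
if [ -x "$BIN" ]; then
    "$BIN"
else
    echo "$BIN not found; run make first" >&2
fi
```

A script like this would be submitted with sbatch, and its output would land in the file named by the --output directive.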

🧹 Cleaning Up

To remove the compiled binary and clean the project, run:

make clean

This will delete the fluid_sim executable and any temporary build files.

📌 Notes

The project was migrated from a sequential CPU version to a massively parallel GPU version.
Performance gain: runtime dropped from weeks to seconds, depending on the simulation size and GPU hardware.
CUDA compiler flags (-O3 -arch=sm_35 -use_fast_math -maxrregcount=32) can be adjusted for different GPU architectures to achieve optimal performance.
The project includes a Makefile that simplifies compilation and execution, including support for SLURM job submission.
