Final Project for CS 759

This code numerically approximates solutions to the 1D heat equation. It does so using the Forwards-Time Centered-Space (FTCS) method. I chose to use the FTCS method, since it is so conducive to the fine grain parallelism, or data level parallelism, that CUDA excels at.

This is because the FTCS method results in a recurrence relation which gives the temperature at each point in terms of temperatures at the previous time step. Therefore each CUDA thread can be assigned a specific point, and at each time step the threads can simultaneously calculate the temperatures at their points.

The Project Architecture

The simulation really consists of two parts. The first part is a CUDA program which actually numerically approximates the solution. This code, which is in heatEquation.hpp and main.cu, was written to run on UW-Madison's Euler cluster. The code which performs the numerical approximation runs on a NVIDIA GPU. After this code completes, the results are written to a file.

The second part is a Python script, animate.py, which is used to visualize the solution. The script uses matplotlib, and reads the data generated by the CUDA program to create an animation of the heat diffusing over time. The script was developed to be run on someone's local Linux or Mac machine using Python 2.7.

The last component of this project is the Python script plotTimmings.py. It uses matplotlib to graph the run time of the simulation versus the number of time steps for which the simulation performed calculations.

To make building the CUDA program easier, this project includes a Makefile written to be run on Euler. It also includes a SLURM job submission script, slurmscript, which was used to run the CUDA simulation on Euler.

The CUDA Program

The code which actually carry out the numerical approximation are in heatEquation.hpp. In order to numerically approximate the solution to a specific problem, the user of this code calls the function solveProblemInstance.

solveProblemInstance takes two structs and a file name. The first struct, of type HeatProblem1d, specifies the instance of the heat equation to be solved. That is it gives the length of the rod, the thermal diffusivity of the rod (referred to as alpha), the boundary conditions, and a functor to calculate the initial temperature at any point on the rod. The functor was inspired by Thrust, and the struct type is templated to allow for any functor (though it's operator() must take a float and return a float).

The second struct is of type SimulationParams1D, and tells the simulation functions how the numerical approximation should to be carried out. It contains the difference between points in time, the difference between points in space, and the number of points in time for which the simulation will perform calculations, excluding time t = 0. It also includes the number of time steps between lines of the output file.

To give an explicit example of how the time based simulation parameters interact, let's say that difference between points in time is 2, the number of points in time for which the simulation will perform calculations (excluding time = 0) is 86, and periodOfRecordings is 3. The program will perform calculations at times time = 0, time = 2, time = 4, ..., time = 172. The program would write temperatures to the output file for time = 0, time = 6, time = 12, ..., time = 168 (since the number of time intervals is not evenly divisible by 3, the temperatures at the last time step aren't written to the file).

I chose to pass this information in as structs, because I felt with so many parameters (9 to be exact) setting struct members was much more readable then long parameter lists. It also made it easy for me to pass the information on to helper functions.

I chose to have two structs since I thought one might want to solve the same problem with multiple different simulation parameters (e.g. time step sizes), and decoupling the problem from the simulation parameters would better facilitate this.

main.cu

main.cu currently does two things. It numerically approximates a problem which I found on page 143 of "The Mathematics of Diffusion" by J. Crank. Crank presents the numerical solution to this problem, as was obtained using the FTCS method, in table 8.1 on page 389 (though the book is from 1975 and does not include code). I used this problem as a quick "sanity check", since I could compare my results to those presented by Crank.

main.cu also currently runs some timing code. It finds a solution to the same problem, with an increasingly great number of time steps. It outputs the inclusive time taken each time, as well as the number of iterations, to stdout. When main.cu is run using slurmscript, this output ends up in a file called sbatch.out, which can then be used to generate the timing plot using plotTimmings.py.

Constraints On Inputs

In the following discussion I refer to UINT_MAX which is defined in , and is the largest value which can represented by an unsigned int.

The number of iterations must be strictly less than UINT_MAX.
The ceiling of the length of the rod divided by delta X must be strictly less than 1024.
The number of temperatures produced in the output of a simulation must be less than or equal to UINT_MAX + 1.
r, or (deltaT * alpha) / (deltaX^2), must be less than or equal to 1/2.

Areas for Improvement

In the function solveProblemInstanceDevice, I keep track of the temperatures at the previous time step in an array in global memory. Clearly this results in way too many trips to global memory, and the array should be in shared memory. I didn't use shared memory in the initial development to simplify things, and the course (as well as my access to Euler) ended before I got around to fixing it.

Also, it would be good to eliminate the constraint on the number of discrete points used in the simulation. This could be done by having some threads be assigned more than one point if the number of points is greater than 1024. Again, the course and my access to Euler ended before I got around to eliminating that restriction.

Building and Running

Generating the Data on Euler

Clone this repository on Euler.
Run make in the top level directory of this project (that is to say the the directory that this README is in).
Run mkdir simData
The generated executable is named generate_data. This is the CUDA program which will be run on Euler to generate the data.
Run ./generate_data either via a slurm script or in interactive mode
The generated data is now in a folder called simData
In order to copy this data to your local machine you will have to compress it.
Run zip -r simData simData
After the compression completes, there will be a file called simData.zip
Copy simData.zip and animate.py to your local machine. I trust you already have some way you usually copy files back and forth.

IMPORTANT: Even if you use sshfs or sftpnetdrive, you will still have to manually move the compressed file over to your local machine. You can drag and drop it if you'd like, you can use scp if you'd like, but simData.zip needs to be on your local machine's hard drive.

Running The Animation

Unfortunately the animation requires matplotlib, which can't be used on Euler at the moment. You will need to install matplotlib if you don't already have it.

Installing Matplotlib

The best way to install matplotlib is to install the Anaconda package manager.

Go to https://www.anaconda.com/download/
Download the Python 2.7 version of Anaconda. This comes with everything you need.

Running the Animation

cd into the directory with simData.zip and animate.py
Unzip simData.zip
animate.py takes a file name as its first argument. For its second argument you can optionally give it the time in milliseconds that the animation should pause between frames (the default value is 200). The minimum value for the second argument is 1.
For each file aFile in the unziped simData run python animate.py aFile. If you want to speed things up or slow them down you can run python animate.py aFile timeStep.

Name		Name	Last commit message	Last commit date
Latest commit History 75 Commits
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
animate.py		animate.py
cuda_stopwatch.hpp		cuda_stopwatch.hpp
errorMacros.hpp		errorMacros.hpp
heatEquation.hpp		heatEquation.hpp
main.cu		main.cu
notes.md		notes.md
plotTimmings.py		plotTimmings.py
slurmscript		slurmscript

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Final Project for CS 759

The Project Architecture

The CUDA Program

main.cu

Constraints On Inputs

Areas for Improvement

Building and Running

Generating the Data on Euler

Running The Animation

Installing Matplotlib

Running the Animation

About

Uh oh!

Releases

Packages

Uh oh!

Languages

devinsm/cs759Final

Folders and files

Latest commit

History

Repository files navigation

Final Project for CS 759

The Project Architecture

The CUDA Program

main.cu

Constraints On Inputs

Areas for Improvement

Building and Running

Generating the Data on Euler

Running The Animation

Installing Matplotlib

Running the Animation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages