A Range-Limited Electrostatic N-Particle Simulation.
Usage: nParticleSim [OPTIONS]
Options:
--mode Select Mode {1,2,3}
--cutoff_radius Enter the cutoff radius (1e-10 m)
--input Enter the file path for csv (particles.csv)
--num_threads Enter number of threads in Mode 2 and 3 (for each process)
--leader Enter number of leader process in Mode 3
Examples:
./nParticleSim --mode=1 --cutoff_radius=45000 --input=../dataset/particles.csv
./nParticleSim --mode=2 --cutoff_radius=47500 --input=../dataset/particles.csv --num_threads=50
./nParticleSim --mode=3 --cutoff_radius=47500 --input=../dataset/particles.csv --num_threads=50 --leader=10
Mode 1 is entirely serial. There is no multithreading: the range-limited approximation is applied
directly to compute the signed scalar force sum on every particle.
Input parameters:
./nParticleSim --mode=1 --cutoff_radius={%d} --input=../dataset/particles.csv
In this implementation (Mode 2), I use the std::thread execution model to create multiple threads and divide the computation among them. The dataset is divided among the threads as evenly as possible. Each thread is assigned its portion when it is created, and it returns once that portion is finished.
Input parameters:
./nParticleSim --mode=2 --cutoff_radius={%d} --input=../dataset/particles.csv --num_threads={%d}
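The static split used in mode 2 can be sketched as below. This is only an illustration of the idea of fixing each thread's range at creation time; the helper name `parallelFor` and the callback signature are hypothetical and are not the project's actual code.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Run work(begin, end) on `numThreads` threads, each over one contiguous
// slice of [0, n). Slice sizes differ by at most one element.
void parallelFor(std::size_t n, unsigned numThreads,
                 const std::function<void(std::size_t, std::size_t)>& work) {
    if (numThreads == 0) numThreads = 1;
    const std::size_t base = n / numThreads;   // minimum slice size
    const std::size_t extra = n % numThreads;  // first `extra` slices get one more
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    std::size_t begin = 0;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t end = begin + base + (t < extra ? 1 : 0);
        // The range is fixed when the thread is created; the thread
        // processes it once and then returns.
        threads.emplace_back([&work, begin, end] { work(begin, end); });
        begin = end;
    }
    for (auto& th : threads) th.join();
}
```

A caller would pass a lambda that computes the force sums for particles `begin` to `end - 1` and writes each result to its own output slot. Since the slices do not overlap, no locking is needed for the outputs.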
--cutoff_radius=45000
In this implementation (Mode 3), leader processes are created using MPI, and each leader is given an equal partition of the dataset. Each leader creates a pool of worker threads (Pthreads/threads). The leader's partition is further split into smaller chunks, which are placed into a queue that all of its worker threads can access. A worker thread takes one chunk at a time, executes the necessary computation, and then returns to the queue for more work. Threads return only once the queue is empty and all work is done. The number of leader processes, the number of worker threads, and the cutoff radius are all input parameters for this mode.
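The leader/worker scheme above can be sketched without MPI itself, so that the example compiles on its own. `leaderRange` shows which slice a leader would own given its rank and the leader count (values that `MPI_Comm_rank` and `MPI_Comm_size` would supply in a real run), and `processWithQueue` shows one leader's worker pool draining a shared chunk queue. Both function names, the `Chunk` alias, and the chunk size parameter are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using Chunk = std::pair<std::size_t, std::size_t>;  // half-open range [begin, end)

// Slice of [0, n) owned by leader `rank` out of `numLeaders`
// (slice sizes differ by at most one).
Chunk leaderRange(std::size_t n, int rank, int numLeaders) {
    const std::size_t r = static_cast<std::size_t>(rank);
    const std::size_t base = n / numLeaders;
    const std::size_t extra = n % numLeaders;
    const std::size_t begin = r * base + std::min(r, extra);
    return {begin, begin + base + (r < extra ? 1 : 0)};
}

// Split [first, last) into chunks, queue them, and let `numWorkers`
// threads pull one chunk at a time until the queue is empty.
void processWithQueue(std::size_t first, std::size_t last, std::size_t chunkSize,
                      unsigned numWorkers,
                      const std::function<void(std::size_t, std::size_t)>& work) {
    if (chunkSize == 0) chunkSize = 1;
    if (numWorkers == 0) numWorkers = 1;
    std::queue<Chunk> chunks;
    for (std::size_t b = first; b < last; b += chunkSize)
        chunks.emplace(b, std::min(b + chunkSize, last));

    std::mutex queueMutex;
    auto worker = [&] {
        for (;;) {
            Chunk c;
            {
                // Hold the lock only while touching the queue.
                std::lock_guard<std::mutex> lock(queueMutex);
                if (chunks.empty()) return;  // no work left: thread exits
                c = chunks.front();
                chunks.pop();
            }
            work(c.first, c.second);  // compute outside the lock
        }
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < numWorkers; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

Compared with a fixed per-thread split, the queue lets a thread that finishes early pick up more chunks, which can help when some particles have many more neighbours inside the cutoff than others. Smaller chunks balance load better but increase contention on the queue lock.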
System: macOS Sonoma Version 14.3.1
Chip: Apple M3 Max
Memory: 48 GB
Cores: 16-core CPU and 40-core GPU (400 GB/s memory bandwidth)