Releases: ProjectPhysX/FluidX3D
FluidX3D v3.2 (fast force/torque summation)
Thank you for using FluidX3D! Update v3.2 brings the much requested GPU-accelerated force/torque summation:
Improvements
- implemented GPU-accelerated force/torque summation (~20x faster than CPU-multithreaded implementation before)
- simplified calculating object force/torque in setups; before:
    lbm.voxelize_mesh_on_device(mesh, TYPE_S|TYPE_X);
    const float3 lbm_com = lbm.calculate_object_center_of_mass(TYPE_S|TYPE_X);
    // ...
    lbm.calculate_force_on_boundaries();
    lbm.F.read_from_device(); // having to copy entire lbm.F from GPU VRAM to CPU RAM was slow!!
    const float3 lbm_force = lbm.calculate_force_on_object(TYPE_S|TYPE_X); // slow CPU-multithreaded summation
    const float3 lbm_torque = lbm.calculate_torque_on_object(lbm_com, TYPE_S|TYPE_X); // slow CPU-multithreaded summation
  now:
    lbm.voxelize_mesh_on_device(mesh, TYPE_S|TYPE_X);
    const float3 lbm_com = lbm.object_center_of_mass(TYPE_S|TYPE_X);
    // ...
    const float3 lbm_force = lbm.object_force(TYPE_S|TYPE_X); // fast GPU-accelerated summation, copy only result to CPU
    const float3 lbm_torque = lbm.object_torque(lbm_com, TYPE_S|TYPE_X); // fast GPU-accelerated summation, copy only result to CPU
- improved coloring in VIS_FIELD/ray_grid_traverse_sum()
- updated OpenCL-Wrapper now compiles OpenCL C code with -cl-std=CL3.0 if available
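For orientation, the quantity being summed can be sketched on the CPU as below. This is only an illustration, not FluidX3D's GPU kernel: `Vec3`, `Cell`, `ForceTorque` and `sum_force_and_torque()` are hypothetical names, cells are assumed to belong to the object when their flags match exactly, and the torque is assumed to be the sum of (r - r_com) x F over those cells.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal 3D vector type (FluidX3D ships its own float3).
struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Hypothetical per-cell record: position, boundary force and flag bits.
struct Cell { Vec3 position; Vec3 force; std::uint8_t flags; };
struct ForceTorque { Vec3 force; Vec3 torque; };

// Sum force and torque (about `com`) over all cells whose flags equal `object_flags`.
// Assumption: exact flag match selects the object's cells.
inline ForceTorque sum_force_and_torque(const std::vector<Cell>& cells, Vec3 com, std::uint8_t object_flags) {
    ForceTorque result{{0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 0.0f}};
    for (const Cell& c : cells) {
        if (c.flags != object_flags) continue; // skip cells that are not part of the object
        result.force = result.force + c.force;
        result.torque = result.torque + cross(c.position - com, c.force);
    }
    return result;
}
```

Only the two resulting vectors need to leave the device in such a scheme, which is the point of the change described above: the full per-cell force field no longer has to be copied to CPU RAM.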
Bug fixes
- fixed compiling on macOS with new OpenCL headers
Have fun with the software!
-- Moritz
Here is a showcase of the improved coloring in VIS_FIELD/ray_grid_traverse_sum():
FluidX3D v3.1 (more bug fixes)
Thank you for using FluidX3D! Update v3.1 brings two critical bug fixes/workarounds and various small improvements under the hood:
Improvements
- faster enqueueReadBuffer() on modern CPUs with 64-Byte-aligned host_buffer
- hardened ray intersection functions against planar ray edge case
- updated OpenCL headers
- better OpenCL device specs detection using vendor ID and Nvidia compute capability
- better VRAM capacity reporting correction for Intel dGPUs
- improved styling of performance mermaid gantt chart in Readme
- added multi-GPU performance mermaid gantt chart in Readme
- updated driver install guides
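The release note does not show how the 64-Byte alignment of the host buffer is obtained. One standard C++17 way to get such a buffer is the aligned form of `operator new`; the helper names below are hypothetical and not taken from the OpenCL-Wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Allocate `bytes` bytes starting on a 64-byte boundary (C++17 aligned operator new).
inline void* allocate_aligned_64(std::size_t bytes) {
    assert(bytes > 0);
    return ::operator new(bytes, std::align_val_t(64));
}

// Memory from allocate_aligned_64() must be released with the matching aligned delete.
inline void free_aligned_64(void* pointer) {
    ::operator delete(pointer, std::align_val_t(64));
}

// Check whether an address is a multiple of 64 bytes.
inline bool is_aligned_64(const void* pointer) {
    return reinterpret_cast<std::uintptr_t>(pointer) % 64u == 0u;
}
```

A buffer allocated this way could then be handed to the OpenCL read call as the host pointer.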
Bug fixes
- fixed voxelization being broken on some GPUs
- added workaround for compiler bug in Intel CPU Runtime for OpenCL that causes Q-criterion isosurface rendering corruption
- fixed TFlops estimate for Intel Battlemage GPUs
- fixed wrong device name reporting for AMD GPUs (unlike every sane GPU vendor they don't report device name as CL_DEVICE_NAME but need CL_DEVICE_BOARD_NAME_AMD extension instead)
Have fun with the software!
-- Moritz
FluidX3D v3.0 (larger CPU/iGPU simulations)
A little gift to you all: FluidX3D v3.0 enables 31% larger grid resolution when running on CPUs or iGPUs!
Improvements
- reduced memory footprint on CPUs and iGPUs from 72 to 55 Bytes/cell (fused OpenCL host+device buffers for rho/u/flags), allowing 31% higher resolution in the same RAM capacity
- faster hardware-supported and faster fallback emulation atomic floating-point addition for PARTICLES extension
- hardened calculate_f_eq() against bad user input for D2Q9
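The 31% figure follows directly from the two footprints: 72/55 ≈ 1.309, so the same RAM holds about 31% more grid cells. A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Number of grid cells that fit into `memory_bytes` at a given footprint per cell.
inline std::uint64_t max_cells(std::uint64_t memory_bytes, std::uint64_t bytes_per_cell) {
    assert(bytes_per_cell > 0);
    return memory_bytes / bytes_per_cell;
}

// Relative gain in cell count when the per-cell footprint shrinks.
inline double cell_count_gain(double old_bytes_per_cell, double new_bytes_per_cell) {
    return old_bytes_per_cell / new_bytes_per_cell - 1.0;
}
```

Calling `cell_count_gain(72.0, 55.0)` gives roughly 0.309. If "resolution" is read per axis of a cubic 3D domain instead of as total cell count, the gain is the cube root of 1.309, about 9%.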
Bug fixes
- fixed velocity voxelization for overlapping geometry with different velocity
- fixed Remaining Time printout during paused simulation
- fixed CPU/GPU memory printout for CPU/iGPU simulations
- fixed bug that default_filename() would fail if there was a . in the file path
Have fun with the software!
-- Moritz
PS: Here's a little demo of what FluidX3D v3.0 is capable of:
FluidX3D v2.19 (camera splines)
Thank you for using FluidX3D! Update v2.19 adds Catmull-Rom splines for smooth camera movement, and bug fixes:
Improvements
- the camera can now fly along a smooth path through a list of provided keyframe camera placements, using Catmull-Rom splines
- more accurate remaining runtime estimation that includes time spent on rendering
- enabled FP16S memory compression by default
- printed camera placement using key G is now formatted for easier copy/paste
- added benchmark chart in Readme using mermaid gantt chart
- placed memory allocation info during simulation startup at better location
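The release note does not say which Catmull-Rom variant is used. As a minimal sketch, assuming the uniform variant, a segment between keyframes p1 and p2 (with neighbors p0 and p3) can be evaluated as below; `Vec3` and `catmull_rom()` are hypothetical names, and a real camera path would interpolate rotation and zoom as well.

```cpp
#include <cassert>

// Hypothetical minimal 3D vector type for camera positions.
struct Vec3 { float x, y, z; };

// Uniform Catmull-Rom interpolation between p1 (t=0) and p2 (t=1).
inline float catmull_rom(float p0, float p1, float p2, float p3, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    const float t2 = t * t;
    const float t3 = t2 * t;
    return 0.5f * (2.0f * p1
        + (-p0 + p2) * t
        + (2.0f * p0 - 5.0f * p1 + 4.0f * p2 - p3) * t2
        + (-p0 + 3.0f * p1 - 3.0f * p2 + p3) * t3);
}

// Component-wise version for 3D keyframe positions.
inline Vec3 catmull_rom(const Vec3& p0, const Vec3& p1, const Vec3& p2, const Vec3& p3, float t) {
    return {
        catmull_rom(p0.x, p1.x, p2.x, p3.x, t),
        catmull_rom(p0.y, p1.y, p2.y, p3.y, t),
        catmull_rom(p0.z, p1.z, p2.z, p3.z, t)
    };
}
```

The curve passes exactly through p1 at t=0 and through p2 at t=1, which is why this spline family suits a list of keyframes the camera has to visit.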
Bug fixes
- fixed threading conflict between INTERACTIVE_GRAPHICS and lbm.graphics.write_frame();
- fixed maximum buffer allocation size limit for AMD GPUs and in Intel CPU Runtime for OpenCL
- fixed wrong Re<Re_max info printout for 2D simulations
- minor fix in bandwidth_bytes_per_cell_device()
Have fun with the software!
-- Moritz
FluidX3D v2.18 (more bug fixes)
Thank you for using FluidX3D! Update v2.18 brings support for high refresh rate monitors on Linux and bug fixes:
Improvements
- added support for high refresh rate monitors on Linux
- more compact OpenCL Runtime installation scripts in Documentation
- driver/runtime installation instructions will now be printed to console if no OpenCL devices are available
- added domain information to LBM::write_status()
- added LBM::index function for uint3 input parameter
Bug fixes
- fixed that very large simulations sometimes wouldn't render properly by increasing maximum render distance from 10k to 2.1M
- fixed mouse input stuttering at high screen refresh rate on Linux
- fixed graphical artifacts in free surface raytracing on Intel CPU Runtime for OpenCL
- fixed runtime estimation printed in console for setups with multiple lbm.run(...) calls
- fixed density oscillations in sample setups (too large lbm_u)
- fixed minor graphical artifacts in raytrace_phi()
- fixed minor graphical artifacts in ray_grid_traverse_sum()
- fixed wrong printed time step count on raindrop sample setup
Have fun with the software!
-- Moritz
FluidX3D v2.17 (unlimited domain resolution)
Thank you for using FluidX3D! Update v2.17 removes the limit of 2³² cells per domain and adds new field visualization:
Improvements
- for GPUs/CPUs with >225 GB memory: domains are no longer limited to 4.29 billion (2³², 1624³) grid cells; if more are used, the OpenCL code will automatically compile with 64-bit indexing
- new, faster raytracing-based field visualization for single-GPU simulations (thanks @Snektron for the idea!)
- added GPU Driver and OpenCL Runtime installation instructions to documentation
- refactored INTERACTIVE_GRAPHICS_ASCII
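The switch to 64-bit indexing described above comes down to one comparison on the cell count. A minimal sketch with hypothetical function names; how FluidX3D passes this decision to the OpenCL compiler is not shown in the release note.

```cpp
#include <cassert>
#include <cstdint>

// Total number of grid cells, computed in 64-bit to avoid overflowing 32-bit arithmetic.
// Assumption: the product itself stays below 2^64.
inline std::uint64_t cell_count(std::uint32_t Nx, std::uint32_t Ny, std::uint32_t Nz) {
    assert(Nx > 0 && Ny > 0 && Nz > 0);
    return static_cast<std::uint64_t>(Nx) * Ny * Nz;
}

// A 32-bit index addresses cells 0 .. 2^32-1, so more than 2^32 cells need 64-bit indexing.
inline bool needs_64bit_indexing(std::uint32_t Nx, std::uint32_t Ny, std::uint32_t Nz) {
    return cell_count(Nx, Ny, Nz) > 4294967296ull;
}
```

For example, a 1624³ domain still fits 32-bit indexing, while 2048 x 2048 x 1025 cells does not.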
Bug fixes
- fixed memory leak in destructors of floatN, floatNxN, doubleN, doubleNxN (all unused)
- made camera movement/rotation/zoom behavior independent of framerate
- fixed that smart_device_selection() would print a wrong warning if device reports 0 MHz clock speed
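Framerate-independent camera behavior usually means scaling each per-frame change by the elapsed time instead of applying a fixed step per frame. The sketch below shows that general technique in one dimension; `Camera` and `advance()` are hypothetical and not FluidX3D's camera code.

```cpp
#include <cassert>

// Hypothetical 1D camera: position in length units, speed in length units per second.
struct Camera { float position; float speed; };

// Advance the camera by the time elapsed since the last frame, so that
// the distance covered per second does not depend on the frame count.
inline void advance(Camera& camera, float dt_seconds) {
    assert(dt_seconds >= 0.0f);
    camera.position += camera.speed * dt_seconds;
}
```

One frame of 1 s and two frames of 0.5 s each then move the camera by the same distance.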
Have fun with the software!
-- Moritz
FluidX3D v2.16 (bug fixes)
I'm doing my part! With the v2.16 update I've put down all remaining known bugs for good. 🖖😎❌🪳❌
WOULD YOU LIKE TO KNOW MORE?
Bug fixes in this release:
- fixed that voxelization failed in Intel OpenCL CPU Runtime due to array out-of-bounds access
- fixed that voxelization did not always produce binary identical results in multi-GPU compared to single-GPU
- fixed that velocity voxelization failed for free surface simulations
- fixed terrible performance on ARM GPUs by macro-replacing fused-multiply-add (fma) with a*b+c
- fixed that Y/Z keys were incorrect for QWERTY keyboard layout in Linux
- fixed that free camera movement speed in help overlay was not updated in stationary image when scrolling
- fixed that cursor would sometimes flicker when scrolling on trackpads with Linux-X11 interactive graphics
- fixed flickering of interactive rendering with multi-GPU when camera is not moved
- fixed missing XInitThreads() call that could crash Linux interactive graphics on some systems
- fixed z-fighting between graphics_rasterize_phi() and graphics_flags_mc() kernels
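The fma workaround above is a macro replacement in the OpenCL C code. One way to apply such a macro from the C++ host side is to prepend it to the kernel source string before compilation. This is a sketch of that idea only: `with_fma_workaround()` is a hypothetical helper, and where FluidX3D actually places the macro is not stated in the release note.

```cpp
#include <cassert>
#include <string>

// Prepend a macro to OpenCL C source so that every fma(a, b, c) call
// expands to a plain multiply followed by an add.
inline std::string with_fma_workaround(const std::string& kernel_source) {
    const std::string define = "#define fma(a, b, c) ((a)*(b)+(c))\n";
    return define + kernel_source;
}
```

The patched string would then be passed to the OpenCL compiler in place of the original source, ideally only on devices that need the workaround.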
Other improvements:
- simplified, 10% faster marching-cubes implementation with 1D interpolation on edges instead of 3D interpolation, which makes the edge table unnecessary
- added faster, simplified marching-cubes variant for solid surface rendering where edges are always halfway between grid cells
- refactoring in OpenCL rendering kernels
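The 1D interpolation on edges mentioned above can be pictured as finding where the isovalue is crossed along a single cell edge. A minimal sketch using standard linear interpolation; the function names are hypothetical and the actual kernels may differ.

```cpp
#include <cassert>

// Fraction in [0, 1] along an edge from corner value v0 to corner value v1
// at which the field crosses `iso`. Assumes v0 != v1 and iso between them.
inline float edge_crossing(float v0, float v1, float iso) {
    assert(v0 != v1);
    return (iso - v0) / (v1 - v0);
}

// Variant for solid surfaces, where the crossing is always halfway along the edge.
inline float edge_crossing_solid() {
    return 0.5f;
}
```

The vertex position on the edge is then the first corner plus this fraction times the edge vector.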
With GitHub I can track every bug from the day it was discovered/fixed back to the day it was first introduced. This allows me to graph the number of open bugs over time, along with a curve weighted by their individual severity (minor 0.25, low 0.5, medium 1.0, high 2.0, showstopper 4.0):
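The severity-weighted curve is a weighted sum over the bugs open on a given day. A small sketch using the weights listed above; the enum and function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

enum class Severity { minor, low, medium, high, showstopper };

// Weights as listed in the release note.
inline float severity_weight(Severity severity) {
    switch (severity) {
        case Severity::minor:       return 0.25f;
        case Severity::low:         return 0.5f;
        case Severity::medium:      return 1.0f;
        case Severity::high:        return 2.0f;
        case Severity::showstopper: return 4.0f;
    }
    assert(false && "unknown severity");
    return 0.0f;
}

// Severity-weighted count of the bugs open at one point in time.
inline float weighted_open_bugs(const std::vector<Severity>& open_bugs) {
    float sum = 0.0f;
    for (const Severity severity : open_bugs) sum += severity_weight(severity);
    return sum;
}
```

One open bug of each severity, for instance, counts as 7.75 on the weighted curve instead of 5 on the plain count.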
Here is the distribution of days open, days till discovery and days till fix. I fixed 56% of bugs on the day of discovery. Notice the bimodal distribution of days open - a clear separation into "easy" and "nasty" bugs.
Lessons learned:
- Since release there were 63 bugs in FluidX3D in total, with at most 41 open bugs at one time. 🖖😱 Now there are 0, at least until I find more. 🖖😎 For reference: FluidX3D is 12.1k lines of code.
- Most bugs were a byproduct of big feature updates, like v2.0 (multi-GPU) and v2.1/v2.2 (voxelization). Of course at the time of introduction I didn't know that bugs slipped through, and I (or users) only discovered them later.
- Only 17% of bugs were found by users, all the others I found myself with rigorous testing. It takes continuous poking around in the code to find these often super rare bugs.
- 30% of bugs were actually bugs in the compiler, driver or operating system that needed a workaround on application side.
- The latest v2.16 release is the best FluidX3D has ever been. The worst, most bugged version by this metric is v2.2. 🖖🤠
Have fun with the software!
-- Moritz
PS: Here's an amusing FluidX3D video from @SLGY, he's doing his part too!
FluidX3D v2.15 (framerate boost)
Thank you for using FluidX3D! Update v2.15 boosts framerate in interactive graphics by 20-70%:
- eliminated one frame memory copy and one clear frame operation in rendering chain
- enabled g++ compiler optimizations for faster startup and higher rendering framerate
Bug fixes:
- fixed bug in multithreaded sanity checks
- fixed wrong unit conversion for thermal expansion coefficient
- fixed density to pressure conversion in LBM units
- fixed bug that raytracing kernel could lock up simulation
- fixed minor visual artifacts with raytracing
- fixed that console sometimes was not cleared before INTERACTIVE_GRAPHICS_ASCII rendering started
Have fun with the software!
-- Moritz
FluidX3D v2.14 (visualization upgrade)
Thank you for using FluidX3D! Update v2.14 brings an upgrade to visualization kernels and eases compiling:
- coloring can now be switched between velocity/density/temperature with key Z
- uniform improved color palettes for velocity/density/temperature visualization
- color scale with automatic unit conversion can now be shown with key H
- slice mode for field visualization now draws fully filled-in slices instead of only lines for velocity vectors
- shading in VIS_FLAG_SURFACE and VIS_PHI_RASTERIZE modes is smoother now
- make.sh now automatically detects operating system and X11 support on Linux and only runs FluidX3D if last compilation was successful
Bug fixes:
- fixed compiler warnings on Android
- fixed make.sh failing on some systems due to nonstandard interpreter path
- fixed that make would not compile with multiple cores on some systems
Here is a YouTube video (some screen recordings) to showcase the update, all real-time simulations on an Intel Arc A750:
Have fun with the software!
-- Moritz
FluidX3D v2.13 (improved .vtk export)
Thank you for using FluidX3D! Update v2.13 improves .vtk export:
- data in exported .vtk files is now automatically converted to SI units
- ~2x faster .vtk export with multithreading
- added unit conversion functions for TEMPERATURE extension
Bug fixes:
- fixed graphical artifacts with axis-aligned camera in raytracing
- fixed get_exe_path() for macOS
- fixed X11 multi-monitor issues on Linux
- workaround for Nvidia driver bug: enqueueFillBuffer is broken for large buffers on Nvidia GPUs
- fixed slow numeric drift issues caused by -cl-fast-relaxed-math
- fixed wrong Maximum Allocation Size reporting in LBM::write_status()
- fixed missing scaling of coordinates to SI units in LBM::write_mesh_to_vtk()
Have fun with the software!
-- Moritz