Releases: bsc-pm/tampi
TAMPI v4.0
Version 4.0, Fri Nov 15, 2024
The 4.0 release introduces TAMPI-OPT, a newly optimized version of the library that uses delegation techniques to ensure that only a single thread accesses the MPI interface. This approach yields performance comparable to the highest MPI performance of single-threaded scenarios. The release also includes several bug fixes along with usability and code improvements. Due to the new optimizations, some features have been dropped permanently, while others remain temporarily unsupported. For a detailed list, see the bullet points below.
General
- Major library optimizations by serializing communications
- Bump required MPI standard to 3.0 or later
- Add support to adjust the completion polling task period dynamically
- Improve constructors of collective operations
- Improve allocations of Operations through a scalable allocator
- Improve ALPI symbol handling
- Other bug fixes and code/performance improvements
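The delegation idea mentioned above (many threads hand their operations to one thread that alone touches the serialized interface) can be illustrated with a minimal sketch. This is not TAMPI's implementation: the `Delegator` class, its `submit` method, and the demo function are hypothetical names, and plain integer-returning callables stand in for MPI calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper: one delegate thread executes every submitted
// operation, so the wrapped interface is only ever used by that thread.
class Delegator {
public:
    Delegator() : worker_([this] { run(); }) {}

    ~Delegator() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Callable from any thread: enqueue the operation, get a future for its result.
    std::future<int> submit(std::function<int()> op) {
        std::packaged_task<int()> task(std::move(op));
        std::future<int> result = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(task));
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<int()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stop requested and queue drained
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();  // executed outside the lock, always on the delegate thread
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<int()>> queue_;
    bool stop_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

// Demo: four producer threads delegate work; returns true if every operation
// ran on the same (delegate) thread and the results add up as expected.
bool all_ops_ran_on_one_thread() {
    Delegator delegator;
    std::vector<std::thread::id> executed_on;  // written only by the delegate thread
    std::vector<std::future<int>> results(4);
    std::vector<std::thread> producers;
    for (int i = 0; i < 4; ++i) {
        producers.emplace_back([&, i] {
            results[i] = delegator.submit([&executed_on, i] {
                executed_on.push_back(std::this_thread::get_id());
                return i * 10;
            });
        });
    }
    for (auto &t : producers) t.join();
    int sum = 0;
    for (auto &r : results) sum += r.get();
    assert(executed_on.size() == 4);
    bool same = true;
    for (auto id : executed_on) same = same && (id == executed_on.front());
    return same && sum == 60;
}
```

The trade-off in a design like this is that callers pay a queueing cost per operation, while the serialized interface no longer needs to be thread-safe.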
Unsupported Features
- Temporarily dropped support for Fortran.
- Temporarily dropped support for request-based MPI operations: MPI_Wait, MPI_Waitall, TAMPI_Iwait, TAMPI_Iwaitall.
- Dropped support for MPI_Sendrecv and MPI_Sendrecv_replace. Every other point-to-point and collective operation is supported.
TAMPI v3.0.2
Version 3.0.2, Mon May 6, 2024
The 3.0.2 release introduces bug fixes and minor improvements.
General
- Use ovni_thread_require when instrumenting with ovni
- Require ovni 1.5.0 or greater
- Fix include and library flags of the detected MPI implementation
- Fix and improve testing scripts
TAMPI v3.0.1
Version 3.0.1, Thu Dec 7, 2023
The 3.0.1 release introduces bug fixes.
General
- Fix memory order in an atomic store
- Update README with information regarding the OpenMP-V runtime
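The note above does not say which atomic store was fixed or which memory order it now uses. As a generic illustration of why the order matters, the following sketch (all names hypothetical) publishes plain data through a flag: the release store pairs with the acquire load, so the consumer is guaranteed to see the payload written before the flag was set.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publishes `payload` from a producer thread and reads it from the caller.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload

    std::thread producer([&] {
        payload = 42;
        // Release: the write to payload above cannot be reordered after this store.
        ready.store(true, std::memory_order_release);
    });

    // Acquire: once the flag is observed as true, the payload write is visible.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;
    producer.join();
    assert(seen == 42);
    return seen;
}
```

With `std::memory_order_relaxed` on the store, the C++ memory model would no longer guarantee that the consumer observes the payload, even if the code appears to work on some hardware.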
TAMPI v3.0
Version 3.0, Fri Nov 17, 2023
The 3.0 release introduces the use of the generic ALPI tasking interface, bug fixes, and improved usability and programmability. This version also extends the ovni instrumentation to show more information regarding the TAMPI behavior in Paraver traces. This version is compatible with OmpSs-2 2023.11 or later.
General
- Rely on the ALPI tasking interface (OmpSs-2 2023.11 or later)
- Drop support for the Nanos6-specific tasking interface
- Drop support for OmpSs-2 versions older than 2023.11
- Remove deprecated TAMPI_POLLING_FREQUENCY environment variable
- Stop using PMPI interfaces for testing internal requests (e.g., PMPI_Test)
- Do not assume the default MPI threading level is MPI_THREAD_SINGLE
- Load first occurrence of the ALPI tasking interface symbols (RTLD_DEFAULT)
- Add opt-in mechanism to explicitly initialize TAMPI independently from MPI
- Add opt-in mechanism to disable task-awareness for specific threads
- Refactor and simplify symbol loading
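Loading the first occurrence of a symbol with RTLD_DEFAULT can be sketched with the POSIX `dlsym` call, as below. The helper names are hypothetical, and `getpid` is only a stand-in that is certain to exist in the process; the ALPI symbol names themselves are not listed in these notes. The sketch assumes a dynamically linked program on a system where `RTLD_DEFAULT` is exposed (g++ on glibc defines `_GNU_SOURCE` by default; older glibc versions may also need `-ldl`).

```cpp
#include <dlfcn.h>
#include <unistd.h>

#include <cassert>

// Hypothetical helper: returns the first definition of `name` found in the
// default (global) symbol search order, or nullptr if no object defines it.
void *find_first_symbol(const char *name) {
    dlerror();  // clear any stale error state
    return dlsym(RTLD_DEFAULT, name);
}

// Demo: resolve a symbol at run time and call it through a function pointer.
bool resolves_to_callable_getpid() {
    void *sym = find_first_symbol("getpid");
    if (sym == nullptr) return false;
    using getpid_fn = pid_t (*)();
    auto fn = reinterpret_cast<getpid_fn>(sym);
    return fn() == getpid();
}
```

Because the lookup follows the global search order, whichever loaded object defines the symbol first is the one that gets used.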
Instrumentation
- Instrument library subsystems with ovni; see the ovni documentation for more information
- Improve ovni library discovery
Building
- Add --enable-debug configure option replacing --enable-debug-mode
- Add --enable-asan option to enable address sanitizer flags
- Deprecate --enable-debug-mode option, which will be removed in a future version
Testing
- Improve testing scripts and Makefiles
- Fix CPU binding on SLURM-based tests
- Add testing option --skip-omp to skip the execution of OpenMP tests
TAMPI v2.0
Version 2.0, Fri May 26, 2023
The 2.0 release introduces several performance improvements, important bug fixes, and improved usability and programmability. Several environment variables that users can set to change default behavior have been updated. This version also introduces support for the ovni instrumentation to obtain Paraver execution traces.
General
- Introduce TAMPI_POLLING_PERIOD replacing TAMPI_POLLING_FREQUENCY
- Deprecate TAMPI_POLLING_FREQUENCY, which will be removed in a future version
- Drop support for OmpSs-2 2020.06; now requiring OmpSs-2 2020.11 or later
- Leverage the C++17 standard, which may require a newer GCC (such as GCC 7 or later)
- Extend README with a Frequently Asked Questions (FAQ) section
Performance
- Set default polling period (TAMPI_POLLING_PERIOD) to 100us, which can improve applications' performance
- Fix and improve implementation of custom spinlocks
- Remove use of std::function due to its dynamic memory allocations
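The notes do not say what replaced std::function. One common alternative, sketched here with hypothetical names, is to make the callable's type a template parameter: the closure is then stored by value inside the handler object, with no type erasure and therefore no hidden heap allocation, whereas std::function may allocate when the callable does not fit its internal buffer.

```cpp
#include <cassert>
#include <utility>

// Hypothetical handler that stores its callback inline as a member.
template <typename Callback>
class CompletionHandler {
public:
    explicit CompletionHandler(Callback cb) : cb_(std::move(cb)) {}

    // Invokes the stored callback directly; the call can be inlined.
    void complete(int value) { cb_(value); }

private:
    Callback cb_;
};

// Deduces the callback type so lambdas can be passed without naming their type.
template <typename Callback>
CompletionHandler<Callback> make_handler(Callback cb) {
    return CompletionHandler<Callback>(std::move(cb));
}

// Demo: accumulate completion values through a capturing lambda.
int sum_completions() {
    int total = 0;
    auto handler = make_handler([&total](int v) { total += v; });
    handler.complete(1);
    handler.complete(2);
    assert(total == 3);
    return total;
}
```

The cost of this approach is that each distinct callable produces a distinct handler type, so handlers with different callbacks cannot be stored in the same container without reintroducing some form of type erasure.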
Fixes and Code Improvements
- Reduce code duplication between C/C++ and Fortran support
- Fix Fortran interfaces
- Fix and improve testing infrastructure and test codes
- Compile all libraries with -fPIC
Instrumentation
- Add ovni instrumentation to generate Paraver traces for multi-node executions
- Enable ovni instrumentation when the TAMPI_INSTRUMENT=ovni environment variable is set
- Drop support for Nanos6-specific instrumentation; use ovni instead