Description:
As our library grows and tackles more complex ML tasks, leveraging GPU acceleration becomes vital to ensure optimal performance and resource utilization. The goal of this ticket is to discuss and track the implementation of NVIDIA CUDA support, enabling the library to run parallel computations on NVIDIA GPUs and thereby drastically reduce data-processing and training times for ML models.
Performance:
Implementing CUDA support should yield a noticeable performance improvement in model training and prediction, especially for computationally intensive tasks.
Tasks
- Implement a matrix multiplication feature that runs on the GPU (see the sketch below)
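As a starting point for discussion, here is a minimal sketch of what a naive CUDA matrix multiplication could look like. The names (`matmul_kernel`, the row-major layout, the 16x16 block size) are illustrative assumptions, not the library's actual API:

```cuda
// Naive CUDA matrix multiplication sketch: C = A * B, row-major layout.
// Illustrative only; the library's actual kernel/API may differ.
#include <cuda_runtime.h>

__global__ void matmul_kernel(const float* A, const float* B, float* C,
                              int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // row of C
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // column of C
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k)
            acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

int main() {
    const int M = 256, N = 256, K = 256;
    float *dA, *dB, *dC;
    cudaMalloc(&dA, M * K * sizeof(float));
    cudaMalloc(&dB, K * N * sizeof(float));
    cudaMalloc(&dC, M * N * sizeof(float));
    // ... copy host data into dA/dB with cudaMemcpy before launching ...

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (M + block.y - 1) / block.y);
    matmul_kernel<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaDeviceSynchronize();

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```

A production version would likely use shared-memory tiling or delegate to cuBLAS, but a naive kernel like this is useful as a correctness baseline.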
Testing:
Adequate testing frameworks need to be in place to ensure the stability and performance of the CUDA implementation across different GPUs and operating systems.
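One possible shape for such a test is to compare the GPU result against a CPU reference within a small tolerance. This is only a sketch; it assumes the hypothetical `matmul_kernel` from the sketch above is visible in the same translation unit (or linked in):

```cuda
// Hypothetical correctness check: GPU matmul vs. CPU reference.
// Assumes the matmul_kernel sketch above is available; names are illustrative.
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

__global__ void matmul_kernel(const float* A, const float* B, float* C,
                              int M, int N, int K);  // defined in the sketch above

bool matmul_matches_cpu(int M, int N, int K) {
    // Constant inputs make the expected result easy to verify: C[i][j] = 2 * K.
    std::vector<float> A(M * K, 1.0f), B(K * N, 2.0f), C(M * N, 0.0f);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, A.size() * sizeof(float));
    cudaMalloc(&dB, B.size() * sizeof(float));
    cudaMalloc(&dC, C.size() * sizeof(float));
    cudaMemcpy(dA, A.data(), A.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), B.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + 15) / 16, (M + 15) / 16);
    matmul_kernel<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaMemcpy(C.data(), dC, C.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);

    for (float v : C)
        if (std::fabs(v - 2.0f * K) > 1e-3f) return false;
    return true;
}
```

Checks like this could be wrapped in whatever test framework the library already uses and run in CI on each supported GPU/OS combination.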
Targeting CUDA/cuDNN Versions:
- CUDA Toolkit 11.8
- cuDNN 8.9.6