This is pretty awesome, as-is. Thank you so much for this class.
However, I wonder whether it could be updated to use either cudaMalloc() or cudaMallocManaged(). In the cudaMallocManaged() case, the caller could also choose between the cudaMemAttachGlobal and cudaMemAttachHost flags.
I guess if it could handle non-managed allocation, then that would go against the name of the class, though.
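
To make the request concrete, here is a rough sketch of what I have in mind. It is not based on the actual class: the names `MemoryKind` and `cuda_allocator` are made up for illustration, and I am assuming a C++17 compiler. The only CUDA calls used are cudaMalloc(), cudaMallocManaged(), and cudaFree(). The `#else` branch holds host-only stand-ins so the sketch still compiles on a machine without the toolkit; they are not the real runtime.

```cpp
#include <cstddef>
#include <new>

#if __has_include(<cuda_runtime.h>)
#include <cuda_runtime.h>
#else
// Host-only stand-ins (NOT the real CUDA runtime): they wrap malloc/free
// so the sketch can be compiled and tried without the toolkit installed.
#include <cstdlib>
using cudaError_t = int;
constexpr cudaError_t cudaSuccess = 0;
constexpr unsigned int cudaMemAttachGlobal = 0x01;
constexpr unsigned int cudaMemAttachHost = 0x02;
inline cudaError_t cudaMalloc(void** p, std::size_t n) {
    *p = std::malloc(n ? n : 1);
    return *p ? cudaSuccess : 2;
}
inline cudaError_t cudaMallocManaged(void** p, std::size_t n, unsigned int) {
    return cudaMalloc(p, n);
}
inline cudaError_t cudaFree(void* p) { std::free(p); return cudaSuccess; }
#endif

// Hypothetical selector: which allocation call (and attach flag) to use.
enum class MemoryKind { Device, ManagedGlobal, ManagedHost };

// Hypothetical allocator; defaults to managed memory with cudaMemAttachGlobal.
template <class T, MemoryKind Kind = MemoryKind::ManagedGlobal>
struct cuda_allocator {
    using value_type = T;

    cuda_allocator() = default;
    template <class U>
    cuda_allocator(const cuda_allocator<U, Kind>&) noexcept {}

    // Explicit rebind is required because Kind is a non-type parameter.
    template <class U>
    struct rebind { using other = cuda_allocator<U, Kind>; };

    T* allocate(std::size_t n) {
        void* p = nullptr;
        cudaError_t err;
        if (Kind == MemoryKind::Device) {
            // Plain device memory: the pointer is not dereferenceable on the host.
            err = cudaMalloc(&p, n * sizeof(T));
        } else {
            const unsigned int flags = (Kind == MemoryKind::ManagedHost)
                                           ? cudaMemAttachHost
                                           : cudaMemAttachGlobal;
            err = cudaMallocManaged(&p, n * sizeof(T), flags);
        }
        if (err != cudaSuccess) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    // cudaFree releases memory from both cudaMalloc and cudaMallocManaged.
    void deallocate(T* p, std::size_t) noexcept { cudaFree(p); }
};

template <class T, class U, MemoryKind K>
bool operator==(const cuda_allocator<T, K>&, const cuda_allocator<U, K>&) { return true; }
template <class T, class U, MemoryKind K>
bool operator!=(const cuda_allocator<T, K>&, const cuda_allocator<U, K>&) { return false; }
```

With something like this, `std::vector<float, cuda_allocator<float, MemoryKind::ManagedHost>>` would pick the attach flag at compile time. The `MemoryKind::Device` case is the awkward one: a host container such as std::vector would try to construct elements through a device-only pointer, so that variant would only be safe for raw allocate()/deallocate() use, which probably supports keeping the class managed-only.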