
FAISS memory error with UMAP (WSL2) #4290

Open
@alexwilson1

Describe the bug

I have a dataset of 800k items (768-dimensional vectors). UMAP works on the full 800k dataset and on smaller, randomly sampled subsets of around 150k, but medium-sized subsets (~300k, ~350k, etc.) crash with this error:

Traceback (most recent call last):
  File "/opt/project/callbacks.py", line 744, in on_click
    umap_data_3D = umap_for_clustering.fit_transform(embedding_matrix)
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cuml/internals/api_decorators.py", line 549, in inner_set_get
    ret_val = func(*args, **kwargs)
  File "cuml/manifold/umap.pyx", line 659, in cuml.manifold.umap.UMAP.fit_transform
    
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cuml/internals/api_decorators.py", line 409, in inner_with_setters
    return func(*args, **kwargs)
  File "cuml/manifold/umap.pyx", line 600, in cuml.manifold.umap.UMAP.fit
    
RuntimeError: Error in virtual void faiss::gpu::StandardGpuResourcesImpl::initializeForDevice(int) at /home/conda/feedstock_root/build_artifacts/faiss-split_1618468126454/work/faiss/gpu/StandardGpuResources.cpp:285: Error: 'err == cudaSuccess' failed: failed to cudaHostAlloc 268435456 bytes for CPU <-> GPU async copy buffer (error 2 out of memory)
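
For scale, here is a rough sketch of the sizes involved. The byte count is copied from the RuntimeError above; the float32 dtype is an assumption, since the dtype of the embeddings is not stated here. Note that `cudaHostAlloc` allocates page-locked (pinned) host memory, so the failing request appears to be a 256 MiB host-side buffer rather than GPU memory.

```python
# Byte count copied from the RuntimeError above.
FAILED_ALLOC_BYTES = 268435456

# Assumption: embeddings are float32 (4 bytes per value); the dtype is not stated.
BYTES_PER_VALUE = 4
DIMENSIONS = 768


def matrix_mib(n_rows, dims=DIMENSIONS, bytes_per_value=BYTES_PER_VALUE):
    """Size of a dense n_rows x dims matrix in MiB."""
    return n_rows * dims * bytes_per_value / 2**20


print(f"failed allocation: {FAILED_ALLOC_BYTES / 2**20:.0f} MiB")
for n_rows in (150_000, 300_000, 350_000, 800_000):
    print(f"{n_rows:>7} rows: {matrix_mib(n_rows):8.1f} MiB")
```

Under that assumption the input matrix itself is far larger than the buffer that fails to allocate, which suggests the error is not simply a function of input size.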

I'm using a Titan RTX GPU with 24 GB of memory, and nvidia-smi shows more than enough free GPU memory for this operation before fit_transform is called:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.10       Driver Version: 510.10       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA TITAN RTX   WDDM  | 00000000:01:00.0 Off |                  N/A |
| 41%   29C    P8    10W / 280W |   3678MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A         4      C   Insufficient Permissions        N/A      |
+-----------------------------------------------------------------------------+

The UMAP model is created with the parameters (n_components=3, n_neighbors=15, min_dist=0.0) and applied with fit_transform.
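
A minimal reproduction sketch of the setup described above, assuming the embeddings are held in a NumPy-style array. The helper names `sample_row_indices` and `fit_umap_3d` are hypothetical and only for illustration; the UMAP parameters and the `umap_for_clustering` name are the ones quoted in this report.

```python
import random


def sample_row_indices(n_total, n_sample, seed=0):
    """Pick n_sample distinct row indices from range(n_total), sorted ascending."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n_sample))


def fit_umap_3d(embedding_matrix):
    """Fit cuML UMAP with the parameters quoted above and return the 3D embedding."""
    # Imported here so the sampling helper can be used without a GPU environment.
    from cuml.manifold import UMAP

    umap_for_clustering = UMAP(n_components=3, n_neighbors=15, min_dist=0.0)
    return umap_for_clustering.fit_transform(embedding_matrix)
```

With an `embedding_matrix` of shape (800000, 768), a medium-sized run would look like `fit_umap_3d(embedding_matrix[sample_row_indices(800_000, 300_000)])`.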

The environment is the rapidsai/rapidsai:21.10-cuda11.2-base-ubuntu18.04-py3.8 image with torch==1.9.1+cu111 installed on top.

Any idea why this works for the full dataset but not for intermediate-sized subsets?
