Description
As part of my research project, I am trying to use OpenKE to load the Wikidata truthy NT file and train embeddings on it. However, when I reach the following step in openke/config/Trainer.py:

```python
if self.use_gpu:
    self.model.cuda()
```

I get a CUDA out-of-memory error:
File "/home/myname/OpenKE/openke/config/Trainer.py", line 58, in run
self.model.cuda()
File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 749, in cuda
return self._apply(lambda t: t.cuda(device))
File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 664, in _apply
param_applied = fn(param)
File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 749, in <lambda>
return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1614.71 GiB (GPU 0; 47.54 GiB total capacity; 0 bytes already allocated; 47.17 GiB free; 0 bytes reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
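For context on the size of that failed allocation: a dense float32 embedding table needs num_rows × dim × 4 bytes, and at this scale that alone reaches the terabyte range. The row count and dimension below are illustrative guesses, not my actual configuration:

```python
# Back-of-the-envelope size of the embedding table that model.cuda() tries to
# move onto the GPU. The row count and dimension are assumptions for
# illustration, not values taken from the failing run.
def embedding_gib(num_rows: int, dim: int, bytes_per_float: int = 4) -> float:
    """Memory needed for a dense float32 embedding matrix, in GiB."""
    return num_rows * dim * bytes_per_float / 1024 ** 3

# e.g. ~2 billion distinct nodes with 200-dimensional embeddings is already
# ~1490 GiB for the entity table alone, the same order of magnitude as the
# 1614.71 GiB allocation reported in the traceback.
print(f"{embedding_gib(2_000_000_000, 200):.1f} GiB")
```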
My question is: has anyone tried training embeddings for huge datasets like Wikidata? Any pointers would be appreciated.
For the above dataset, there are:

- 7,794,277,662 triples in total (~7.8 billion)
- 6,235,422,129 training triples (80% split, ~6.2 billion)
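For reference, one common pattern when an embedding table does not fit in GPU memory is to keep the weights in host RAM and move only the rows needed for each batch to the GPU. The sketch below is plain PyTorch with made-up sizes, not OpenKE's Trainer, and is only meant to illustrate the idea:

```python
import torch
import torch.nn as nn

# Sketch only: keep the (potentially huge) entity table in CPU RAM and move
# just the rows of the current batch to the GPU. Sizes are small placeholders
# so the snippet actually runs; a real graph would have vastly more entities.
num_entities, dim = 100_000, 200
entity_emb = nn.Embedding(num_entities, dim, sparse=True)  # stays on the CPU

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_ids = torch.randint(0, num_entities, (1024,))   # entity ids for one batch
batch_vecs = entity_emb(batch_ids).to(device)          # only 1024 x dim floats hit the GPU
print(batch_vecs.shape, batch_vecs.device)
```

With sparse=True the gradients stay sparse as well, which usually means pairing the lookup with an optimizer such as torch.optim.SparseAdam.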