Training on Wikidata (huge dataset) using OpenKE #406

For a research project, I am trying to use OpenKE to train embeddings on the Wikidata [truthy NT file](https://dumps.wikimedia.org/wikidatawiki/entities/).

However, when I reach the following step in the `openke/config/Trainer.py` file:
```
        if self.use_gpu:
            self.model.cuda()
```
I get a CUDA out-of-memory error:

```
  File "/home/myname/OpenKE/openke/config/Trainer.py", line 58, in run
    self.model.cuda()
  File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 749, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 664, in _apply
    param_applied = fn(param)
  File "/var/scratch/anaconda3/envs/py3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 749, in <lambda>
    return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1614.71 GiB (GPU 0; 47.54 GiB total capacity; 0 bytes already allocated; 47.17 GiB free; 0 bytes reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
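For scale, the traceback shows the failure happening while a single parameter tensor is moved to the GPU (`param_applied = fn(param)`), so the 1614.71 GiB request is for one tensor, not the whole model. The following is only a back-of-the-envelope sketch of how a dense embedding table's size relates to its row count. The helper names, the float32 assumption (4 bytes per value), and the candidate embedding dimensions are illustrative assumptions; the actual dimension used in this run is not stated here.

```python
GIB = 2**30  # bytes per GiB


def embedding_table_gib(num_rows, dim, bytes_per_value=4):
    """Size in GiB of a dense num_rows x dim table (float32 assumed by default)."""
    return num_rows * dim * bytes_per_value / GIB


def rows_for_allocation(size_gib, dim, bytes_per_value=4):
    """Number of rows a dense table of the given size would hold (inverse of the above)."""
    return size_gib * GIB / (dim * bytes_per_value)


if __name__ == "__main__":
    failed_allocation_gib = 1614.71  # taken from the error message
    gpu_capacity_gib = 47.54         # taken from the error message

    # Hypothetical embedding dimensions, for illustration only.
    for dim in (50, 100, 200):
        implied = rows_for_allocation(failed_allocation_gib, dim)
        fits = rows_for_allocation(gpu_capacity_gib, dim)
        print(f"dim={dim}: allocation implies ~{implied:.3e} rows; "
              f"the GPU could hold at most ~{fits:.3e} rows")
```

Under these assumptions the requested tensor is roughly 34 times larger than the total memory of the GPU, so allocator settings such as `max_split_size_mb` would not be enough on their own.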

**My question is: has anyone trained embeddings on a dataset as large as Wikidata?**
Any pointers would be appreciated.


For the above dataset, the counts are:
- Total triples: 7794277662 (about 7.8 billion)
- Training triples (80%): 6235422129 (about 6.2 billion)
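As a sanity check on these numbers, here is a small sketch that reproduces the 80% split with integer arithmetic and gives a rough size for the training triples themselves. The function names are hypothetical, and the storage estimate assumes each triple is kept as three 8-byte integer ids, which may not match how OpenKE actually stores them.

```python
TOTAL_TRIPLES = 7_794_277_662  # total count from the post
TRAIN_PERCENT = 80             # training share from the post


def split_counts(total, train_percent):
    """Return (train, rest) using integer arithmetic to avoid float rounding."""
    train = total * train_percent // 100
    return train, total - train


def triple_ids_gib(num_triples, bytes_per_id=8):
    """Rough size in GiB of (head, relation, tail) ids, assuming 8-byte ids."""
    return num_triples * 3 * bytes_per_id / 2**30


if __name__ == "__main__":
    train, rest = split_counts(TOTAL_TRIPLES, TRAIN_PERCENT)
    print(f"train={train}, remaining={rest}")
    print(f"training triple ids: ~{triple_ids_gib(train):.1f} GiB")
```

Under that assumption, the id arrays for the training split alone would exceed the 47.54 GiB of GPU memory, independent of the embedding tables.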
 

