OutOfMemoryError when >= 200 entities & using GPU #17

Description

@gintasz

I am evaluating models that produce rankings from pairwise comparisons and wondered whether this one would suit my needs.
In my project, I would have fewer than 10,000 entities, any of which could be compared pairwise with any other.

I tried the current implementation in this repository and hit a memory allocation failure when the number of entities is larger (e.g. 200) and the GPU is used. I don't have the theoretical background to evaluate the algorithm. Is this problem a limitation of the implementation only, and therefore potentially resolvable, or is it inherent to the algorithm?
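For a sense of scale, the number of distinct pairs grows quadratically with the number of entities. The sketch below only counts pairs. The helper name `num_pairs` is made up for illustration and is not part of this repository, and it says nothing about how ASAP itself stores or processes pairs.

```python
def num_pairs(n: int) -> int:
    """Number of unordered pairs among n entities: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Entity counts relevant to this report: the failing case and the target scale.
for n in (200, 1_000, 10_000):
    print(f"N = {n:>6}: {num_pairs(n):>10} pairs")
```

At N = 200 there are 19,900 pairs; at N = 10,000 there are 49,995,000.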

import ranking_models.asap.asap_gpu as asap_gpu
import numpy as np
N = 200  # number of entities
# Random N x N pairwise comparison matrix, used only to reproduce the error
pwc_mat = np.random.randint(0, 100, size=(N, N))
pairs, scores_mean, scores_std = asap_gpu.ASAP(pwc_mat, mst_mode=True, cuda=True, get_scores = True)

print("Indices from pwc_mat to compare:")
print(pairs)
print("Scores means \n", scores_mean)
print("Scores standard deviation \n", scores_std)

Traceback (most recent call last):
  File "/root/test/test_asap_gpu.py", line 5, in <module>
    pairs, scores_mean, scores_std = asap_gpu.ASAP(pwc_mat, mst_mode=True, cuda=True, get_scores = True)
  File "/root/test/asap/asap_gpu.py", line 158, in ASAP
    G = torch.zeros(G0.size(0),I.size(0),G0.size(1)+1).to(G0)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 35.23 GiB (GPU 0; 23.65 GiB total capacity; 1.52 MiB already allocated; 23.14 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
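
The failing line allocates a dense three-dimensional tensor, so its footprint is the product of the three dimensions times the element size. The requested 35.23 GiB would correspond to roughly 9.5 billion elements if each is a 4-byte float32, which is an assumption about the dtype. Below is a minimal sketch for estimating such an allocation before attempting it. The helper `dense_tensor_gib` and the example shape are hypothetical; the traceback does not show the actual sizes of `G0` and `I`.

```python
import math

# Bytes per element for common floating-point dtypes
# (float32 is PyTorch's default dtype for torch.zeros).
BYTES_PER_ELEMENT = {"float16": 2, "float32": 4, "float64": 8}


def dense_tensor_gib(shape, dtype="float32"):
    """Memory in GiB that a dense tensor of the given shape would occupy."""
    return math.prod(shape) * BYTES_PER_ELEMENT[dtype] / 2**30


# Placeholder shape for illustration only; NOT the real G0 / I sizes.
shape = (1024, 1024, 256)
print(f"{dense_tensor_gib(shape):.2f} GiB as float32")
print(f"{dense_tensor_gib(shape, 'float64'):.2f} GiB as float64")
```

Comparing such an estimate against the free memory reported in the error message (23.14 GiB here) would show in advance whether the allocation can fit on the device.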
