Is multi-processing supported? #35

@semaphore-egg

Thank you for this amazing tool!

Feature Request

I have been dealing with memory problems related to the PyTorch DataLoader for several days, and I just tried memray with the simple script below. In live mode, information for the main process is reported, but the worker processes are all detected as threads and no information is reported for them.

from torch.utils.data import Dataset, DataLoader
import numpy as np
import torch

class DataIter(Dataset):
    def __init__(self):
        # Hold a large in-memory list so allocations are easy to observe.
        n = int(2.4e7)
        self.data = [x for x in range(n)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        data = self.data[idx]
        data = np.array([data], dtype=np.int64)
        return torch.tensor(data)


train_data = DataIter()
train_loader = DataLoader(train_data, batch_size=300,
                          shuffle=True,
                          drop_last=True,
                          pin_memory=False,
                          num_workers=12)  # 12 worker processes

for i, item in enumerate(train_loader):
    if i % 1000 == 0:
        print(i, end='\t', flush=True)

Screenshot of the main process:
[Screenshot from 2022-04-21 22-56-22]

Screenshot of a worker process:
[Screenshot from 2022-04-21 22-50-01]

The following command was used: `memray run --live simple_multi_worker.py`.

Is there a way to observe multi-processing information?
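For what it's worth, one possible approach is memray's `--follow-fork` option for `memray run`, which (as I understand it) continues tracking in child processes created via `fork()`; since DataLoader workers are forked by default on Linux, this might capture them. Note that `--follow-fork` writes a separate capture file per process and, to my knowledge, cannot be combined with `--live`. A sketch, assuming the flag is available in the installed memray version:

```shell
# Track fork()ed children as well (writes one capture file per process;
# not compatible with --live as far as I know).
memray run --follow-fork simple_multi_worker.py

# Each per-process capture file can then be inspected offline, e.g.:
# memray flamegraph <capture-file>.bin
```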

Metadata

Labels

question (Further information is requested)
