Fix memory leak of h5py when accessing dataset with composite dtypes. #92

Merged: 1 commit, Mar 10, 2023

johannesschabbauer

I ran into a memory leak in a device Worker. It is caused by the properties.get function (for connection_table_properties), specifically by this line:

namecol_dtype = dataset['name'].dtype

Every time the dtype attribute of the "connection table" dataset is accessed this way, a few kB of memory are allocated and never freed. In the Worker process, memory usage increased with every shot (I called the function in get_final values 6 times for different child devices), which is a severe problem when tens of thousands of shots are run without restarts.

In my own setup I worked around the problem by not using properties.get. Some related issues have already been reported.

The memory leak seems to be caused by mixing fixed-length and variable-length columns in one dataset, and can be reproduced with the following code:

import h5py
import numpy as np

# Create the test file ("x" mode fails if test.h5 already exists)
f = h5py.File("test.h5", "x")
# Create a dataset that mixes a fixed-length and a variable-length string column
dtype = [("name", "S256"), ("string", h5py.special_dtype(vlen=str))]
data = np.array(10 * [("test", "test")], dtype=dtype)
f.create_dataset("test", data=data)

# Each iteration reads the "name" column; memory usage grows with every pass
for _ in range(10000):
    f["test"]["name"].dtype
    # f["test"]["name"][0] also causes a memory leak

f.close()
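
To put a number on the growth, one option is to compare the peak resident memory of the process before and after such a loop. The following is only a sketch using the standard-library resource module (Unix only); the helper names peak_rss and rss_growth are hypothetical, and ru_maxrss is reported in kB on Linux but in bytes on macOS.

```python
import resource


def peak_rss():
    """Peak resident set size of this process (kB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(func, iterations=10000):
    """Call func repeatedly and return how much the peak RSS grew meanwhile."""
    before = peak_rss()
    for _ in range(iterations):
        func()
    return peak_rss() - before


# Baseline: a function that allocates nothing should show little or no growth.
baseline = rss_growth(lambda: None, iterations=1000)
print("baseline growth:", baseline)
```

With the test file open, passing something like lambda: f["test"]["name"].dtype to rss_growth should report a clearly larger value than the baseline if the leak is present in the installed h5py version.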

With the changes in this PR I no longer see a memory leak. Note that accessing the data by column first and then by row (as in f["test"]["name"][0]) also leaks, so a similar problem may occur in other functions as well.
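
For illustration, here is a minimal sketch of one way to get the column dtype without reading the column: take it from the dataset's compound dtype instead of from the column data. That this matches the exact change in the PR is an assumption; the temporary file path and variable names are made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, only for this example
path = os.path.join(tempfile.mkdtemp(), "test.h5")

# Same mixed fixed-length / variable-length layout as in the reproduction
dtype = [("name", "S256"), ("string", h5py.special_dtype(vlen=str))]
data = np.array(10 * [("test", "test")], dtype=dtype)

with h5py.File(path, "w") as f:
    dataset = f.create_dataset("test", data=data)
    # Look up the field in the dataset's compound dtype.
    # Unlike dataset["name"].dtype, this does not read the column from the file.
    namecol_dtype = dataset.dtype["name"]

print(namecol_dtype)
```

The design idea is simply that the dtype is metadata of the dataset, so there is no need to materialise the "name" column as a NumPy array just to inspect its type.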

@dihm
Contributor

dihm commented Mar 10, 2023

Good find, Johannes! This is a rather annoying little feature. Since the change is small and entirely equivalent in function, I'm going to go ahead and merge.

@dihm dihm merged commit c38b661 into labscript-suite:master Mar 10, 2023
dihm added a commit that referenced this pull request Mar 28, 2023
Fix memory leak of h5py when accessing dataset with composite dtypes.