#### Code Sample, a copy-pastable example if possible
```python
import gc
import os

import numpy as np
import pandas as pd
import psutil


def get_process_memory():
    # Resident set size (RSS) of the current process, in MB.
    return round(psutil.Process(os.getpid()).memory_info().rss / float(2 ** 20), 2)


test_dict = {}
for i in range(0, 50):
    test_dict[i] = np.empty(10)

dfs = []
for i in range(0, 1000):
    df = pd.DataFrame(test_dict)
    dfs.append(df)

gc.collect()
# RSS before the memory_usage calls
print('memory usage (before "memory_usage"):\t{} MB'.format(get_process_memory()))

for df in dfs:
    df.memory_usage(index=True, deep=True)

gc.collect()
# RSS after the memory_usage calls
print('memory usage (after "memory_usage"):\t{} MB'.format(get_process_memory()))
```
#### Problem description
`DataFrame.memory_usage(index=True, deep=True)` appears to leak memory: after calling it once on every frame in `dfs`, the process RSS stays well above the baseline, even after an explicit `gc.collect()`. Memory usage after the `memory_usage` calls should be the same as before.
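One caveat with the reproduction above is that RSS can stay high simply because the allocator does not return freed pages to the OS. A snapshot diff with `tracemalloc` tracks live Python-level allocations directly and can localize where the growth is attributed. The following is a minimal diagnostic sketch, not part of the original measurement: it assumes the `dfs` list from the reproduction above and Python 3.4+ (`tracemalloc` is standard library there; on the Python 2.7 build shown below, the `pytracemalloc` backport would be needed).

```python
import tracemalloc

tracemalloc.start()
snapshot_before = tracemalloc.take_snapshot()

# Same calls as in the reproduction above.
for df in dfs:
    df.memory_usage(index=True, deep=True)

snapshot_after = tracemalloc.take_snapshot()

# Print the top ten allocation sites that grew between the snapshots.
for stat in snapshot_after.compare_to(snapshot_before, 'lineno')[:10]:
    print(stat)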
#### Expected Output
None
#### Output of `pd.show_versions()`

<details>

INSTALLED VERSIONS
commit: None
python: 2.7.16.final.0
python-bits: 64
OS: Darwin
OS-release: 19.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: None.None
pandas: 0.24.2
pytest: None
pip: 19.3.1
setuptools: 19.6.1
Cython: 0.29.13
numpy: 1.16.5
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.1
pytz: 2019.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
```

</details>