What language are you using?
Python
What version are you using?
0.3.0
What database are you using?
Postgresql
What dataframe are you using?
Pandas
Can you describe your bug?
I've got a Python script running in Docker that loads data from SQL into a pandas DataFrame; depending on the volume, it can load between 15 GB and 60 GB of data into memory.
This issue is not related to Docker memory limits: the script is monitored and fails well below the container's memory limit.
The issue is hard to diagnose. It mostly fails silently; I need to open dmesg to see that a segfault happened.
It seems to happen when the data has finished downloading from the database.
The issue is Docker-specific: when I run the script on my dev machine (without Docker), everything works fine.
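For reference, this is roughly how we watch the script's memory from inside the process (a minimal sketch, assuming psutil is available; our real monitoring runs outside the container, and log_rss is a hypothetical helper name):

import os
import threading
import time

import psutil

def log_rss(interval_s=5.0):
    # Periodically print this process's resident set size in GiB.
    proc = psutil.Process(os.getpid())
    while True:
        rss_gib = proc.memory_info().rss / 1024 ** 3
        print(f"[mem] rss={rss_gib:.1f} GiB", flush=True)
        time.sleep(interval_s)

# Daemon thread so it exits together with the main script.
threading.Thread(target=log_rss, daemon=True).start()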
There seem to be two cases:
First case: my Python script uses 15 GB of memory.
When the data transfer is finished, the Python script fails silently and triggers a segfault:
[17623143.600654] python[696237]: segfault at 0 ip 00007f746595680a sp 00007ffc7119fe60 error 6 in connectorx.cpython-38-x86_64-linux-gnu.so[7f746555d000+201c000]
[17623143.606349] Code: 41 56 53 48 83 ec 18 49 89 fe 66 48 8d 3d 9e df 68 02 66 66 48 e8 e6 6e c0 ff 48 83 38 00 74 18 48 83 c0 08 48 83 38 00 74 2e <49> 83 06 ff 74 77 48 83 c4 18 5b 41 5e c3 66 48 8d 3d 70 df 68 02
My process then restarts (I've got a restart policy for failed containers), and in this case the restarted process does not fail (this has happened every time).
Second case: my script uses 30 GB of memory.
When the data transfer is finished, the Python script fails with the error "corrupted double-linked list (not small)" and seems to trigger the same kind of segfault:
[17623143.600654] python[696237]: segfault at 0 ip 00007f746595680a sp 00007ffc7119fe60 error 6 in connectorx.cpython-38-x86_64-linux-gnu.so[7f746555d000+201c000]
[17623143.606349] Code: 41 56 53 48 83 ec 18 49 89 fe 66 48 8d 3d 9e df 68 02 66 66 48 e8 e6 6e c0 ff 48 83 38 00 74 18 48 83 c0 08 48 83 38 00 74 2e <49> 83 06 ff 74 77 48 83 c4 18 5b 41 5e c3 66 48 8d 3d 70 df 68 02
My process then restarts (I've got a restart policy for failed containers), but this time the restarted process still fails.
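To make the crash less silent, enabling the standard library's faulthandler before the read at least dumps a Python-level traceback on SIGSEGV (a sketch; it doesn't prevent the crash, it just shows where in the script it happened):

import faulthandler
import sys

# Dump a Python traceback to stderr if the process receives a fatal signal
# such as SIGSEGV. The process still dies, but the crash is no longer silent.
faulthandler.enable(file=sys.stderr, all_threads=True)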
What are the steps to reproduce the behavior?
I cannot reproduce this on my local machine because my local Docker instance hits the container memory limit and the container is shut down.
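(If you want to try locally, raising the container's memory limit may let the script run far enough to crash; 64g here is an assumed value, not something we have validated, and connectorx-bug is the image built from the Dockerfile below:)

docker run -m 64g --env PG_CONN_URL=your_db_conn_url connectorx-bug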
The host is running Debian 11 (128 GB RAM / 24 cores).
Our containers use the latest Python 3.8.15, built with pyenv using these specific build flags: RUN CONFIGURE_OPTS="--enable-shared" PYTHON_CFLAG="-march=haswell -O3 -pipe" pyenv install ${PYTHON_VERSION}
Database setup if the error only happens on specific data or data type
Example query / code
This script and query should generate the same kind of data we are using, at a high enough volume, in a Docker container:
test_bug.py
import os
import time

import connectorx as cx

# Synthetic query that mimics our production data: one row per second over
# seven months (~18.3M timestamps) joined against 5 client ids, i.e. roughly
# 90M rows of mixed uuid/timestamp/float/text columns.
query = """
WITH time as (SELECT generate_series('2022-01-01', '2022-08-01', '1 second'::interval) as timestamp_client),
ids AS (SELECT cid from (values('B45668C2-BFDC-4861-A38D-6141933F6940'),('40ABA32A-24EE-4876-8568-8E8E51D1D942'),('1837CE67-4BCC-4936-BC6D-76874BE1C4FF'),('D9194F09-7122-4EC7-AE81-FCB13A06B4EA'), ('645641C3-4E84-4475-AF85-1DFFBFE18726')) AS x(cid))
SELECT
cid,
timestamp_client,
500*random() as accuracy,
'ios' as os,
'UTC' as timezone,
40562 as place_id,
random() as confidence,
500*random() as distance
FROM ids
JOIN time ON True
ORDER BY timestamp_client ASC
"""

print(time.ctime())
print(os.environ.get('PG_CONN_URL'))
print(query)

# The crash seems to happen when this call finishes downloading the data.
result = cx.read_sql(os.environ.get('PG_CONN_URL'), query)

print(result.shape)
print(time.ctime())
Dockerfile
FROM python:3.8-bullseye
# install dependencies
RUN pip install connectorx==0.3.0 pandas==1.3.5
RUN pip list
COPY ./test_bug.py /test_bug.py
ENTRYPOINT ["python", "/test_bug.py"]
Build and run:
docker build --pull -t connectorx-bug -f Dockerfile .
docker run --env PG_CONN_URL=your_db_conn_url connectorx-bug
What is the error?
Segfault.
And sometimes "corrupted double-linked list (not small)".
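One mitigation we are considering (untested on our side; a sketch only, and part_key is a hypothetical column added just for partitioning) is splitting the read with connectorx's partition_on/partition_num options, which need an integer column in the result set to range-partition on. A self-contained example with a simplified query:

import os

import connectorx as cx

# An integer key derived from the timestamp lets connectorx split the
# query into range-partitioned sub-queries.
query = """
SELECT
    timestamp_client,
    random() AS confidence,
    extract(epoch from timestamp_client)::bigint AS part_key
FROM generate_series('2022-01-01', '2022-08-01', '1 second'::interval) AS t(timestamp_client)
"""

# Each partition is fetched separately, which should lower the peak memory
# held by any single chunk of the transfer.
result = cx.read_sql(
    os.environ.get('PG_CONN_URL'),
    query,
    partition_on="part_key",
    partition_num=8,
)
print(result.shape)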
I'm not sure if you had time to test against different Python versions, but we are experiencing a similar issue on Python 3.9.14. Specifically, we are doing a join on large datasets (>60 GB).
I'm working on it; I'm building some Docker images to test my code with different Python versions. I'll keep you updated when I have some news!
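For the version matrix, something like the following parameterization of the Dockerfile above should work (a sketch; the tag names are assumptions):

ARG PYTHON_VERSION=3.8
FROM python:${PYTHON_VERSION}-bullseye
RUN pip install connectorx==0.3.0 pandas==1.3.5
COPY ./test_bug.py /test_bug.py
ENTRYPOINT ["python", "/test_bug.py"]

Built once per version, e.g.:

docker build --pull --build-arg PYTHON_VERSION=3.9 -t connectorx-bug:py39 .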