[Bug] Python: OOM-kill when trying to fetch obsm_layers #3064

Closed
ivirshup opened this issue Sep 24, 2024 · 3 comments

@ivirshup
Contributor

ivirshup commented Sep 24, 2024

Describe the bug

The code in the example below dies with Killed on tiledbsoma 1.14.1, but runs to completion on tiledbsoma 1.13.

To Reproduce

import tiledbsoma
import cellxgene_census

census = cellxgene_census.open_soma(census_version="2023-12-15")
soma = census["census_data"]["homo_sapiens"]
(
    soma
    .axis_query(
        measurement_name="RNA",
        obs_query=tiledbsoma.AxisQuery(coords=(slice(100),)),
    )
    .to_anndata(
        X_name="raw",
        obsm_layers=["scvi"],
    )
)

You can see more examples on the cellxgene_census CI

Versions (please complete the following information):

  • TileDB-SOMA version: 1.14.1
  • Language and language version (e.g. Python 3.8, R 4.2.2): python 3.11
  • OS (e.g. MacOS, Ubuntu Linux):
  • Note: you can use tiledbsoma.show_package_versions() (Python) or tiledbsoma::show_package_versions() (R)
tiledbsoma.__version__              1.14.1
TileDB core version (libtiledbsoma) 2.26.1
python version                      3.11.9.final.0
OS version                          Linux 6.8.0-1016-aws

Additional context

Upstream of: chanzuckerberg/cellxgene-census#1288

@johnkerl johnkerl self-assigned this Sep 24, 2024
@johnkerl johnkerl changed the title [Bug] python, segfault occurs when trying to fetch obsm_layers [Bug] Python: segfault when trying to fetch obsm_layers Sep 24, 2024
@johnkerl johnkerl changed the title [Bug] Python: segfault when trying to fetch obsm_layers [Bug] Python: apparent OOM-kill when trying to fetch obsm_layers Sep 25, 2024
@johnkerl
Member

@nguyenv running @ivirshup's repro with

export SPDLOG_LEVEL=tiledbsoma=trace

I have

...
[2024-09-25 00:44:07.047] [tiledbsoma] [Process: 68934] [Thread: 69912] [debug] [ManagedQuery] submit thread start
[2024-09-25 00:44:09.902] [Process: 68934] [debug] [1727224968354718411-Context: 1] [Query: 10] Done processing tiles
[2024-09-25 00:44:09.903] [Process: 68934] [debug] [1727224968354718411-Context: 1] [Query: 10] Done with iteration, num result tiles 0
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69912] [debug] [ManagedQuery] submit thread done
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Done waiting for query
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Buffer soma_dim_0 cells=134217728
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Buffer soma_dim_1 cells=134217728
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Buffer soma_data cells=134217728
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] column type l name soma_dim_0 nbuf 2 2 nullable false
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] create array name='soma_dim_0' use_count=4
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ArrowAdapter] release_schema for soma_dim_0
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->name
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->format
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema done
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] column type l name soma_dim_1 nbuf 2 2 nullable false
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] create array name='soma_dim_1' use_count=4
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ArrowAdapter] release_schema for soma_dim_1
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->name
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->format
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema done
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] column type f name soma_data nbuf 2 2 nullable false
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] create array name='soma_data' use_count=4
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ArrowAdapter] release_schema for soma_data
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->name
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema schema->format
[2024-09-25 00:44:09.903] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ArrowAdapter] release_schema done
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [trace] [ManagedQuery] allocate new buffers
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Adding buffer for column 'soma_dim_0'
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_dim_0' 1073741824 bytes is_var=false is_nullable=false
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Adding buffer for column 'soma_dim_1'
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_dim_1' 1073741824 bytes is_var=false is_nullable=false
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Adding buffer for column 'soma_data'
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_data' 1073741824 bytes is_var=false is_nullable=false
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ManagedQuery] [scvi] Waiting for query
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69913] [debug] [ManagedQuery] submit thread start
Killed
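
To put a number on the allocations just before the kill, the [ColumnBuffer] lines in a trace like the one above can be summed. This is only a minimal sketch: the three sample lines are copied from the trace, while the helper name column_buffer_bytes and the regular expression are illustrative and not part of tiledbsoma.

```python
import re

# Three lines copied verbatim from the trace above: one per column buffer
# allocated in the last "allocate new buffers" round before the kill.
LOG = """\
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_dim_0' 1073741824 bytes is_var=false is_nullable=false
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_dim_1' 1073741824 bytes is_var=false is_nullable=false
[2024-09-25 00:44:09.904] [tiledbsoma] [Process: 68934] [Thread: 69528] [debug] [ColumnBuffer] 'soma_data' 1073741824 bytes is_var=false is_nullable=false
"""

# Illustrative pattern (an assumption based on the log format shown above).
PATTERN = re.compile(r"\[ColumnBuffer\] '(?P<name>[^']+)' (?P<nbytes>\d+) bytes")


def column_buffer_bytes(log_text):
    """Return {column name: total bytes} summed over all ColumnBuffer lines."""
    totals = {}
    for match in PATTERN.finditer(log_text):
        name = match["name"]
        totals[name] = totals.get(name, 0) + int(match["nbytes"])
    return totals


GIB = 2**30
totals = column_buffer_bytes(LOG)
for name, nbytes in totals.items():
    print(f"{name}: {nbytes / GIB:.1f} GiB")
print(f"total: {sum(totals.values()) / GIB:.1f} GiB")
```

Each of the three buffers is exactly 2**30 bytes, so this single allocation round asks for 3 GiB, even though the repro only selects 100 obs rows. Running the same helper over a full trace would show how many such rounds happen before the process is killed.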

@johnkerl
Member

@nguyenv this is as far as I got debugging:

  • Repro with Python 3.11
  • MacOS or Linux
  • 16GB RAM; 64GB RAM

I'm guessing either a memory leak or a pagination miswiring ...
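
One way to narrow down the leak-versus-pagination guess is to record peak resident memory for runs that do complete (for example on 1.13, or with a smaller query) and compare across versions. The sketch below uses only the Python standard library; peak_rss_bytes is an illustrative helper name, and the bytes allocation is a stand-in for the real to_anndata call.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process so far, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


before = peak_rss_bytes()
# Stand-in workload (an assumption for illustration): in practice this
# would be the axis_query(...).to_anndata(...) call from the repro.
blob = b"x" * (64 * 2**20)
after = peak_rss_bytes()

print(f"peak RSS before: {before / 2**20:.0f} MiB")
print(f"peak RSS after:  {after / 2**20:.0f} MiB")
```

Because ru_maxrss is a high-water mark, it never decreases, so it shows the largest footprint reached rather than current usage. It cannot be read after an OOM-kill, which is why it is only useful on runs that finish.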

@johnkerl johnkerl changed the title [Bug] Python: apparent OOM-kill when trying to fetch obsm_layers [Bug] Python: OOM-kill when trying to fetch obsm_layers Sep 26, 2024