Investigate performance of next-status with many subdatasets #606

Closed
@mih

`datalad next-status` takes almost twice as long as a plain listing of present submodules (4:41 vs. 2:44 wall-clock time).

% datalad next-status
untracked: somelog
datalad next-status  117.60s user 62.30s system 63% cpu 4:41.11 total
% datalad subdatasets --state present
datalad subdatasets --state present  16.54s user 2.94s system 11% cpu 2:43.88 total

Sidenote: Listing absent submodules only is much faster.

% datalad -f json subdatasets --state absent | wc -l
42715
datalad -f json subdatasets --state absent  17.78s user 1.54s system 149% cpu 12.952 total

The timings above do not depend on a "cold start"; they are (more or less) reproducible on repeated runs.
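
To make the comparison concrete, the following minimal sketch parses the three timing summaries quoted above and computes wall-clock and CPU-time ratios. It assumes the summaries are in the zsh `time` format shown; the helper name `parse_timing` and the `TIMINGS` dictionary are hypothetical and exist only for this illustration.

```python
import re

# Timing summaries copied verbatim from the shell output above.
TIMINGS = {
    "next-status": "117.60s user 62.30s system 63% cpu 4:41.11 total",
    "subdatasets --state present": "16.54s user 2.94s system 11% cpu 2:43.88 total",
    "subdatasets --state absent": "17.78s user 1.54s system 149% cpu 12.952 total",
}


def parse_timing(summary):
    """Return (user, system, wall) in seconds from a zsh-style timing summary.

    Hypothetical helper for this illustration; it is not part of datalad.
    """
    m = re.search(r"([\d.]+)s user ([\d.]+)s system \d+% cpu ([\d:.]+) total", summary)
    if m is None:
        raise ValueError(f"unrecognized timing summary: {summary!r}")
    user, system = float(m.group(1)), float(m.group(2))
    # The wall-clock field is either "SS.sss" or "M:SS.ss"; fold it into seconds.
    wall = 0.0
    for part in m.group(3).split(":"):
        wall = wall * 60 + float(part)
    return user, system, wall


parsed = {name: parse_timing(text) for name, text in TIMINGS.items()}
for name, (user, system, wall) in parsed.items():
    print(f"{name}: wall {wall:.2f}s, cpu {user + system:.2f}s")

status = parsed["next-status"]
present = parsed["subdatasets --state present"]
wall_ratio = status[2] / present[2]
cpu_ratio = (status[0] + status[1]) / (present[0] + present[1])
print(f"next-status vs. present listing: wall x{wall_ratio:.2f}, cpu x{cpu_ratio:.2f}")
```

With the numbers above, the wall-clock ratio comes out at about 1.7, while the combined user and system CPU time differs by a factor of about 9, which suggests the extra cost of `next-status` is mostly CPU work rather than waiting.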
