
Exponential memory allocation caused by YAML parameter files with reused anchors #10177

@0x2b3bfa0

Bug Report

YAML parameter files with enough reused anchors cause DVC to exhaust system memory: it keeps appending duplicate elements to list objects, so their size grows exponentially.
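To illustrate how this kind of growth can arise, here is a minimal sketch in plain Python. It is not DVC's actual code: `naive_merge` is a hypothetical helper, and the sketch assumes that the same parsed structure is merged into the result more than once. The shared `anchor` list stands in for what a YAML loader typically produces for aliases, where every `*anchor` resolves to the same Python object.

```python
def naive_merge(dst, src):
    """Hypothetical in-place merge: dicts recurse, lists are extended."""
    for key, value in src.items():
        if key not in dst:
            dst[key] = value  # stores a reference, not a copy
        elif isinstance(value, dict) and isinstance(dst[key], dict):
            naive_merge(dst[key], value)
        elif isinstance(value, list) and isinstance(dst[key], list):
            dst[key].extend(value)  # doubles the list when both sides are the same object


# Stand-in for the parsed "references" section of custom.yaml:
# all four keys point at one shared list, as aliases of a single anchor do.
anchor = [0, 1, 2, 3]
tracked = {
    "references": {
        "first": anchor,
        "second": anchor,
        "third": anchor,
        "fourth": anchor,
    }
}

merged = {}
lengths = []
for _ in range(4):  # assumption: the same parsed object is merged repeatedly
    naive_merge(merged, tracked)
    lengths.append(len(anchor))

print(lengths)  # [4, 64, 1024, 16384]
```

Under these assumptions, each merge after the first doubles the shared list once per alias, so four aliases multiply its length by 16 every time. Copying values (for example with `copy.deepcopy`) before merging would break the aliasing in this sketch.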

Minimal reproducible example

$ python <<< "from dvc.repo import Repo; print(Repo().params.show())"

Note

If you prefer porcelain, feel free to run dvc params diff or call dvc.api.params_show() instead.

Sample project files

dvc.yaml

stages:
  example:
    foreach: [0, 1, 2, 3]
    do:
      cmd: test
      params:
      - custom.yaml: [references]
      - custom.yaml: [references]
      - custom.yaml: [references]
      - custom.yaml: [references]

custom.yaml

values: &anchor [0, 1, 2, 3]

references:
  first: *anchor
  second: *anchor
  third: *anchor
  fourth: *anchor

.dvc/config

[core]

Environment information

$ dvc doctor
DVC version: 3.33.4 (pip)
-------------------------
Platform: Python 3.10.8 on Linux-6.2.0-1018-azure-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 2.24.0
        dvc_objects = 2.0.1
        dvc_render = 1.0.0
        dvc_task = 0.3.0
        scmrepo = 1.6.0
Supports:
        azure (adlfs = 2023.10.0, knack = 0.11.0, azure-identity = 1.15.0),
        gdrive (pydrive2 = 1.18.0),
        gs (gcsfs = 2023.12.2.post1),
        hdfs (fsspec = 2023.12.2, pyarrow = 14.0.1),
        http (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
        oss (ossfs = 2023.12.0),
        s3 (s3fs = 2023.12.2, boto3 = 1.33.13),
        ssh (sshfs = 2023.10.0),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8),
        webhdfs (fsspec = 2023.12.2)
Config:
        Global: /home/codespace/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/loop3
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/847c03a095ad0b42b23e20ceb6c386f0

Labels: A: params (Related to dvc params), bug (Did we break something?)
