TestArchivalFromNonArchival random failure case: memory? #3334

@algonautshant

TestArchivalFromNonArchival fails randomly.
The failure is caused by addBlock becoming extremely slow, sometimes taking minutes.

The ledger database is on disk, but heavy disk activity is unlikely, since the tests do not run simultaneously.
However, investigation shows that the slowdown may happen when the system runs out of memory.

The memory usage of TestArchivalFromNonArchival keeps growing, and the memory is not freed after the test ends.
To demonstrate this:
1. Set maxBlocks = 50000 and create 500 accounts (instead of 50).
2. Clone the test as TestArchivalFromNonArchival2.
3. Run both tests: go test -run TestArchivalFromNonArchival

Since the two tests do not run in parallel, memory consumption should go up, then down, then up again.
However, this does not happen. Memory consumption keeps growing even after one of the two tests has finished.
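
One way to check the up/down/up expectation is to sample the live heap after a forced garbage collection between phases. The following is a minimal sketch, not code from the go-algorand repository: `liveHeapBytes`, `phase`, and `retained` are hypothetical names, and `phase` merely stands in for a test that either releases its allocations or leaves them reachable.

```go
package main

import (
	"fmt"
	"runtime"
)

// liveHeapBytes forces a garbage collection and reports the bytes of heap
// objects that are still allocated afterwards.
func liveHeapBytes() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// retained is a hypothetical stand-in for state that outlives a test,
// for example a package-level cache that is never cleared.
var retained [][]byte

// phase simulates one test run: it allocates n buffers of 1 MiB each.
// When keep is true the buffers stay reachable through retained.
func phase(n int, keep bool) {
	local := make([][]byte, 0, n)
	for i := 0; i < n; i++ {
		local = append(local, make([]byte, 1<<20))
	}
	if keep {
		retained = local
	}
}

func main() {
	fmt.Printf("baseline:              %d MiB\n", liveHeapBytes()>>20)

	phase(64, false) // allocations become garbage when phase returns
	fmt.Printf("after releasing phase: %d MiB\n", liveHeapBytes()>>20)

	phase(64, true) // allocations stay reachable through retained
	fmt.Printf("after retaining phase: %d MiB\n", liveHeapBytes()>>20)
}
```

If the reading after the first test stays close to its peak, something is still holding references to that test's data. In a real test file the same helper could be called at the start and end of each test and logged with t.Logf.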

The memory profiler points to cloneAssetParams as the largest memory consumer:

      flat  flat%   sum%        cum   cum%
   15.18GB 61.45% 61.45%    15.18GB 61.45%  github.com/algorand/go-algorand/ledger/apply.cloneAssetParams
    3.06GB 12.39% 73.84%     3.06GB 12.39%  github.com/algorand/go-algorand/ledger/apply.cloneAssetHoldings
