PathCollisionValidator is not very memory efficient #49

Open
@ghost

Description

PathCollisionValidator will choke quickly on a large repository, because revisionSnapshots stores a map with every full path in the repository for each revision. With many tags that adds up fast, and Java runs out of memory :)

I solved that same issue (I'm doing something very similar to help me refactor a giant repository, but for work so I can't share code ;) ) by instead using a tree structure, with each node storing its 'file' name and its revision. The big trick is that for each new revision only the root node is replaced; the sub nodes are the nodes from the previous revision. On each add/delete/replace/modify the nodes along the affected path are also replaced, but no others. When adding a copy, again only the top node is new, and it links to the nodes from the copied revision.
This creates a structure which holds every revision completely, in an easily traversable format, so you can perform the checks, but which is far more memory friendly.
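
To make the idea a bit more concrete, here is a minimal sketch of such a tree in Java. It is not the code I use at work and not anything from this project: `PathNode` and all of its methods are hypothetical names made up for illustration, paths are assumed to be '/'-separated with no leading slash, and missing parent directories are created implicitly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable tree node: name, revision that last touched it, children. */
final class PathNode {
    final String name;
    final long revision;
    final Map<String, PathNode> children; // never mutated after construction

    PathNode(String name, long revision, Map<String, PathNode> children) {
        this.name = name;
        this.revision = revision;
        this.children = children;
    }

    static PathNode emptyRoot(long rev) {
        return new PathNode("", rev, new HashMap<>());
    }

    /** Add (or replace) an empty file/directory at path; returns the new root. */
    PathNode add(String path, long rev) {
        return with(path.split("/"), 0, new PathNode("", rev, new HashMap<>()), rev);
    }

    /** Delete path; returns the new root. */
    PathNode delete(String path, long rev) {
        return with(path.split("/"), 0, null, rev);
    }

    /** Copy: only the top node is new, its children are shared with source. */
    PathNode copy(String toPath, PathNode source, long rev) {
        return with(toPath.split("/"), 0, source, rev);
    }

    /** Look up a path in this revision's tree; null if it does not exist. */
    PathNode find(String path) {
        PathNode node = this;
        for (String part : path.split("/")) {
            node = node.children.get(part);
            if (node == null) return null;
        }
        return node;
    }

    // Rebuilds only the nodes along path; every other subtree is reused as-is.
    // A null subtree means "remove the entry at path".
    private PathNode with(String[] path, int depth, PathNode subtree, long rev) {
        String key = path[depth];
        PathNode replacement;
        if (depth == path.length - 1) {
            replacement = (subtree == null)
                    ? null
                    : new PathNode(key, rev, subtree.children);
        } else {
            PathNode child = children.get(key);
            if (child == null) {
                if (subtree == null) return this; // nothing to delete
                child = new PathNode(key, rev, new HashMap<>());
            }
            replacement = child.with(path, depth + 1, subtree, rev);
        }
        Map<String, PathNode> copy = new HashMap<>(children);
        if (replacement == null) copy.remove(key); else copy.put(key, replacement);
        return new PathNode(name, rev, copy);
    }
}
```

Keeping one root per revision (for example in a `Map<Long, PathNode>`) then gives a full snapshot of every revision, while a tag created with `copy` costs only the new nodes along the tag's path rather than an entry for every path underneath it.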

Hope this makes sense, hope this helps ;)
