Description
This can be seen as revisiting feature request #1026.
UPDATE: Please scroll down to #2458 (comment) for the most recent, summarized requirement.
The original context follows (still relevant):
There are different scenarios in which being able to manipulate files granularly, independently of how they were committed/pushed to DVC, could be useful. The problem with using dvc add -R now is that it can generate lots of .dvc files. But what if a directory could be added without -R (producing a single DVC-file) and yet other commands (lock, update, get, etc.) could be applied to individual files inside the added directory tree?
Example (from iterative/dataset-registry@7476a85)
Project 1:
$ tree
.
└── tutorial
    └── nlp
        ├── Posts.xml.zip
        └── pipeline.zip
$ dvc add tutorial
...
$ dvc push
...
Project 2:
$ dvc import {project-1-url} tutorial/nlp/pipeline.zip
...
$ tree
.
├── tutorial
│   └── nlp
│       └── pipeline.zip
└── tutorial.dvc
I'm not sure where the .dvc file would have to be placed in this example, though.
This is also how Git works, I believe: files are tracked individually (in fact, it doesn't even recognize empty dirs).
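To make the request more concrete, here is a rough sketch of how a command could resolve a target such as tutorial/nlp/pipeline.zip against a directory tracked by a single DVC-file. Everything in it is an assumption for illustration: the MANIFEST layout (a list of relative paths with checksums), the placeholder checksum values, and the select_entries helper are hypothetical and not DVC's actual internals or API.

```python
from pathlib import PurePosixPath

# Hypothetical per-directory manifest for `dvc add tutorial`.
# The layout and the checksum values are placeholders, not real DVC data.
MANIFEST = [
    {"relpath": "nlp/Posts.xml.zip", "md5": "aaa111"},
    {"relpath": "nlp/pipeline.zip", "md5": "bbb222"},
]


def select_entries(manifest, tracked_dir, target):
    """Return the manifest entries at or under `target`.

    `target` must be `tracked_dir` itself or a path inside it;
    otherwise PurePosixPath.relative_to raises ValueError.
    """
    rel = PurePosixPath(target).relative_to(tracked_dir)
    selected = []
    for entry in manifest:
        entry_path = PurePosixPath(entry["relpath"])
        # Match the file itself, or any file below a targeted subdirectory.
        if entry_path == rel or rel in entry_path.parents:
            selected.append(entry)
    return selected


if __name__ == "__main__":
    # Only the single requested file is selected, as in the Project 2 example.
    for entry in select_entries(MANIFEST, "tutorial", "tutorial/nlp/pipeline.zip"):
        print(entry["relpath"], entry["md5"])
```

With a lookup like this, targeting tutorial or tutorial/nlp would select both files, while targeting tutorial/nlp/pipeline.zip would select only that one, so a single DVC-file could still serve granular commands.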