CRUD updates, dataset mutation and BlobTree API updates #38
Closed
I had the TomlDataStorage struct inside DataTomlStorage.jl ??
* For data projects:
  - create() to create datasets
  - setindex!() to add existing datasets
  - delete() to delete datasets
  - Implementations for StackedDataProject, AbstractTOMLDataProject and TOMLDataProject
* Concrete save_project() API to persist a DataProject to a file as TOML
* For storage drivers:
  - AbstractDataDriver and implementation for FileSystemDriver
  - open_dataset to do what the current function-based API does
  - create_storage to initialize storage
  - delete_storage to remove storage
  - These ideas seem a bit half-baked
* Refactoring open() to add write=true keyword
I'll go ahead and close this PR, since I don't think we'll merge it. But the branch and discussion will stay around for future reference.
This is a big batch of changes, implementing

* Updates to the BlobTree API to make it more coherent and simpler
* Dataset mutation via open(write=true)
* CRUD for DataProject

A lot of these changes are intertwined, so I've put all this here as a draft, but I'll probably need to break this apart into separate PRs.
BlobTree
BlobTree now has a largely dictionary-like interface:
* keys(tree)
* pairs(tree)
* haskey(tree, path)
* tree[path]
* newdir(tree, path), newfile(tree, path)
* delete!(tree, path)

Where path is either a relative path of type RelPath, or an AbstractString (in which case it'll be split on / to become a relative path).

Unlike Dict, iteration of BlobTree currently iterates values (not key-value pairs). This has some benefits, for example broadcasting processing across files in a directory.
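The value-iteration choice can be illustrated with a minimal sketch in plain Julia. ToyTree below is a hypothetical stand-in backed by a Dict, not the BlobTree type from this PR, and it handles only single-component names rather than RelPath paths. It mimics a few of the functions listed above to show why iterating values makes broadcasting over children convenient.

```julia
# ToyTree is a hypothetical stand-in for BlobTree, backed by a plain Dict.
struct ToyTree
    children::Dict{String,Any}
end
ToyTree() = ToyTree(Dict{String,Any}())

# Dictionary-like access by child name (names are returned in sorted order).
Base.keys(t::ToyTree) = sort!(collect(keys(t.children)))
Base.haskey(t::ToyTree, name::AbstractString) = haskey(t.children, name)
Base.getindex(t::ToyTree, name::AbstractString) = t.children[name]
Base.setindex!(t::ToyTree, v, name::AbstractString) = (t.children[name] = v; t)
Base.delete!(t::ToyTree, name::AbstractString) = (delete!(t.children, name); t)
Base.pairs(t::ToyTree) = [k => t.children[k] for k in keys(t)]

# Unlike Dict, iteration yields the children themselves (values), in name order.
Base.length(t::ToyTree) = length(t.children)
Base.eltype(::Type{ToyTree}) = Any
function Base.iterate(t::ToyTree, state=(keys(t), 1))
    names, i = state
    i > length(names) && return nothing
    return t.children[names[i]], (names, i + 1)
end

tree = ToyTree()
tree["a.txt"] = "hello"
tree["b.txt"] = "hi"

# Broadcasting collects the iterable, so it applies `length` to each child value.
sizes = length.(tree)   # sizes == [5, 2]
```

With key-value iteration, the same broadcast would receive Pair objects and the caller would have to unpack each one first.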
isdir(), isfile(): determine whether a child of tree is a directory or file.

Example

You can create a new temporary BlobTree via the newdir() function and fill it with combinations of newfile() or newdir().

You can also get access to a BlobTree by using DataSets.from_path() with a local directory name. For example:
AbstractDataProject interface additions
To support CRUD of datasets (#31) within data projects, the data driver interface needs much more flexibility. I've added:
* DataSets.create() to create datasets. This still needs some refinement, in particular the keyword parameters.
* Base.setindex!() to add a dataset to a project
* DataSets.delete() to delete datasets
* Implementations for StackedDataProject, AbstractTOMLDataProject and TOMLDataProject

Relatedly, I've added DataSets.from_path() to create a standalone DataSet from data on the local filesystem, inferring the type as Blob or BlobTree. This can be passed as a source to create() to make a copy.

Still TODO here is DataSets.config (or some such) to update the metadata of a DataSet (alternatively, have the dataset know its owning data project and call back into that when it's updated?)

Low level AbstractDataDriver interface

The low level driver interface is currently (in 0.2.6) just a function taking a user-defined callback.

However, to support CRUD operations for DataProject it needs to be expanded quite a bit, in particular to be able to create and delete storage in the storage backend. This PR adds AbstractDataDriver and, so far, a single implementation, FileSystemDriver, with implementations of

* open_dataset to do what the current function-based API does
* create_storage to initialize storage
* delete_storage to remove storage

This interface is probably still a bit half-baked and needs some refinement.
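As a rough illustration of the shape such a driver interface might take, here is a self-contained sketch in plain Julia. The function names open_dataset, create_storage and delete_storage are borrowed from this PR, but every signature, the ToyFileSystemDriver type and the storage_path helper are assumptions made for the example and do not reflect the actual code on the branch.

```julia
# Hypothetical sketch: names follow the PR text, all signatures are assumptions.
abstract type AbstractDataDriver end

# Toy driver that stores each dataset as a single file under `root`.
struct ToyFileSystemDriver <: AbstractDataDriver
    root::String
end

# Illustrative helper mapping a dataset name to its backing file.
storage_path(d::ToyFileSystemDriver, name::AbstractString) = joinpath(d.root, name)

# Initialize empty storage for a dataset and return its path.
function create_storage(d::ToyFileSystemDriver, name::AbstractString)
    path = storage_path(d, name)
    mkpath(dirname(path))
    touch(path)
    return path
end

# Remove the storage backing a dataset (no error if it is already gone).
delete_storage(d::ToyFileSystemDriver, name::AbstractString) =
    rm(storage_path(d, name); force=true)

# Callback style, like the function-based API: the IO handle is only valid
# while `f` runs, and the result of `f` is returned.
function open_dataset(f::Function, d::ToyFileSystemDriver, name::AbstractString;
                      write::Bool=false)
    return open(f, storage_path(d, name), write ? "w" : "r")
end

driver = ToyFileSystemDriver(mktempdir())
create_storage(driver, "greeting.txt")
open_dataset(io -> print(io, "hello"), driver, "greeting.txt"; write=true)
content = open_dataset(io -> read(io, String), driver, "greeting.txt")
delete_storage(driver, "greeting.txt")
```

Dispatching on a driver type rather than registering a single callback is what leaves room for the create and delete operations alongside opening.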