This project uses semantic versioning. This change log uses principles from keep a changelog.
- Fixed defect where the "frozen_at" administrative metadata of the destination dataset changed when a dataset was copied. Many thanks to Johannes L. Hörmann and Lars Pastewka for bug reports, design discussions and code contributions. See: jic-dtool/dtoolcore#20
- Improve handling of Windows paths with drive letters where the dataset is located in a drive different to that of the working directory, see jic-dtool/dtoolcore#23
- License files now included in releases thanks to Jan Janssen (https://github.com/jan-janssen)
- dtoolcore.iter_datasets_in_base_uri helper function
- dtoolcore.iter_proto_datasets_in_base_uri helper function
- Fixed defect in 'dtool readme interactive' command when the readme template contains a date. Thanks to Lars Pastewka.
- Fixed defect in 'dtool readme interactive' where the default date (today) was not updated when using "{{ date }}" in the readme template. See jic-dtool/dtool-create#24. Thanks to Antoine Sanner.
- Fixed issue where "dtool readme edit" opened file with ".txt" extension rather than ".yml" extension. See: jic-dtool/dtool-cli#3 Thanks to Antoine Sanner.
Added support for tags from the dtool CLI.
- The CLI command 'dtool tag set'
- The CLI command 'dtool tag ls'
- The CLI command 'dtool tag delete'
Added Python API support for tags.
- Added dtoolcore._BaseDataSet.put_tag() method
- Added dtoolcore._BaseDataSet.delete_tag() method
- Added dtoolcore._BaseDataSet.list_tags() method
- Added dtoolcore.storagebroker.BaseStorageBroker.delete_key() method
- Added dtoolcore.storagebroker.BaseStorageBroker.get_tag_key() method
- Added dtoolcore.storagebroker.BaseStorageBroker.list_tags() method
- Added dtoolcore.storagebroker.BaseStorageBroker.put_tag() method
- Added dtoolcore.storagebroker.BaseStorageBroker.delete_tag() method
- Added dtoolcore.storagebroker.DiskStorageBroker.delete_key() method
- Added dtoolcore.storagebroker.DiskStorageBroker.get_tag_key() method
- Added dtoolcore.storagebroker.DiskStorageBroker.list_tags() method
- Default cache directory changed from ~/.cache/dtool/http to ~/.cache/dtool
- Cache environment variable changed from DTOOL_HTTP_CACHE_DIRECTORY to DTOOL_CACHE_DIRECTORY
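The lookup implied by the two cache changes above could be sketched roughly as follows. This is only an illustration, not dtoolcore's implementation: the function name resolve_cache_dir is hypothetical, the sketch ignores any other configuration source, and only the variable name DTOOL_CACHE_DIRECTORY and the default ~/.cache/dtool are taken from this change log.

```python
import os

# Default location stated in the change log entry above.
DEFAULT_CACHE_DIR = os.path.join("~", ".cache", "dtool")


def resolve_cache_dir(environ=None):
    """Return the cache directory, preferring DTOOL_CACHE_DIRECTORY if set.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("DTOOL_CACHE_DIRECTORY") or DEFAULT_CACHE_DIR
    return os.path.abspath(os.path.expanduser(path))
```

Passing a plain dictionary instead of os.environ makes the behaviour easy to check without touching the real environment.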
- Add 'dtool readme validate' command
- Ability to update descriptive metadata in README of frozen datasets when using 'dtool readme write'
- Fixed several defects in how URIs were parsed and generated on Windows.
Improved Python API for creating datasets.
- dtoolcore.create_proto_dataset() helper function
- dtoolcore.create_derived_proto_dataset() helper function
- dtoolcore.DataSetCreator helper context manager class
- dtoolcore.DerivedDataSetCreator helper context manager class
- Fixed defect where using DTOOL_NUM_PROCESSES > 1 resulted in a cPickle.PicklingError on some storage brokers. Multiprocessing is now only used if the storage broker supports it.
- Fixed defect where 'dtool verify' calculated hashes even when the '-f/--full' option was not specified. The 'dtool verify' command now runs more quickly.
- Ability to use multiple processes (cores) to generate item properties for manifest files in parallel. Set the environment variable DTOOL_NUM_PROCESSES to specify the number of processes to use.
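As a rough sketch of the idea behind DTOOL_NUM_PROCESSES, the snippet below reads the variable and only uses a process pool when more than one process is requested. The function names (num_processes, hash_items, sha256_hexdigest) are hypothetical, and hashing in-memory byte strings with SHA-256 stands in for whatever per-item work a storage broker actually performs.

```python
import hashlib
import os
from multiprocessing import Pool


def sha256_hexdigest(data):
    """Compute the SHA-256 hex digest of a bytes payload."""
    return hashlib.sha256(data).hexdigest()


def num_processes(environ=None):
    """Read DTOOL_NUM_PROCESSES, defaulting to 1 (serial processing)."""
    environ = os.environ if environ is None else environ
    return max(1, int(environ.get("DTOOL_NUM_PROCESSES", "1")))


def hash_items(payloads, processes=1):
    """Hash every payload, in parallel only when processes > 1."""
    if processes > 1:
        # The worker must be a module-level function so it can be pickled.
        with Pool(processes) as pool:
            return pool.map(sha256_hexdigest, payloads)
    return [sha256_hexdigest(p) for p in payloads]
```

On platforms that start worker processes by spawning, the parallel path should be called from under an `if __name__ == "__main__":` guard.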
- Included .dtool/annotations directory in DiskStorageBroker self description file
New feature: Dataset annotation
Dataset annotations are intended to make it easy to add and access specific metadata at a per dataset level.
The difference between annotations and the descriptive metadata is that annotations are easier to work with programmatically. The descriptive metadata, stored in the dataset's README content, is more free-form, so it is non-trivial to extract specific pieces of information from it, whereas a dtool annotation can be accessed directly by its name.
- Added 'dtool annotation set' command
- Added 'dtool annotation get' command
- Added 'dtool annotation ls' command
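To illustrate why name-based annotations are easy to use programmatically, here is a minimal sketch that stores each annotation as a small JSON document keyed by its name. The function names and the one-file-per-annotation layout are assumptions made for this example and are not taken from dtoolcore; the change log only states that annotations are accessed by name and that DiskStorageBroker has a .dtool/annotations directory.

```python
import json
import os


def write_annotation(annotations_dir, name, value):
    """Store a JSON-serialisable value under the given annotation name."""
    os.makedirs(annotations_dir, exist_ok=True)
    with open(os.path.join(annotations_dir, name + ".json"), "w") as fh:
        json.dump(value, fh)


def read_annotation(annotations_dir, name):
    """Return the value previously stored under the annotation name."""
    with open(os.path.join(annotations_dir, name + ".json")) as fh:
        return json.load(fh)


def list_annotation_names(annotations_dir):
    """Return the sorted names of all stored annotations."""
    suffix = ".json"
    return sorted(
        fname[:-len(suffix)]
        for fname in os.listdir(annotations_dir)
        if fname.endswith(suffix)
    )
```

Reading a single value back requires only its name, in contrast to parsing free-form README content.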
- Added sorting of items by relpath to 'dtool ls <DS_URI>'
- Fixed formatting of 'dtool ls <DS_URI>' from using two spaces to using one tab as the separator, to make it easier to work with command line tools such as cut
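A single tab is a safer separator than spaces because relpaths may themselves contain spaces, and cut splits on tabs by default. The sketch below shows the same parsing in Python; the sample line and the assumed column order (identifier, then relpath) are hypothetical.

```python
def parse_ls_line(line):
    """Split one tab-separated output line into (identifier, relpath)."""
    identifier, relpath = line.rstrip("\n").split("\t", 1)
    return identifier, relpath


# Hypothetical sample line; the relpath contains a space.
sample = "abc123\tdata/my file.txt\n"
identifier, relpath = parse_ls_line(sample)
```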
- Fixed ordering of lines in overlay CSV template from being sorted by the identifier to being ordered by the relpath
- Added 'dtool overlays show' command
- Added 'dtool overlays write' command
- Added 'dtool overlays template parse' command
- Added 'dtool overlays template glob' command
- Added 'dtool overlays template pairs' command
- Deprecated 'dtool overlay ls'
- Deprecated 'dtool overlay show'
- Added support for host name in file URI.
- Added 'dtool status' command for working out if a dataset is frozen or not
- Added 'dtool uri' command for expanding absolute and relative paths into proper URIs
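The kind of expansion described for 'dtool uri' can be approximated with the Python standard library, as sketched below. This is not the command's implementation: the function name path_to_file_uri is hypothetical, and the URIs dtool generates may differ in detail, for example by including a host name.

```python
from pathlib import Path


def path_to_file_uri(path):
    """Expand a relative or absolute path into an absolute file URI."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    return Path(path).resolve().as_uri()
```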
- Added more debug logging
- Added 'dtool config ecs ls' command to list ECS base URIs that have been configured
- Added support for configuring access to ECS buckets in multiple namespaces
- The 'dtool config azure ls' command now returns base URIs rather than container names
- 'dtool config readme-template' CLI command for configuring the path to a custom readme template
- dtoolcore._BaseDataSet.base_uri property
- dtoolcore.storagebroker.BaseStorageBroker.generate_base_uri method
- dtoolcore.utils.DEFAULT_CACHE_PATH global helper variable
- dtoolcore.utils.get_config_value_from_file helper function
- dtoolcore.utils.write_config_value_to_file helper function
- 'dtool config cache' now works with one unified cache directory for all storage brokers
- Started using unified environment variable to specify the cache directory: DTOOL_CACHE_DIRECTORY
- Default cache directory set to ~/.cache/dtool
- Fixed defect when username was supplied as two separate strings to 'dtool config user name' in CLI
- Fixed the 'dtool config azure set' help text
- Added 'dtool publish' command
- Added '-f/--format' option to 'dtool summary' command to enable output in JSON format
- Added sorting of CSV/TSV/HTML inventories by dataset name
- Changed default output of 'dtool summary' to be human readable YAML
- Added support for Windows! :)
- Added 'dtool config' command
- Added 'dtool uuid' command
- Added 'dtool item relpath' command
- 'dtool cp' to replace 'dtool copy'
- 'dtool readme write' to write readme from file or stdin
- 'dtool item overlay' command
- Deprecated 'dtool copy' in favour of 'dtool cp'
- Removed created_at field from default README template
- Defect in 'dtool create' when providing a relative path to the '--symlink-path' option
- Python 2 defect in dealing with unicode in README.yml file when using 'dtool readme edit'
- dtoolcore.filehasher.hashsum_digest helper function
- dtoolcore.filehasher.md5sum_digest helper function
- Improved name from dtoolcore.filehasher.hashsum to dtoolcore.filehasher.hashsum_hexdigest
- Deal with issue in how ruamel.yaml deals with float values
- Added ability to update the name of a frozen dataset from the dtool CLI
- Added update_name method to DataSet class (previously only available on ProtoDataSet class)
Dataset name validation.
- dtoolcore.generate_admin_metadata function raises dtoolcore.DtoolCoreInvalidNameError if invalid name is provided
- dtoolcore.utils.name_is_valid utility function for checking sanity of dataset names
- Validation of dataset name upon creation using dtool CLI
- Validation of dataset name when updating it using dtool CLI
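A name check of this kind might look like the sketch below. The specific rules (a maximum length of 80 characters and an allowed character set of letters, digits, hyphen, underscore and full stop) are assumptions chosen for illustration and are not stated in this change log; consult dtoolcore.utils.name_is_valid for the actual behaviour.

```python
import re

MAX_NAME_LENGTH = 80  # assumed limit, for illustration only
_VALID_NAME = re.compile(r"[0-9a-zA-Z._-]+")  # assumed character set


def name_is_valid_sketch(name):
    """Return True if the name is short enough and uses only allowed characters."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    # fullmatch rejects empty names and names with a trailing newline.
    return _VALID_NAME.fullmatch(name) is not None
```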
- Fixed defect where 'dtool ls -q' was listing dataset names rather than URIs, making it impossible to process datasets in a BASE_URI programmatically
- Make SymlinkStorageBroker compatible with dtoolcore 3.4.0
Storage broker base class redesign and refactoring.
- Ability to update descriptive metadata in README of frozen datasets
- Validation that the descriptive metadata provided by the 'dtool readme edit' command is valid YAML
- Added dtoolcore.storagebroker.BaseStorageBroker
- Added logging to the reusable BaseStorageBroker methods
- get_text new method on BaseStorageBroker class
- put_text new method on BaseStorageBroker class
- get_admin_metadata_key new method on BaseStorageBroker class
- get_readme_key new method on BaseStorageBroker class
- get_manifest_key new method on BaseStorageBroker class
- get_overlay_key new method on BaseStorageBroker class
- get_structure_key new method on BaseStorageBroker class
- get_dtool_readme_key new method on BaseStorageBroker class
- get_size_in_bytes new method on BaseStorageBroker class
- get_utc_timestamp new method on BaseStorageBroker class
- get_hash new method on BaseStorageBroker class
- get_relpath new method on BaseStorageBroker class
- update_readme new method on BaseStorageBroker class
- DataSet.put_readme method that can be used to update descriptive metadata in (frozen) dataset README whilst keeping a copy of the historical README content
- Add storage_broker_version key to structure parameters
- Stop copy_resume function calculating hashes unnecessarily
- Fixed the documentation of the 'dtool verify' command
- Default config file now set in dtoolcore.utils.get_config_value if not provided in caller
- Made download to DTOOL_HTTP_CACHE_DIRECTORY more robust
- Added ability to deal with redirects to enable working with shortened URLs
- Bundling of dtool-http package
- Bundling of dtool-irods package
- Bundling of dtool-s3 package
- Pre-checks to 'dtool freeze' command to ensure that there is no rogue content in the base of disk datasets
- Added rogue content validation check to DiskStorageBroker.pre_freeze hook
- Pre-checks to 'dtool freeze' command to ensure that the item handles are sane, i.e. that they do not contain newline characters
- Pre-checks to 'dtool freeze' command to ensure that there are not too many items in the proto dataset, default to less than 10000
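The handle and item-count pre-checks listed above can be sketched as a single function that reports problems instead of freezing. The function name find_freeze_problems and the message wording are hypothetical; only the two rules (no newline characters in handles, fewer than 10000 items by default) come from the entries above, and the rogue content check is not covered.

```python
MAX_ITEMS = 10000  # default limit stated in the entry above


def find_freeze_problems(handles, max_items=MAX_ITEMS):
    """Return a list of human-readable problems; an empty list means OK."""
    handles = list(handles)
    problems = []
    if len(handles) >= max_items:
        problems.append(
            "Too many items: {} (limit is fewer than {})".format(
                len(handles), max_items
            )
        )
    for handle in handles:
        if "\n" in handle:
            problems.append("Handle contains a newline: {!r}".format(handle))
    return problems
```

Returning all problems at once, rather than raising on the first, lets a caller show the user everything that needs fixing before the freeze is retried.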
- Defect where inventory html template is not included in Python package on PyPI
- Add "created_at" key to the administrative metadata
- 'dtool inventory' command for generating csv/tsv/html inventories of collections of datasets
- Added support for '-h' flag as well as '--help'
- Added timestamp to logging output
- Improved handling of URIs in validation code
- Fixed defect where running 'dtool item properties' with an invalid identifier resulted in a KeyError exception being propagated to the user
- Fixed defect where 'dtool verify' did not compare file sizes
- Fixed timestamp defect in DiskStorageBroker
- Fixed issue arising from a file being put into iRODS and the connection breaking before the appropriate metadata could be set on the file in iRODS. See also: jic-dtool/dtool-irods#7
Release to make it easier to create symlink datasets in an automated fashion.
- Simplified the way to specify the symbolic link path in the SymLinkStorageBroker
- The path to the data when creating a symlink dataset is now specified using the '-s/--symlink-path' option rather than being something that is prompted for. This makes it easier to create symlink datasets in an automated fashion.
- '--resume' option to 'dtool copy' command
- '--quiet' and '--verbose' options to 'dtool ls' and improved formatting
- Add dtoolcore.copy_resume function
This release makes use of the dtoolcore version 3.0.0 API, which improves the handling of URIs and adds more metadata describing the structure of datasets.
Another major feature of this release is the addition of an S3 storage broker that can be used to interact with Amazon's S3 object storage.
- AWS S3 object storage broker
- Writing of .dtool/structure.json file to the DiskStorageBroker; a file for describing the structure of the dtool dataset in a computer readable format
- Writing of .dtool/README.txt file to the DiskStorageBroker; a file for describing the structure of the dtool dataset in a human readable format
- Writing of .dtool/structure.json file to the IrodsStorageBroker; a file for describing the structure of the dtool dataset in a computer readable format
- Writing of .dtool/README.txt file to the IrodsStorageBroker; a file for describing the structure of the dtool dataset in a human readable format
- Make use of dtoolcore version 3 API
- Removed the historical dtool_readme key/value pair from the administrative metadata (in the .dtool/dtool file)
- Ability to specify a custom README.yml template file path.
- Ability to configure the full user name for the README.yml template using DTOOL_USER_FULL_NAME
- Made .dtool/manifest.json content created by DiskStorageBroker human readable by adding new lines and indentation to the JSON formatting.
- Made the DiskStorageBroker.list_overlay_names method more robust. It no longer falls over if the .dtool/overlays directory has been lost, e.g. by cloning a dataset with no overlays from a Git repository.
- Fixed defect where an incorrect URI would get set on the dataset when using DataSet.from_path class method on a relative path
- Made the YAML output prettier by adding more indentation.
- Replaced hardcoded nbi.ac.uk email with configurable DTOOL_USER_EMAIL in the default README.yml template.
- Fixed IrodsStorageBroker.generate_uri class method
- Made .dtool/manifest.json content created by IrodsStorageBroker human readable by adding new lines and indentation to the JSON formatting.
- Added rule to catch CAT_INVALID_USER string for giving a more informative error message when iRODS authentication times out
- Fixed issue where the symbolic link was not fully resolved when creating a symlink dataset that used the terminal to prompt for the data directory
- More graceful exit if one presses Cancel in file browser when creating a symlink dataset
- Data directory now falls back on click command line prompt if TkInter has issues when creating a symlink dataset
- pre_freeze_hook to the storage broker interface, called at the beginning of ProtoDataSet.freeze method.
- '--quiet' flag to 'dtool create' command
- 'dtool overlay ls' command to list the overlays in dataset
- 'dtool overlay show' command to show the content of a specific overlay
- Improved speed of freezing a dataset in iRODS by making use of caches to reduce the number of calls made to iRODS during this process
- 'dtool copy' now specifies target location using URI rather than using the '--prefix' and '--storage' arguments
- Made the DiskStorageBroker.create_structure method more robust
- More informative error message when iRODS has not been configured
- More informative error message when iRODS authentication times out
- Stopped client hanging when iRODS authentication has timed out
- storagebroker's put_item method now returns relpath
- Made the IrodsStorageBroker.create_structure method more robust by checking if the parent collection exists
- Made error handling in 'dtool create' more specific
- Added propagation of original error message when StorageBrokerOSError is captured in 'dtool create'
- 'dtool ls' can now be used to list the relpaths of the items in a dataset
- '-f/--full' flag to 'dtool diff' command to include checking of file hashes
- '-f/--full' flag to 'dtool verify' command to include checking of file hashes
- 'dtool ls' now works with URIs rather than with prefix and storage arguments
- 'dtool diff' now only compares identifiers and file sizes by default
- 'dtool verify' now only compares identifiers and file sizes by default
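The difference between the default comparison and a '-f/--full' comparison can be sketched as below. The function name diff_items and the item structure (a mapping from identifier to a dictionary with "size_in_bytes" and "hash" keys) are assumptions made for this example rather than a description of dtool's internals.

```python
def diff_items(a, b, full=False):
    """Compare two {identifier: properties} mappings.

    Returns a list of (identifier, reason) tuples sorted by identifier.
    Hashes are only compared when full is True.
    """
    diffs = []
    for identifier in sorted(set(a) | set(b)):
        if identifier not in a or identifier not in b:
            diffs.append((identifier, "identifier"))
            continue
        if a[identifier]["size_in_bytes"] != b[identifier]["size_in_bytes"]:
            diffs.append((identifier, "size_in_bytes"))
        elif full and a[identifier]["hash"] != b[identifier]["hash"]:
            diffs.append((identifier, "hash"))
    return diffs
```

Skipping the hash comparison by default avoids reading item content, which is why the cheaper identifier and size checks run more quickly.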
- Made DiskStorageBroker.list_dataset_uris class method more robust
- Set the correct dependency to actually get fix reported in 2.1.1
- Fixed defect in iRODS storage broker where files with white space resulted in broken identifiers
- 'dtool readme show' command that returns the readme content
- '--quiet' flag to 'dtool copy' command
- Improved the 'dtool readme --help' output
- Progress bar now shows information on individual items being processed
- 'dtool ls' now works with relative paths
- Fix defect where IrodsStorageBroker.put_item raised SystemError when trying to overwrite an existing file
- Better validation of input in terms of base vs proto vs frozen dataset URIs
- Fixed bug where copy creates an intermediate proto dataset that self identifies as a frozen dataset.
- Fixed potential bug where a copy could convert a proto dataset to a dataset before all its overlays had been copied over
- Fixed type of "frozen_at" time stamp in admin metadata: from string to float
- Made version requirements of dtool sub-packages explicit
Initial release of dtool as a meta package.